Add perf dashboard #159
# Perf Dashboard [wip]

This is an outline of what I'd like to see our perf dashboard look like.

A lot of this is inspired by the [MsQuic dashboard](https://microsoft.github.io/msquic/). Please look at that first.

For each combination of libp2p implementation, version, and transport we would have numbers that outline:

1. Download/upload throughput
2. Request latency
3. Requests per second (for some request/response protocol)
4. Handshakes per second (useful to identify overhead in connection initialization)
5. Memory usage

The y-axis on the graphs is the value for the above tests; the x-axis is the specific version. Different lines represent different implementation+transport combinations.

The dashboards should be selectable and filterable.

# Other transports (iperf/http)

We have to be careful to compare apples to apples. A raw iperf number might be confusing here because no application will ever hit those numbers, since applications will at least want some encryption on their connections. I would suggest not having this or an HTTP comparison. Having HTTPS might be okay.

# Example dashboard

https://observablehq.com/@realmarcopolo/libp2p-perf

The dashboard automatically pulls data from this repo to display it.

It currently pulls `example-data.json`. The schema of this data is defined in `benchmarks.schema.json` and `benchmark-result-type.ts`.

# Benchmark runner

This is the component that runs the benchmark binaries and produces the benchmark data (matching `benchmarks.schema.json`/`benchmark-result-type.ts`).

# Benchmark binary

This is per implementation. It's a binary that accepts the following flags:

| Flag | Description |
| --- | --- |
| `parallel_connections` | number of parallel connections to use |
| `upload_bytes` | number of bytes to upload to the server per connection |
| `download_bytes` | number of bytes to download from the server per connection |
| `n_times` | number of times to do the perf round trip |
| `close_connection` | bool; close the connection after a perf round trip |
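To make the flag table concrete, here is a small TypeScript sketch that models the flag set and turns it into an argv list. The `--flag=value` syntax and the `PerfFlags`/`toArgv` names are assumptions for illustration; the spec does not fix a CLI convention.

```typescript
// Hypothetical flag set for a per-implementation benchmark binary.
// Flag names mirror the table above; the `--flag=value` syntax is
// an assumption, not part of the spec.
type PerfFlags = {
  parallel_connections: number;
  upload_bytes: number;
  download_bytes: number;
  n_times: number;
  close_connection: boolean;
};

// Build an argv list from the flags, preserving declaration order.
function toArgv(flags: PerfFlags): string[] {
  return Object.entries(flags).map(([k, v]) => `--${k}=${String(v)}`);
}

const argv = toArgv({
  parallel_connections: 10,
  upload_bytes: 1024,
  download_bytes: 2048,
  n_times: 100,
  close_connection: false,
});
```

A runner could then spawn the implementation's binary with `argv` appended to its path.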
The binary should report the following:

* Stats on the length of the run:
  * Total, avg, min, max, p95

> **Review comment** (suggested change): I assume that is correct (and I think makes it more clear).

> **Review comment:** Rather than special-casing min/max, just do p0 and p100?

* Round trips per second:
  * Avg, min, max, p95

> **Review comment:** I'm trying to follow what this means. I think we probably need some vocabulary here. I assume this means that at the end of the "test run" we emit these stats. That means if the benchmark runner does a "test run" that lasts ~10 seconds, it tracks, for each of those seconds, the number of round trips that complete, so that it can then compute the avg/p0/p95/p100 stats. Is that right? Also, I assume the benchmark doesn't care how many concurrent connections there are; it just tracks how many round trips complete each second so it can compute the stats.

Maybe more? TODO...
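As a sketch of how these aggregates could be computed (folding in the p0/p100 idea from the review comments), the following uses the nearest-rank percentile definition; that definition and the `runStats`/`percentile` names are my own choices, not part of the spec.

```typescript
// Aggregate stats over a set of measured values (e.g. run lengths in
// seconds, or round trips completed per second).
type RunStats = { total: number; avg: number; p0: number; p95: number; p100: number };

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

function runStats(samples: number[]): RunStats {
  const sorted = [...samples].sort((a, b) => a - b);
  const total = sorted.reduce((acc, x) => acc + x, 0);
  return {
    total,
    avg: total / sorted.length,
    p0: sorted[0],                    // same as min
    p95: percentile(sorted, 95),
    p100: sorted[sorted.length - 1],  // same as max
  };
}
```

Reporting p0 and p100 subsumes min and max, which keeps the output schema uniform across percentiles.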
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://libp2p.io/benchmark.schema.json",
  "title": "Benchmark Results",
  "description": "Results from a benchmark run",
  "type": "object",
  "properties": {
    "benchmarks": {
      "description": "A list of benchmark results",
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "description": "The name of this benchmark",
            "type": "string"
          },
          "unit": {
            "description": "The unit for the result",
            "enum": [
              "bits/s",
              "s"
            ]
          },
          "result": {
            "description": "String encoded result. Parse as float64",
            "type": "string"
          }
        },
        "required": [
          "name",
          "unit",
          "result",
          "implementation",
          "stack",
          "version"
        ]
      }
    }
  },
  "required": [
    "benchmarks",
    "productName"
  ]
}
```
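Since this schema stores `result` as a string to be parsed as float64, a consumer needs a small parsing step. A sketch (the `parseResult` helper name is hypothetical):

```typescript
// Parse the string-encoded `result` field into a float64, rejecting
// values that don't parse rather than silently producing NaN.
function parseResult(raw: string): number {
  const value = Number.parseFloat(raw);
  if (Number.isNaN(value)) {
    throw new Error(`unparseable result: ${raw}`);
  }
  return value;
}
```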
```typescript
export type BenchmarkResults = {
  benchmarks: Benchmark[],
  // For referencing this schema in JSON
  "$schema"?: string
};

export type Benchmark = {
  name: string,
  unit: "bits/s" | "s",
  results: Result[],
  comparisons: Comparison[],
};

export type Result = {
  result: number,
  implementation: string,
  transportStack: string,
  version: string
};

export type Comparison = {
  name: string,
  result: number,
};
```
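As a usage sketch, here is a minimal value conforming to these types (the types are restated locally so the snippet is self-contained; the data values are made up):

```typescript
// Types restated from benchmark-result-type.ts above.
type Result = { result: number; implementation: string; transportStack: string; version: string };
type Comparison = { name: string; result: number };
type Benchmark = { name: string; unit: "bits/s" | "s"; results: Result[]; comparisons: Comparison[] };
type BenchmarkResults = { benchmarks: Benchmark[]; "$schema"?: string };

// A minimal, made-up BenchmarkResults value; the compiler checks it
// against the types above.
const data: BenchmarkResults = {
  $schema: "./benchmarks.schema.json",
  benchmarks: [
    {
      name: "Single Connection throughput – Upload",
      unit: "bits/s",
      comparisons: [{ name: "http", result: 1234 }],
      results: [
        { result: 1100, implementation: "go-libp2p", transportStack: "quic-v1", version: "v0.24.2" },
      ],
    },
  ],
};
```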
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$ref": "#/definitions/BenchmarkResults",
  "definitions": {
    "BenchmarkResults": {
      "type": "object",
      "properties": {
        "benchmarks": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Benchmark"
          }
        },
        "$schema": {
          "type": "string"
        }
      },
      "required": [
        "benchmarks"
      ],
      "additionalProperties": false
    },
    "Benchmark": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "unit": {
          "type": "string",
          "enum": [
            "bits/s",
            "s"
          ]
        },
        "results": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Result"
          }
        },
        "comparisons": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Comparison"
          }
        }
      },
      "required": [
        "name",
        "unit",
        "results",
        "comparisons"
      ],
      "additionalProperties": false
    },
    "Result": {
      "type": "object",
      "properties": {
        "result": {
          "type": "number"
        },
        "implementation": {
          "type": "string"
        },
        "transportStack": {
          "type": "string"
        },
        "version": {
          "type": "string"
        }
      },
      "required": [
        "result",
        "implementation",
        "transportStack",
        "version"
      ],
      "additionalProperties": false
    },
    "Comparison": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "result": {
          "type": "number"
        }
      },
      "required": [
        "name",
        "result"
      ],
      "additionalProperties": false
    }
  }
}
```
```json
{
  "$schema": "./benchmarks.schema.json",
  "benchmarks": [
    {
      "name": "Single Connection throughput – Upload",
      "unit": "bits/s",
      "comparisons": [
        {
          "name": "http",
          "result": 1234
        }
      ],
      "results": [
        {
          "result": 1100,
          "implementation": "go-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.24.2"
        },
        {
          "result": 1000,
          "implementation": "go-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.25.1"
        },
        {
          "result": 8234,
          "implementation": "go-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.26.2"
        },
        {
          "result": 4000,
          "implementation": "rust-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.50.0"
        },
        {
          "result": 8000,
          "implementation": "rust-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.51.0"
        },
        {
          "result": 6000,
          "implementation": "rust-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.52.0"
        },
        {
          "result": 9001,
          "implementation": "zig-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.0.1"
        },
        {
          "result": 9002,
          "implementation": "zig-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.0.2"
        },
        {
          "result": 201,
          "implementation": "js-libp2p",
          "transportStack": "tcp+noise+yamux",
          "version": "v0.41.0"
        },
        {
          "result": 302,
          "implementation": "js-libp2p",
          "transportStack": "tcp+noise+yamux",
          "version": "v0.42.0"
        },
        {
          "result": 501,
          "implementation": "js-libp2p",
          "transportStack": "tcp+noise+yamux",
          "version": "v0.43.0"
        }
      ]
    },
    {
      "name": "Single Connection 1 byte round trip latency",
      "unit": "s",
      "comparisons": [
        {
          "name": "http",
          "result": 1.234
        }
      ],
      "results": [
        {
          "result": 0.100,
          "implementation": "go-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.24.2"
        },
        {
          "result": 0.100,
          "implementation": "go-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.25.1"
        },
        {
          "result": 0.834,
          "implementation": "go-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.26.2"
        },
        {
          "result": 0.400,
          "implementation": "rust-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.50.0"
        },
        {
          "result": 0.800,
          "implementation": "rust-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.51.0"
        },
        {
          "result": 0.600,
          "implementation": "rust-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.52.0"
        },
        {
          "result": 0.901,
          "implementation": "zig-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.0.1"
        },
        {
          "result": 0.902,
          "implementation": "zig-libp2p",
          "transportStack": "quic-v1",
          "version": "v0.0.2"
        },
        {
          "result": 0.201,
          "implementation": "js-libp2p",
          "transportStack": "tcp+noise+yamux",
          "version": "v0.41.0"
        },
        {
          "result": 0.302,
          "implementation": "js-libp2p",
          "transportStack": "tcp+noise+yamux",
          "version": "v0.42.0"
        },
        {
          "result": 0.501,
          "implementation": "js-libp2p",
          "transportStack": "tcp+noise+yamux",
          "version": "v0.43.0"
        }
      ]
    }
  ]
}
```

> **Review comment** (on `"Single Connection throughput – Upload"`): Where do I see n, p0, p95, p100, and avg?

> **Review comment** (on the `"http"` comparison): I assume https instead of http? Also, do we want to be more clear than just "https"? Should we specify whether this is using the Go standard library, the Rust standard library, curl, etc.? (And if so, I assume the version of Go, Rust, curl, etc. matters.) At which point is a "comparison" very different from a result?
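To turn a benchmark's results into the chart described at the top (one line per implementation+transport, x = version, y = result), a grouping pass might look like this sketch; the `toSeries` name and the `implementation/transportStack` key format are my own choices:

```typescript
type Result = { result: number; implementation: string; transportStack: string; version: string };

// Group results into one dashboard series per implementation+transport
// combination, with (version, value) points kept in input order.
function toSeries(results: Result[]): Map<string, Array<{ version: string; value: number }>> {
  const series = new Map<string, Array<{ version: string; value: number }>>();
  for (const r of results) {
    const key = `${r.implementation}/${r.transportStack}`;
    const points = series.get(key) ?? [];
    points.push({ version: r.version, value: r.result });
    series.set(key, points);
  }
  return series;
}
```

Each map entry is one line on the graph; the dashboard's selection and filtering then reduce to choosing which keys to draw.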
> **Review comment** (on `n_times`): Is this the number of times one executes a libp2p perf round trip (open stream, send number of bytes to receive, send bytes, receive bytes) per connection? Or is this the number of times one should execute the whole benchmark? In case of the former, we still need a parameter for how often the binary should execute the whole test; otherwise it cannot emit aggregates like avg, min, max, or p95.

> **Reply:** Yes. I was thinking this would be shared amongst all connections. So you could have n_times = 100 and parallel_connections = 10 and have an average of 10 round trips per connection, though some may have more or fewer in practice.

> **Reply:** Here are all the params the msquic tool accepts: https://github.com/microsoft/msquic/tree/main/src/perf