Add perf dashboard #159
Conversation
| parameter | description |
| --- | --- |
| parallel_connections | number of parallel connections to use |
| upload_bytes | number of bytes to upload to the server per connection |
| download_bytes | number of bytes to download from the server per connection |
| n_times | number of times to do the perf round trip |
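For illustration, these parameters could be modeled on the consumer side roughly like the Go struct below. This is a sketch only, assuming the field names from the table above; it is not the PR's finalized schema.

```go
package schema

// RunParams models the test parameters from the table above.
// Sketch for illustration only; the actual schema is still under discussion.
type RunParams struct {
	ParallelConnections int   `json:"parallel_connections"` // connections to open concurrently
	UploadBytes         int64 `json:"upload_bytes"`         // bytes to upload per connection
	DownloadBytes       int64 `json:"download_bytes"`       // bytes to download per connection
	NTimes              int   `json:"n_times"`              // perf round trips to perform
}
```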
Is this the number of times one executes a libp2p perf round trip (open stream, send the number of bytes to receive, send bytes, receive bytes) per connection? Or is this the number of times one should execute the whole benchmark?

In case of the former, we still need a parameter for how often the binary should execute the whole test; otherwise it cannot emit aggregates like avg, min, max, or p95.
> this the number of times one should execute the whole benchmark

Yes. I was thinking this would be shared amongst all connections. So you could, say, have n_times = 100 and parallel_connections = 10, and have an average of 10 round trips per connection, but some may have more or fewer in practice.
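A minimal Go sketch of that sharing model, assuming a hypothetical `doRoundTrip` helper: all connections draw from one shared budget of `n_times` round trips, so a faster connection naturally ends up doing more of them.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// doRoundTrip stands in for one libp2p perf round trip (hypothetical helper).
func doRoundTrip(conn int) {}

func main() {
	const nTimes, parallelConnections = 100, 10
	var remaining int64 = nTimes
	var wg sync.WaitGroup
	for c := 0; c < parallelConnections; c++ {
		wg.Add(1)
		go func(conn int) {
			defer wg.Done()
			// Each connection keeps drawing work from the shared budget,
			// so per-connection round-trip counts can differ in practice.
			for atomic.AddInt64(&remaining, -1) >= 0 {
				doRoundTrip(conn)
			}
		}(c)
	}
	wg.Wait()
	fmt.Println("completed", nTimes, "round trips across", parallelConnections, "connections")
}
```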
Here are all the params the msquic tool accepts: https://github.com/microsoft/msquic/tree/main/src/perf
See libp2p/test-plans#159 for schema discussion.
Status update: the latest iteration of the rust-libp2p perf client now prints the following JSON:

```json
{
  "benchmarks": [
    {
      "name": "Single Connection throughput – Upload",
      "result": "1270317.353942249",
      "unit": "bits/s"
    },
    {
      "name": "Single Connection 1 byte round trip latency 100 runs 95th percentile",
      "result": "0.138944185",
      "unit": "s"
    }
  ]
}
```
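For downstream tooling such as the dashboard, output in this shape could be decoded with structs like the following. This is a sketch matching the sample above, not a committed interface; note that `result` is a string in the sample, so it is decoded as a string here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Benchmark matches one entry in the JSON printed above.
type Benchmark struct {
	Name   string `json:"name"`
	Result string `json:"result"` // a string in the sample output, not a number
	Unit   string `json:"unit"`
}

// Report matches the top-level object.
type Report struct {
	Benchmarks []Benchmark `json:"benchmarks"`
}

func main() {
	sample := `{"benchmarks":[{"name":"Single Connection throughput – Upload","result":"1270317.353942249","unit":"bits/s"}]}`
	var r Report
	if err := json.Unmarshal([]byte(sample), &r); err != nil {
		panic(err)
	}
	fmt.Println(r.Benchmarks[0].Name, r.Benchmarks[0].Result, r.Benchmarks[0].Unit)
}
```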
Sorry for the confusion in having two schema files. The correct one is
I think this PR is really useful for defining the end state and making sure we agree on the interface.

It wasn't clear to me from the schemas where we define the "stat" (e.g., avg/p0/p95/p100).
"unit": "bits/s", | ||
"comparisons": [ | ||
{ | ||
"name": "http", |
I assume https instead of http?

Also, do we want to be more specific than just "https"? Should we specify whether this is using the Go standard library, the Rust standard library, curl, etc.? (And if so, I assume the version of Go, Rust, curl, etc. matters.) At which point is a "comparison" very different from a result?
"$schema": "./benchmarks.schema.json", | ||
"benchmarks": [ | ||
{ | ||
"name": "Single Connection throughput – Upload", |
Where do I see n, p0, p95, p100, and avg?
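One hypothetical way to make those explicit, rather than encoding them in the `name` string, would be dedicated fields on each benchmark entry. This is purely an illustration, not the schema proposed in this PR.

```go
package schema

// BenchmarkEntry is a hypothetical extension of a benchmark entry with
// explicit stat fields, instead of encoding "95th percentile" etc. in the name.
type BenchmarkEntry struct {
	Name   string  `json:"name"`
	Unit   string  `json:"unit"`
	N      int     `json:"n"`    // number of runs the stat was computed over
	Stat   string  `json:"stat"` // e.g. "avg", "p0", "p95", "p100"
	Result float64 `json:"result"`
}
```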
```markdown
The binary should report the following:

* Stats on the length of the run:
```
Suggested change:

```diff
- * Stats on the length of the run:
+ * Stats on the durations of the `n_times` runs:
```

I assume that is correct (and I think it makes the meaning clearer).
```markdown
The binary should report the following:

* Stats on the length of the run:
  * Total, Avg, Min, Max, p95
```
Rather than special-casing Min / Max, just do p0 and p100?
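That works because p0 and p100 are the minimum and maximum by definition. A small nearest-rank percentile sketch in Go makes this concrete:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0..100) of samples using the
// nearest-rank method; p=0 yields the minimum and p=100 the maximum.
func percentile(samples []float64, p float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	if p <= 0 {
		return s[0]
	}
	rank := int(math.Ceil(p / 100 * float64(len(s)))) // 1-indexed rank
	return s[rank-1]
}

func main() {
	runs := []float64{1.2, 0.9, 3.4, 2.1, 1.7}
	// p0 == min (0.9), p100 == max (3.4); no special-casing needed.
	fmt.Println(percentile(runs, 0), percentile(runs, 95), percentile(runs, 100))
}
```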
```markdown
* Stats on the length of the run:
  * Total, Avg, Min, Max, p95
* Round trips per second:
```
I'm trying to follow what this means.

I think we probably need some vocabulary here. I'm open to different terms, but I'm assuming we have a "test run" which will do `n_times` round trips.

I assume this means that at the end of the "test run" we emit:

* avg number of round trips per second that were completed
* p0 number of round trips per second
* p100 number of round trips per second
* p95 number of round trips per second
* etc.

This means that if the benchmark runner does a "test run" that lasts ~10 seconds, it will track, for each of those seconds, the number of round trips that complete, so that it can then compute the avg/p0/p95/p100 stats. Is that right?

Also, I assume the benchmark doesn't care how many concurrent connections there are. It just tracks how many round trips complete each second so it can compute the stats.
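If that reading is right, the tracking could be as simple as bucketing completion timestamps by elapsed second and computing the stats over the buckets, regardless of connection count. A sketch under that assumption:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// roundTripsPerSecond buckets completion timestamps by elapsed second and
// returns the per-second counts, irrespective of which connection
// produced each completion.
func roundTripsPerSecond(start time.Time, completions []time.Time) []float64 {
	buckets := map[int]int{}
	maxSec := 0
	for _, t := range completions {
		sec := int(t.Sub(start) / time.Second)
		buckets[sec]++
		if sec > maxSec {
			maxSec = sec
		}
	}
	counts := make([]float64, maxSec+1)
	for sec, n := range buckets {
		counts[sec] = float64(n)
	}
	return counts
}

func main() {
	start := time.Now()
	// Fake completion times: 3 round trips in second 0, 2 in second 1.
	completions := []time.Time{
		start.Add(100 * time.Millisecond),
		start.Add(400 * time.Millisecond),
		start.Add(900 * time.Millisecond),
		start.Add(1200 * time.Millisecond),
		start.Add(1800 * time.Millisecond),
	}
	counts := roundTripsPerSecond(start, completions)
	sort.Float64s(counts) // sorted ascending: p0 is first, p100 is last
	avg := 0.0
	for _, c := range counts {
		avg += c
	}
	avg /= float64(len(counts))
	fmt.Printf("avg=%.1f p0=%.0f p100=%.0f\n", avg, counts[0], counts[len(counts)-1])
}
```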
I am closing here in favor of #163, which is further along. Let's continue any discussions there.
Opening your branch @MarcoPolo as a draft pull request in order to comment.