
Add perf dashboard #159

Closed
wants to merge 8 commits

Conversation

@mxinden (Member) commented Mar 24, 2023

Opening your branch @MarcoPolo as a draft pull request in order to comment.

| Parameter | Description |
| --- | --- |
| parallel_connections | number of parallel connections to use |
| upload_bytes | number of bytes to upload to server per connection |
| download_bytes | number of bytes to download from server per connection |
| n_times | number of times to do the perf round trip |
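For illustration, the parameters in the table above could map onto a CLI like the following sketch. The flag names simply mirror the table; the actual perf binary's interface may differ.

```python
# Sketch of a CLI surface for the parameters in the table above.
# Flag names are assumed from the table, not taken from the real binary.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="perf client (sketch)")
    parser.add_argument("--parallel-connections", type=int, default=1,
                        help="number of parallel connections to use")
    parser.add_argument("--upload-bytes", type=int, default=0,
                        help="number of bytes to upload to server per connection")
    parser.add_argument("--download-bytes", type=int, default=0,
                        help="number of bytes to download from server per connection")
    parser.add_argument("--n-times", type=int, default=1,
                        help="number of times to do the perf round trip")
    return parser
```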
mxinden (Member Author):

Is this the number of times one executes a libp2p perf round trip (open stream, send the number of bytes to receive, send the bytes, receive the bytes) per connection? Or is this the number of times one should execute the whole benchmark?

In case of the former, we still need a parameter for how often the binary should execute the whole test; otherwise it cannot emit aggregates like avg, min, max or p95.

Contributor:

> this the number of times one should execute the whole benchmark

Yes. I was thinking this would be shared amongst all connections. So you could, say, have n_times = 100 and parallel_connections = 10 and have an average of 10 round trips per connection, though some may have more or fewer in practice.
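The sharing described above can be sketched as a single round-trip budget drained by all connections. This is a simplified stand-in (sequential, round-robin); in a real client, faster connections would naturally take more rounds from the shared budget.

```python
# Sketch: n_times is a budget shared across parallel connections.
# do_round is a placeholder for one perf round trip on connection `conn`.
def run_with_shared_budget(n_times, parallel_connections, do_round):
    remaining = n_times                      # budget shared by all connections
    counts = [0] * parallel_connections      # rounds completed per connection
    conn = 0
    while remaining > 0:
        do_round(conn)                       # perform one round trip (stub)
        counts[conn] += 1
        remaining -= 1
        conn = (conn + 1) % parallel_connections  # round-robin stand-in
    return counts
```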

Contributor:

Here are all the params the msquic tool accepts: https://github.com/microsoft/msquic/tree/main/src/perf

mxinden added a commit to mxinden/rust-libp2p that referenced this pull request Mar 24, 2023
mxinden commented Mar 24, 2023

Status update: the latest iteration of the rust-libp2p perf client now prints the following JSON:

{
  "benchmarks": [
    {
      "name": "Single Connection throughput – Upload",
      "result": "1270317.353942249",
      "unit": "bits/s"
    },
    {
      "name": "Single Connection 1 byte round trip latency 100 runs 95th percentile",
      "result": "0.138944185",
      "unit": "s"
    }
  ]
}
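A minimal sketch of producing output in this shape, with field names and sample values taken from the printout above (the actual schema is still under discussion in this thread):

```python
# Sketch: serialise benchmark results into the JSON shape shown above.
# Field names ("name", "result", "unit") come from the sample output.
import json

def benchmark_entry(name: str, result: float, unit: str) -> dict:
    # "result" is a string in the sample output, so stringify here
    return {"name": name, "result": str(result), "unit": unit}

report = {
    "benchmarks": [
        benchmark_entry("Single Connection throughput – Upload",
                        1270317.353942249, "bits/s"),
        benchmark_entry(
            "Single Connection 1 byte round trip latency 100 runs 95th percentile",
            0.138944185, "s"),
    ]
}
print(json.dumps(report, indent=2))
```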

@MarcoPolo (Contributor):

Sorry for the confusion in having two schema files. The correct one is perf-dashboard/benchmarks.schema.json. The output above is for the wrong schema.

@BigLep (Contributor) left a comment:

I think this PR is really useful for defining the end state and making sure we agree on the interface.

It wasn't clear to me from the schemas where we define the "stat" (e.g., avg/p0/p95/p100).

"unit": "bits/s",
"comparisons": [
{
"name": "http",
Contributor:

I assume https instead of http?

Also, do we want to be more specific than just "https"? Should we specify whether this is using the Go standard library, the Rust standard library, curl, etc.? (And if so, I assume the version of Go, Rust, curl, etc. matters.) At which point, is a "comparison" very different from a result?

"$schema": "./benchmarks.schema.json",
"benchmarks": [
{
"name": "Single Connection throughput – Upload",
Contributor:

Where do I see n, p0, p95, p100, and avg?


The binary should report the following:

* Stats on the length of the run:
Contributor:

Suggested change
* Stats on the length of the run:
* Stats on the durations of the `n_times` runs:

I assume that is correct (and I think it makes this clearer).

The binary should report the following:

* Stats on the length of the run:
* Total, Avg, Min, Max, p95
Contributor:

Rather than special-casing Min / Max, just use p0 and p100?
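The suggestion works because, under the common nearest-rank definition, p0 and p100 coincide with the minimum and maximum of the sample. A small sketch (sample durations are made up for illustration):

```python
# Sketch: nearest-rank percentiles over run durations.
# p0 and p100 coincide with min and max, so neither needs special-casing.
def percentile(sorted_samples, p):
    assert sorted_samples, "need at least one sample"
    rank = round(p / 100 * (len(sorted_samples) - 1))
    return sorted_samples[rank]

durations = sorted([0.12, 0.10, 0.15, 0.11, 0.30])  # seconds, illustrative
stats = {
    "p0": percentile(durations, 0),      # == min(durations)
    "p95": percentile(durations, 95),
    "p100": percentile(durations, 100),  # == max(durations)
    "avg": sum(durations) / len(durations),
}
```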


* Stats on the length of the run:
* Total, Avg, Min, Max, p95
* Round trips per second:
@BigLep (Contributor) commented Mar 25, 2023:

I'm trying to follow what this means.

I think we probably need some vocabulary here. I'm open to different terms, but I'm assuming we have a "test run" which will do n_times round trips.

I assume this means that at the end of the "test run" we emit:

* avg number of round trips per second that were completed
* p0 number of round trips per second
* p95 number of round trips per second
* p100 number of round trips per second
* etc.

This means that if the benchmark runner does a "test run" that lasts for ~10 seconds, it will be tracking, for each of those seconds, the number of round trips that complete in that second, so that it can then compute the avg/p0/p95/p100 stats.

Is that right?

Also, I assume the benchmark doesn't care how many concurrent connections there are. It just tracks how many round trips complete each second so it can compute the stats.
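The interpretation above can be sketched as bucketing round-trip completion timestamps into one-second windows, independent of how many connections produced them. The timestamps below are made up for illustration.

```python
# Sketch: bucket round-trip completion timestamps (seconds since the
# start of the "test run") into one-second windows, then compute stats
# over rounds-per-second regardless of connection count.
from collections import Counter
import math

def rounds_per_second(completion_times):
    buckets = Counter(math.floor(t) for t in completion_times)
    # include empty seconds up to the last completion
    last = math.floor(max(completion_times))
    return [buckets.get(s, 0) for s in range(last + 1)]

counts = rounds_per_second([0.2, 0.7, 1.1, 1.5, 1.9, 3.4])
avg = sum(counts) / len(counts)  # avg/p0/p95/p100 would be computed over counts
```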


mxinden commented Apr 24, 2023

I am closing here in favor of #163 which is further along. Let's continue any discussions there.

3 participants