Add perf dashboard #159
Conversation
| parameter | description |
| --- | --- |
| parallel_connections | number of parallel connections to use |
| upload_bytes | number of bytes to upload to the server per connection |
| download_bytes | number of bytes to download from the server per connection |
| n_times | number of times to do the perf round trip |
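For illustration, these parameters could be modeled on the consumer side roughly like the Go struct below. This is a sketch only, assuming the field names from the table above; it is not the PR's finalized schema.

```go
package schema

// RunParams models the test parameters from the table above.
// Sketch for illustration only; the actual schema is still under discussion.
type RunParams struct {
	ParallelConnections int   `json:"parallel_connections"` // connections to open concurrently
	UploadBytes         int64 `json:"upload_bytes"`         // bytes to upload per connection
	DownloadBytes       int64 `json:"download_bytes"`       // bytes to download per connection
	NTimes              int   `json:"n_times"`              // perf round trips to perform
}
```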
Is this the number of times one executes a libp2p perf round trip (open stream, send the number of bytes to receive, send bytes, receive bytes) per connection? Or is this the number of times one should execute the whole benchmark?

In case of the former, we still need a parameter for how often the binary should execute the whole test; otherwise it cannot emit aggregates like avg, min, max, or p95.
> this the number of times one should execute the whole benchmark

Yes. I was thinking this would be shared amongst all connections. So you could, say, have n_times = 100 and parallel_connections = 10, and have an average of 10 round trips per connection, but some may have more or fewer in practice.
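A minimal Go sketch of that sharing model, assuming a hypothetical `doRoundTrip` helper: all connections draw from one shared budget of `n_times` round trips, so a faster connection naturally ends up doing more of them.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// doRoundTrip stands in for one libp2p perf round trip (hypothetical helper).
func doRoundTrip(conn int) {}

func main() {
	const nTimes, parallelConnections = 100, 10
	var remaining int64 = nTimes
	var wg sync.WaitGroup
	for c := 0; c < parallelConnections; c++ {
		wg.Add(1)
		go func(conn int) {
			defer wg.Done()
			// Each connection keeps drawing work from the shared budget,
			// so per-connection round-trip counts can differ in practice.
			for atomic.AddInt64(&remaining, -1) >= 0 {
				doRoundTrip(conn)
			}
		}(c)
	}
	wg.Wait()
	fmt.Println("completed", nTimes, "round trips across", parallelConnections, "connections")
}
```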
Here are all the params the msquic tool accepts: https://github.com/microsoft/msquic/tree/main/src/perf
See libp2p/test-plans#159 for schema discussion.
Status update: the latest iteration of the rust-libp2p perf client now prints the following JSON:

```json
{
  "benchmarks": [
    {
      "name": "Single Connection throughput – Upload",
      "result": "1270317.353942249",
      "unit": "bits/s"
    },
    {
      "name": "Single Connection 1 byte round trip latency 100 runs 95th percentile",
      "result": "0.138944185",
      "unit": "s"
    }
  ]
}
```
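For downstream tooling such as the dashboard, output in this shape could be decoded with structs like the following. This is a sketch matching the sample above, not a committed interface; note that `result` is a string in the sample, so it is decoded as a string here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Benchmark matches one entry in the JSON printed above.
type Benchmark struct {
	Name   string `json:"name"`
	Result string `json:"result"` // a string in the sample output, not a number
	Unit   string `json:"unit"`
}

// Report matches the top-level object.
type Report struct {
	Benchmarks []Benchmark `json:"benchmarks"`
}

func main() {
	sample := `{"benchmarks":[{"name":"Single Connection throughput – Upload","result":"1270317.353942249","unit":"bits/s"}]}`
	var r Report
	if err := json.Unmarshal([]byte(sample), &r); err != nil {
		panic(err)
	}
	fmt.Println(r.Benchmarks[0].Name, r.Benchmarks[0].Result, r.Benchmarks[0].Unit)
}
```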
Sorry for the confusion in having two schema files. The correct one is
I think this PR is really useful for defining the end state and making sure we agree on the interface.

It wasn't clear to me from the schemas where we define the "stat" (e.g., avg/p0/p95/p100).
"unit": "bits/s", | ||
"comparisons": [ | ||
{ | ||
"name": "http", |
I assume https instead of http?

Also, do we want to be more specific than just "https"? Should we specify whether this is using the Go standard library, the Rust standard library, curl, etc.? (And if so, I assume the version of Go, Rust, curl, etc. matters.) At which point is a "comparison" very different from a result?
"$schema": "./benchmarks.schema.json", | ||
"benchmarks": [ | ||
{ | ||
"name": "Single Connection throughput – Upload", |
Where do I see n, p0, p95, p100, and avg?
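One hypothetical way to make those explicit, rather than encoding them in the `name` string, would be dedicated fields on each benchmark entry. This is purely an illustration, not the schema proposed in this PR.

```go
package schema

// BenchmarkEntry is a hypothetical extension of a benchmark entry with
// explicit stat fields, instead of encoding "95th percentile" etc. in the name.
type BenchmarkEntry struct {
	Name   string  `json:"name"`
	Unit   string  `json:"unit"`
	N      int     `json:"n"`    // number of runs the stat was computed over
	Stat   string  `json:"stat"` // e.g. "avg", "p0", "p95", "p100"
	Result float64 `json:"result"`
}
```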
```markdown
The binary should report the following:

* Stats on the length of the run:
```
Suggested change:

```diff
- * Stats on the length of the run:
+ * Stats on the durations of the `n_times` runs:
```

I assume that is correct (and I think it makes the meaning clearer).
```markdown
The binary should report the following:

* Stats on the length of the run:
  * Total, Avg, Min, Max, p95
```
Rather than special-casing Min / Max, just do p0 and p100?
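That works because p0 and p100 are the minimum and maximum by definition. A small nearest-rank percentile sketch in Go makes this concrete:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0..100) of samples using the
// nearest-rank method; p=0 yields the minimum and p=100 the maximum.
func percentile(samples []float64, p float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	if p <= 0 {
		return s[0]
	}
	rank := int(math.Ceil(p / 100 * float64(len(s)))) // 1-indexed rank
	return s[rank-1]
}

func main() {
	runs := []float64{1.2, 0.9, 3.4, 2.1, 1.7}
	// p0 == min (0.9), p100 == max (3.4); no special-casing needed.
	fmt.Println(percentile(runs, 0), percentile(runs, 95), percentile(runs, 100))
}
```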
```markdown
* Stats on the length of the run:
  * Total, Avg, Min, Max, p95
* Round trips per second:
```
I'm trying to follow what this means.

I think we probably need some vocabulary here. I'm open to different terms, but I'm assuming we have a "test run" which will do `n_times` round trips.

I assume this means that at the end of the "test run" we emit:

* avg number of round trips per second that were completed
* p0 number of round trips per second
* p100 number of round trips per second
* p95 number of round trips per second
* etc.

This means that if the benchmark runner does a "test run" that lasts ~10 seconds, it will track, for each of those seconds, the number of round trips that complete, so that it can then compute the avg/p0/p95/p100 stats. Is that right?

Also, I assume the benchmark doesn't care how many concurrent connections there are. It just tracks how many round trips complete each second so it can compute the stats.
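If that reading is right, the tracking could be as simple as bucketing completion timestamps by elapsed second and computing the stats over the buckets, regardless of connection count. A sketch under that assumption:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// roundTripsPerSecond buckets completion timestamps by elapsed second and
// returns the per-second counts, irrespective of which connection
// produced each completion.
func roundTripsPerSecond(start time.Time, completions []time.Time) []float64 {
	buckets := map[int]int{}
	maxSec := 0
	for _, t := range completions {
		sec := int(t.Sub(start) / time.Second)
		buckets[sec]++
		if sec > maxSec {
			maxSec = sec
		}
	}
	counts := make([]float64, maxSec+1)
	for sec, n := range buckets {
		counts[sec] = float64(n)
	}
	return counts
}

func main() {
	start := time.Now()
	// Fake completion times: 3 round trips in second 0, 2 in second 1.
	completions := []time.Time{
		start.Add(100 * time.Millisecond),
		start.Add(400 * time.Millisecond),
		start.Add(900 * time.Millisecond),
		start.Add(1200 * time.Millisecond),
		start.Add(1800 * time.Millisecond),
	}
	counts := roundTripsPerSecond(start, completions)
	sort.Float64s(counts) // sorted ascending: p0 is first, p100 is last
	avg := 0.0
	for _, c := range counts {
		avg += c
	}
	avg /= float64(len(counts))
	fmt.Printf("avg=%.1f p0=%.0f p100=%.0f\n", avg, counts[0], counts[len(counts)-1])
}
```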
I am closing here in favor of #163, which is further along. Let's continue any discussions there.
Opening your branch @MarcoPolo as a draft pull request in order to comment.