Add perf dashboard #159

Closed · wants to merge 8 commits
62 changes: 62 additions & 0 deletions perf-dashboard/README.md
@@ -0,0 +1,62 @@
# Perf Dashboard [wip]

This is an outline of what I'd like our perf dashboard to look like.

A lot of this is inspired by the [MsQuic
dashboard](https://microsoft.github.io/msquic/). Please look at that first.

For each combination of libp2p implementation, version, and transport we would
have numbers covering:
1. Download/Upload throughput
2. Request latency
3. Requests per second (for some request/response protocol)
4. Handshakes per second (useful to identify overhead in connection
initialization).
5. Memory usage.

The y-axis on the graphs is the value for the above tests; the x-axis is the
specific version. Different lines represent different implementation+transport
combinations.

The dashboards should be selectable and filterable.

# Other transports (iperf/http)

We have to be careful to compare apples to apples. A raw iperf number might be
misleading here because no application will ever hit those numbers, since real
applications will at least want encryption on their connections. I would
suggest not including iperf or a plain-HTTP comparison. Having HTTPS might be
okay.

# Example dashboard

https://observablehq.com/@realmarcopolo/libp2p-perf

The dashboard automatically pulls data from this repo to display it.

It currently pulls `example-data.json`. The schema of this data is defined in
`benchmarks.schema.json` and `benchmark-result-type.ts`.
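
For illustration, a minimal sketch of how the dashboard might load and type this data (the raw-file URL below is hypothetical, not a decided location):

```typescript
import type { BenchmarkResults } from "./benchmark-result-type";

// Hypothetical raw-file URL pointing at this repo's example-data.json.
const DATA_URL =
  "https://raw.githubusercontent.com/libp2p/test-plans/master/perf-dashboard/example-data.json";

async function loadBenchmarks(): Promise<BenchmarkResults> {
  const res = await fetch(DATA_URL);
  if (!res.ok) throw new Error(`Failed to fetch benchmark data: ${res.status}`);
  return (await res.json()) as BenchmarkResults;
}
```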

# Benchmark runner

This is the component that runs the benchmark binaries and produces the
benchmark data (matching `benchmarks.schema.json`/`benchmark-result-type.ts`).
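
As a sketch of what the runner could look like (assuming a Node-based runner, that the flag names match the table in the next section, and that each binary prints schema-conformant JSON on stdout; none of this is decided here):

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Run one benchmark binary and parse its stdout as benchmark data.
// The stdout-as-JSON contract is an assumption for this sketch.
async function runBenchmark(binaryPath: string): Promise<unknown> {
  const { stdout } = await execFileAsync(binaryPath, [
    "--parallel_connections", "10",
    "--upload_bytes", String(100 * 1024 * 1024),
    "--download_bytes", String(100 * 1024 * 1024),
    "--n_times", "100",
  ]);
  return JSON.parse(stdout);
}
```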

# Benchmark binary

This is per implementation. It's a binary that accepts the following flags:

| Flag                 | Description                                             |
| -------------------- | ------------------------------------------------------ |
| parallel_connections | number of parallel connections to use |
| upload_bytes | number of bytes to upload to server per connection |
| download_bytes | number of bytes to download from server per connection |
| n_times | number of times to do the perf round trip |
**Comment (Member Author):** Is this the number of times one executes a libp2p perf round trip (open stream, send # bytes to receive, send bytes, receive bytes) per connection? Or is this the number of times one should execute the whole benchmark?

In case of the former, we still need a parameter for how often the binary should execute the whole test; otherwise it cannot emit aggregates like avg, min, max, or p95.

**Reply (Contributor):**

> this the number of times one should execute the whole benchmark

Yes. I was thinking this would be shared amongst all connections. So you could, say, have `n_times = 100` and `parallel_connections = 10`, giving an average of 10 round trips per connection, though some may have more or fewer in practice.

**Comment (Contributor):** Here are all the params the msquic tool accepts: https://github.com/microsoft/msquic/tree/main/src/perf

| close_connection | bool. Close the connection after a perf round trip |
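
A hypothetical invocation, assuming the flags map one-to-one onto CLI switches (the binary name and exact syntax are illustrative):

```sh
./libp2p-perf --parallel_connections 10 \
  --upload_bytes 104857600 \
  --download_bytes 104857600 \
  --n_times 100 \
  --close_connection false
```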

The binary should report the following:

* Stats on the length of the run:
**Comment (Contributor):** Suggested change: "Stats on the length of the run:" → "Stats on the durations of the `n_times` runs:". I assume that is correct (and I think it makes it clearer).

  * Total, Avg, Min, Max, p95
**Comment (Contributor):** Rather than special-casing Min / Max, just do p0 and p100?

* Round trips per second:
**Comment (@BigLep, Contributor, Mar 25, 2023):**

I'm trying to follow what this means.

I think we probably need some vocabulary here. I'm open to different terms, but I'm assuming we have a "test run" which will do `n_times` round trips.

I assume this means that at the end of the "test run" we emit:

* avg number of round trips per second that were completed
* p0 number of round trips per second
* p100 number of round trips per second
* p95 number of round trips per second
* etc.

This means that if the benchmark runner does a "test run" that lasts for ~10 seconds, it will be tracking, for each of those seconds, the number of round trips that complete, so that it can then compute the avg/p0/p95/p100 stats.

Is that right?

Also, I assume the benchmark doesn't care how many concurrent connections there are. It just tracks how many round trips complete each second so it can compute the stats.

  * Avg, Min, Max, p95

Maybe more? TODO...
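
To make the reporting concrete, here is a minimal sketch of computing these aggregates from the recorded round-trip durations (nearest-rank percentiles; all naming here is illustrative):

```typescript
// Nearest-rank percentile: p0 is the minimum, p100 the maximum.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank))];
}

// Aggregates over the durations (in seconds) of the n_times round trips.
function aggregate(durations: number[]) {
  const total = durations.reduce((a, b) => a + b, 0);
  return {
    total,
    avg: total / durations.length,
    min: percentile(durations, 0),   // p0
    max: percentile(durations, 100), // p100
    p95: percentile(durations, 95),
  };
}
```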
45 changes: 45 additions & 0 deletions perf-dashboard/benchmark-data-schema.json
@@ -0,0 +1,45 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://libp2p.io/benchmark.schema.json",
  "title": "Benchmark Results",
  "description": "Results from a benchmark run",
  "type": "object",
  "properties": {
    "benchmarks": {
      "description": "A list of benchmark results",
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "description": "The name of this benchmark",
            "type": "string"
          },
          "unit": {
            "description": "The unit for the result",
            "enum": [
              "bits/s",
              "s"
            ]
          },
          "result": {
            "description": "String encoded result. Parse as float64",
            "type": "string"
          },
          "implementation": {
            "description": "The libp2p implementation under test",
            "type": "string"
          },
          "stack": {
            "description": "The transport stack used",
            "type": "string"
          },
          "version": {
            "description": "The implementation version",
            "type": "string"
          }
        },
        "required": [
          "name",
          "unit",
          "result",
          "implementation",
          "stack",
          "version"
        ]
      }
    }
  },
  "required": [
    "benchmarks"
  ]
}
25 changes: 25 additions & 0 deletions perf-dashboard/benchmark-result-type.ts
@@ -0,0 +1,25 @@
export type BenchmarkResults = {
  benchmarks: Benchmark[],
  // For referencing this schema in JSON
  "$schema"?: string
};

export type Benchmark = {
  name: string,
  unit: "bits/s" | "s",
  results: Result[],
  comparisons: Comparison[]
};

export type Result = {
  result: number,
  implementation: string,
  transportStack: string,
  version: string
};

export type Comparison = {
  name: string,
  result: number
};
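
As a usage sketch of these types, grouping a benchmark's results into one dashboard line per implementation+transport, as described in the README (the function name and key format are illustrative):

```typescript
import type { Benchmark, Result } from "./benchmark-result-type";

// One chart line per implementation+transport; each point is (version, result).
function toChartLines(benchmark: Benchmark): Map<string, Result[]> {
  const lines = new Map<string, Result[]>();
  for (const r of benchmark.results) {
    const key = `${r.implementation} (${r.transportStack})`;
    const line = lines.get(key) ?? [];
    line.push(r);
    lines.set(key, line);
  }
  return lines;
}
```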
98 changes: 98 additions & 0 deletions perf-dashboard/benchmarks.schema.json
@@ -0,0 +1,98 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$ref": "#/definitions/BenchmarkResults",
  "definitions": {
    "BenchmarkResults": {
      "type": "object",
      "properties": {
        "benchmarks": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Benchmark"
          }
        },
        "$schema": {
          "type": "string"
        }
      },
      "required": [
        "benchmarks"
      ],
      "additionalProperties": false
    },
    "Benchmark": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "unit": {
          "type": "string",
          "enum": [
            "bits/s",
            "s"
          ]
        },
        "results": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Result"
          }
        },
        "comparisons": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/Comparison"
          }
        }
      },
      "required": [
        "name",
        "unit",
        "results",
        "comparisons"
      ],
      "additionalProperties": false
    },
    "Result": {
      "type": "object",
      "properties": {
        "result": {
          "type": "number"
        },
        "implementation": {
          "type": "string"
        },
        "transportStack": {
          "type": "string"
        },
        "version": {
          "type": "string"
        }
      },
      "required": [
        "result",
        "implementation",
        "transportStack",
        "version"
      ],
      "additionalProperties": false
    },
    "Comparison": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "result": {
          "type": "number"
        }
      },
      "required": [
        "name",
        "result"
      ],
      "additionalProperties": false
    }
  }
}
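
A small validation sketch, assuming a Node environment with the `ajv` package installed (Ajv v8 handles draft-07 schemas like this one by default; file paths are relative to this directory):

```typescript
import Ajv from "ajv";
import { readFileSync } from "node:fs";

const schema = JSON.parse(readFileSync("benchmarks.schema.json", "utf8"));
const data = JSON.parse(readFileSync("example-data.json", "utf8"));

// Compile the schema once, then validate the example data against it.
const ajv = new Ajv();
const validate = ajv.compile(schema);
if (!validate(data)) {
  console.error(validate.errors);
  process.exitCode = 1;
}
```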
161 changes: 161 additions & 0 deletions perf-dashboard/example-data.json
@@ -0,0 +1,161 @@
{
  "$schema": "./benchmarks.schema.json",
  "benchmarks": [
    {
      "name": "Single Connection throughput – Upload",
**Comment (Contributor):** Where do I see n, p0, p95, p100, and avg?

"unit": "bits/s",
"comparisons": [
{
"name": "http",
**Comment (Contributor):** I assume https instead of http?

Also, do we want to be more clear than just "https"? Should we specify whether this is using the Go standard library, the Rust standard library, curl, etc.? (And if so, I assume the version of Go, Rust, curl, etc. matters.) At which point is a "comparison" very different from a result?

"result": 1234
}
],
"results": [
{
"result": 1100,
"implementation": "go-libp2p",
"transportStack": "quic-v1",
"version": "v0.24.2"
},
{
"result": 1000,
"implementation": "go-libp2p",
"transportStack": "quic-v1",
"version": "v0.25.1"
},
{
"result": 8234,
"implementation": "go-libp2p",
"transportStack": "quic-v1",
"version": "v0.26.2"
},
{
"result": 4000,
"implementation": "rust-libp2p",
"transportStack": "quic-v1",
"version": "v0.50.0"
},
{
"result": 8000,
"implementation": "rust-libp2p",
"transportStack": "quic-v1",
"version": "v0.51.0"
},
{
"result": 6000,
"implementation": "rust-libp2p",
"transportStack": "quic-v1",
"version": "v0.52.0"
},
{
"result": 9001,
"implementation": "zig-libp2p",
"transportStack": "quic-v1",
"version": "v0.0.1"
},
{
"result": 9002,
"implementation": "zig-libp2p",
"transportStack": "quic-v1",
"version": "v0.0.2"
},
{
"result": 201,
"implementation": "js-libp2p",
"transportStack": "tcp+noise+yamux",
"version": "v0.41.0"
},
{
"result": 302,
"implementation": "js-libp2p",
"transportStack": "tcp+noise+yamux",
"version": "v0.42.0"
},
{
"result": 501,
"implementation": "js-libp2p",
"transportStack": "tcp+noise+yamux",
"version": "v0.43.0"
}
]
},
{
"name": "Single Connection 1 byte round trip latency",
"unit": "s",
"comparisons": [
{
"name": "http",
"result": 1.234
}
],
"results": [
{
"result": 0.100,
"implementation": "go-libp2p",
"transportStack": "quic-v1",
"version": "v0.24.2"
},
{
"result": 0.100,
"implementation": "go-libp2p",
"transportStack": "quic-v1",
"version": "v0.25.1"
},
{
"result": 0.834,
"implementation": "go-libp2p",
"transportStack": "quic-v1",
"version": "v0.26.2"
},
{
"result": 0.400,
"implementation": "rust-libp2p",
"transportStack": "quic-v1",
"version": "v0.50.0"
},
{
"result": 0.800,
"implementation": "rust-libp2p",
"transportStack": "quic-v1",
"version": "v0.51.0"
},
{
"result": 0.600,
"implementation": "rust-libp2p",
"transportStack": "quic-v1",
"version": "v0.52.0"
},
{
"result": 0.901,
"implementation": "zig-libp2p",
"transportStack": "quic-v1",
"version": "v0.0.1"
},
{
"result": 0.902,
"implementation": "zig-libp2p",
"transportStack": "quic-v1",
"version": "v0.0.2"
},
{
"result": 0.201,
"implementation": "js-libp2p",
"transportStack": "tcp+noise+yamux",
"version": "v0.41.0"
},
{
"result": 0.302,
"implementation": "js-libp2p",
"transportStack": "tcp+noise+yamux",
"version": "v0.42.0"
},
{
"result": 0.501,
"implementation": "js-libp2p",
"transportStack": "tcp+noise+yamux",
"version": "v0.43.0"
}
]
}
]
}