Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ed25519_verify_cpu: remove allocation in hot path #3941

Merged
merged 1 commit into from
Dec 5, 2024

Conversation

alessandrod
Copy link

The old code was a remnant from a time when we tried to "balance" the amount of work across threads by re-batching and therefore allocating.

This is a hot path. Allocations are bad. Let work stealing work its magic.

before

running 8 tests
test bench_sigverify_high_packets_large_batch   ... bench:  16,990,766.60 ns/iter (+/- 3,061,142.38)
test bench_sigverify_high_packets_small_batch   ... bench:  17,250,462.50 ns/iter (+/- 2,758,672.55)
test bench_sigverify_low_packets_large_batch    ... bench:   9,067,963.60 ns/iter (+/- 136,488.70)
test bench_sigverify_low_packets_small_batch    ... bench:   9,054,538.40 ns/iter (+/- 125,983.52)
test bench_sigverify_medium_packets_large_batch ... bench:  17,483,717.50 ns/iter (+/- 6,094,133.22)
test bench_sigverify_medium_packets_small_batch ... bench:  18,705,141.90 ns/iter (+/- 5,872,418.63)
test bench_sigverify_simple                     ... bench:  20,167,362.70 ns/iter (+/- 2,705,726.58)
test bench_sigverify_uneven                     ... bench:   7,192,356.90 ns/iter (+/- 815,398.52)

after

running 8 tests
test bench_sigverify_high_packets_large_batch   ... bench:  10,494,283.30 ns/iter (+/- 852,231.77)
test bench_sigverify_high_packets_small_batch   ... bench:  10,249,499.70 ns/iter (+/- 226,444.85)
test bench_sigverify_low_packets_large_batch    ... bench:     465,018.57 ns/iter (+/- 21,994.14)
test bench_sigverify_low_packets_small_batch    ... bench:     461,375.30 ns/iter (+/- 24,855.80)
test bench_sigverify_medium_packets_large_batch ... bench:   2,650,445.62 ns/iter (+/- 177,836.54)
test bench_sigverify_medium_packets_small_batch ... bench:   2,779,869.52 ns/iter (+/- 145,892.43)
test bench_sigverify_simple                     ... bench:     768,623.86 ns/iter (+/- 56,839.41)
test bench_sigverify_uneven                     ... bench:   1,325,557.05 ns/iter (+/- 92,470.47)

The old code was a remnant from a time when we tried to "balance" the
amount of work across threads by re-batching and therefore allocating.

This is a hot path. Allocations are bad. Let work stealing work its
magic.
@alessandrod
Copy link
Author

on a single node loaded with bench-tps, almost doubles the amount of valid packets that make it past sigverify

Screenshot 2024-12-05 at 9 22 18 pm

@alessandrod alessandrod requested a review from bw-solana December 5, 2024 13:03
@bw-solana
Copy link

The work stealing can happen within a packet batch?

@alessandrod
Copy link
Author

The work stealing can happen within a packet batch?

Yes because we're flattening

@apfitzge
Copy link

apfitzge commented Dec 5, 2024

The work stealing can happen within a packet batch?

Yes because we're flattening

Oh. Hmm I thought this was not true, but it is: https://docs.rs/rayon/latest/rayon/iter/trait.ParallelIterator.html#method.flatten

Copy link

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@alessandrod alessandrod added the v2.1 Backport to v2.1 branch label Dec 5, 2024
@alessandrod alessandrod merged commit a53a876 into anza-xyz:master Dec 5, 2024
41 checks passed
Copy link

mergify bot commented Dec 5, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

mergify bot pushed a commit that referenced this pull request Dec 5, 2024
The old code was a remnant from a time when we tried to "balance" the
amount of work across threads by re-batching and therefore allocating.

This is a hot path. Allocations are bad. Let work stealing work its
magic.

(cherry picked from commit a53a876)
steviez pushed a commit that referenced this pull request Jan 8, 2025
…3941) (#3945)

ed25519_verify_cpu: remove allocation in hot path (#3941)

The old code was a remnant from a time when we tried to "balance" the
amount of work across threads by re-batching and therefore allocating.

This is a hot path. Allocations are bad. Let work stealing work its
magic.

(cherry picked from commit a53a876)

Co-authored-by: Alessandro Decina <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v2.1 Backport to v2.1 branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants