Add Jmh benchmark submodule #109

Merged
merged 1 commit into from
Jun 3, 2021

@spkrka spkrka commented Jun 3, 2021

Summary:
The distribution metric is faster when it's not contended,
but degrades during heavy contention.

Analysis:
Histogram throughput scales almost linearly with the number of threads up to 4
(18, 35 and 53 ops/us) and maxes out at about 53 ops/us, both with 4 and 8 threads.
Contention does not appear to be an issue there.

Distribution throughput is best when running with a single thread,
maxing out at 47 ops/us, and drops to 13 ops/us when running
with 4 threads. Going to 8 threads actually increases the throughput
again, to 26 ops/us.

With a single thread, histogram runs in 0.055 us/op on average
and distribution runs in 0.021 us/op, which is more than twice as fast.

However, when running with 4 threads concurrently, the distribution
average time is 0.302 us/op (14x slower than single-threaded), and 0.325 us/op
(15x slower) with 8 threads. Histogram runs at 0.073 us/op with 4 threads
(1.3x slower).

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                                            Mode       Cnt      Score    Error   Units
DistributionBenchmark.dist1                         thrpt        10     47.352 ±  0.815  ops/us
DistributionBenchmark.dist2                         thrpt        10     16.576 ±  0.142  ops/us
DistributionBenchmark.dist4                         thrpt        10     13.032 ±  0.244  ops/us
DistributionBenchmark.dist8                         thrpt        10     26.443 ±  0.857  ops/us
DistributionBenchmark.hist1                         thrpt        10     18.029 ±  0.069  ops/us
DistributionBenchmark.hist2                         thrpt        10     35.253 ±  0.570  ops/us
DistributionBenchmark.hist4                         thrpt        10     52.909 ±  1.103  ops/us
DistributionBenchmark.hist8                         thrpt        10     53.726 ±  1.669  ops/us
DistributionBenchmark.dist1                          avgt        10      0.021 ±  0.001   us/op
DistributionBenchmark.dist2                          avgt        10      0.127 ±  0.007   us/op
DistributionBenchmark.dist4                          avgt        10      0.302 ±  0.012   us/op
DistributionBenchmark.dist8                          avgt        10      0.325 ±  0.030   us/op
DistributionBenchmark.hist1                          avgt        10      0.055 ±  0.001   us/op
DistributionBenchmark.hist2                          avgt        10      0.057 ±  0.001   us/op
DistributionBenchmark.hist4                          avgt        10      0.073 ±  0.003   us/op
DistributionBenchmark.hist8                          avgt        10      0.148 ±  0.006   us/op
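
The ratios quoted in the analysis can be rechecked directly from the scores in the table above. Below is a minimal sketch in plain Java that derives the single-thread to 4-thread slowdown from the avgt rows and the scaling efficiency from the thrpt rows. The class and method names are hypothetical and not part of this PR; the input values are copied from the table.

```java
public class BenchmarkRatios {

    /** How many times slower the contended run is than the baseline (both in us/op). */
    public static double slowdown(double baselineUsPerOp, double contendedUsPerOp) {
        return contendedUsPerOp / baselineUsPerOp;
    }

    /**
     * Fraction of ideal linear scaling achieved with n threads (scores in ops/us).
     * 1.0 means throughput grew n-fold; values near 0 mean threads get in each other's way.
     */
    public static double scalingEfficiency(double singleThreadOpsPerUs,
                                           double nThreadOpsPerUs,
                                           int threads) {
        return nThreadOpsPerUs / (threads * singleThreadOpsPerUs);
    }

    public static void main(String[] args) {
        // avgt rows: dist1 = 0.021, dist4 = 0.302, hist1 = 0.055, hist4 = 0.073
        System.out.println("dist slowdown 1 -> 4 threads: " + slowdown(0.021, 0.302));
        System.out.println("hist slowdown 1 -> 4 threads: " + slowdown(0.055, 0.073));

        // thrpt rows: dist1 = 47.352, dist4 = 13.032, hist1 = 18.029, hist4 = 52.909
        System.out.println("dist scaling efficiency at 4 threads: "
                + scalingEfficiency(47.352, 13.032, 4));
        System.out.println("hist scaling efficiency at 4 threads: "
                + scalingEfficiency(18.029, 52.909, 4));
    }
}
```

Reading average time and throughput side by side helps here: the slowdown shows the per-operation cost under contention, while the scaling efficiency shows how much of the added thread count turns into extra throughput.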

@ao2017 ao2017 merged commit 0412962 into master Jun 3, 2021
@delete-merged-branch delete-merged-branch bot deleted the krka/jmh branch June 3, 2021 19:20