BTree: inline more functions supporting iteration #85217

ssomers · 2021-05-12T09:13:01Z

According to the library/alloc benchmarks, these #[inline] might boost performance. They didn't seem to a year or so ago.

r? @Mark-Simulacrum

ssomers · 2021-05-12T09:17:45Z

Note that the ones on the deallocating_ functions don't appear to help (but mutable benchmarks are treacherous anyway) and that btree::map::clone_slim_100 just reverts the freak boost in #84904.

benchcmp a b --threshold 5
 name                                           a ns/iter  b ns/iter  diff ns/iter   diff %  speedup
 btree::map::clone_slim_100                     1,483      4,356             2,873  193.73%   x 0.34
 btree::map::find_seq_100                       9          10                    1   11.11%   x 0.90
 btree::map::first_and_last_0                   10         12                    2   20.00%   x 0.83
 btree::map::first_and_last_10k                 56         65                    9   16.07%   x 0.86
 btree::map::insert_rand_10_000                 38         44                    6   15.79%   x 0.86
 btree::map::iter_100                           4,292      4,523               231    5.38%   x 0.95
 btree::map::iter_10k                           5,646      6,001               355    6.29%   x 0.94
 btree::map::iteration_1000                     4,235      3,564              -671  -15.84%   x 1.19
 btree::map::iteration_20                       68         62                   -6   -8.82%   x 1.10
 btree::map::iteration_mut_1000                 4,304      4,013              -291   -6.76%   x 1.07
 btree::set::difference_random_100_vs_100       997        771                -226  -22.67%   x 1.29
 btree::set::difference_random_10k_vs_100       68,320     59,853           -8,467  -12.39%   x 1.14
 btree::set::difference_random_10k_vs_10k       218,385    191,578         -26,807  -12.28%   x 1.14
 btree::set::difference_staggered_100_vs_100    1,146      841                -305  -26.61%   x 1.36
 btree::set::difference_staggered_100_vs_10k    2,352      2,179              -173   -7.36%   x 1.08
 btree::set::difference_staggered_10k_vs_10k    113,332    82,685          -30,647  -27.04%   x 1.37
 btree::set::intersection_100_pos_vs_100_neg    16         17                    1    6.25%   x 0.94
 btree::set::intersection_10k_neg_vs_100_pos    17         18                    1    5.88%   x 0.94
 btree::set::intersection_10k_neg_vs_10k_pos    18         19                    1    5.56%   x 0.95
 btree::set::intersection_10k_pos_vs_10k_neg    18         19                    1    5.56%   x 0.95
 btree::set::intersection_random_100_vs_100     713        626                 -87  -12.20%   x 1.14
 btree::set::intersection_staggered_100_vs_100  661        617                 -44   -6.66%   x 1.07
 btree::set::intersection_staggered_10k_vs_10k  64,105     60,344           -3,761   -5.87%   x 1.06
 btree::set::is_subset_100_vs_100               533        599                  66   12.38%   x 0.89
 btree::set::is_subset_10k_vs_10k               62,611     59,448           -3,163   -5.05%   x 1.05

Mark-Simulacrum · 2021-05-14T22:04:44Z

@bors try @rust-timer queue

I am generally not inclined to merge this - #[inline] on these functions increases the chances of getting inlined, probably, but it doesn't change the ability (they're already generic) of LLVM to do so. I'm not usually inclined to mess with LLVM heuristics in std code. But, we'll see if there's any change on compiler benchmarks...

rust-timer · 2021-05-14T22:04:45Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-05-14T22:04:53Z

⌛ Trying commit 154e6d1 with merge 51fbd179ae85719260f05675a60575ad1e15790f...

bors · 2021-05-14T23:05:23Z

☀️ Try build successful - checks-actions
Build commit: 51fbd179ae85719260f05675a60575ad1e15790f (51fbd179ae85719260f05675a60575ad1e15790f)

rust-timer · 2021-05-14T23:05:25Z

Queued 51fbd179ae85719260f05675a60575ad1e15790f with parent 1025db8, future comparison URL.

ssomers · 2021-05-15T00:32:11Z

It sure seems these inlines are as scientific as needles in a voodoo doll - they keep you awake at night and nobody can explain how they work. I sometimes add them in a PR to "disguise" a reported performance backlash - on functions that are extracted from literally inlined code, like in #84904.

I don't think I ever tried removing some of them. When I remove all of them from btree code (most are on public, stable API members), there's only backlash (on my Windows x64 system, Intel CPU) compared to master.

benchcmp a c --threshold 5
 name                                         a ns/iter  c ns/iter  diff ns/iter  diff %  speedup
 btree::map::clone_fat_val_100_and_clear      10,761     11,459              698   6.49%   x 0.94
 btree::map::clone_fat_val_100_and_into_iter  10,151     11,173            1,022  10.07%   x 0.91
 btree::map::clone_slim_100_and_into_iter     4,269      4,790               521  12.20%   x 0.89
 btree::map::clone_slim_10k_and_into_iter     183,788    242,990          59,202  32.21%   x 0.76
 btree::map::first_and_last_0                 10         12                    2  20.00%   x 0.83
 btree::map::first_and_last_10k               56         67                   11  19.64%   x 0.84
 btree::map::insert_rand_10_000               38         42                    4  10.53%   x 0.90
 btree::map::iteration_mut_1000               4,304      4,576               272   6.32%   x 0.94
 btree::map::iteration_mut_100000             534,390    603,730          69,340  12.98%   x 0.89
 btree::map::iteration_mut_20                 68         77                    9  13.24%   x 0.88
 btree::set::clone_100_and_into_iter          1,413      1,933               520  36.80%   x 0.73
 btree::set::clone_10k                        159,812    173,932          14,120   8.84%   x 0.92
 btree::set::clone_10k_and_into_iter          168,626    227,017          58,391  34.63%   x 0.74
 btree::set::difference_random_10k_vs_100     68,320     73,951            5,631   8.24%   x 0.92
 btree::set::is_subset_100_vs_100             533        630                  97  18.20%   x 0.85
 btree::set::is_subset_100_vs_10k             1,185      1,258                73   6.16%   x 0.94

But compared to this PR, clone_slim_100 is happy again.

rust-timer · 2021-05-15T00:46:37Z

Finished benchmarking try commit (51fbd179ae85719260f05675a60575ad1e15790f): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

Mark-Simulacrum · 2021-05-15T00:48:07Z

Looks loosely like this PR is a regression.

BTree: inline more functions supporting iteration

154e6d1

rust-highfive assigned Mark-Simulacrum May 12, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 12, 2021

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 14, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 15, 2021

Mark-Simulacrum closed this May 15, 2021

ssomers deleted the btree_inlining branch May 23, 2021 09:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BTree: inline more functions supporting iteration #85217

BTree: inline more functions supporting iteration #85217

ssomers commented May 12, 2021

ssomers commented May 12, 2021 •

edited

Loading

Mark-Simulacrum commented May 14, 2021

rust-timer commented May 14, 2021

bors commented May 14, 2021

bors commented May 14, 2021

rust-timer commented May 14, 2021

ssomers commented May 15, 2021

rust-timer commented May 15, 2021

Mark-Simulacrum commented May 15, 2021

BTree: inline more functions supporting iteration #85217

BTree: inline more functions supporting iteration #85217

Conversation

ssomers commented May 12, 2021

ssomers commented May 12, 2021 • edited Loading

Mark-Simulacrum commented May 14, 2021

rust-timer commented May 14, 2021

bors commented May 14, 2021

bors commented May 14, 2021

rust-timer commented May 14, 2021

ssomers commented May 15, 2021

rust-timer commented May 15, 2021

Mark-Simulacrum commented May 15, 2021

ssomers commented May 12, 2021 •

edited

Loading