internal iteration for `&mut I` #100173

sarah-quinones · 2022-08-05T16:21:01Z

this pr implements internal iteration for `&mut I` when `I: Sized`. it additionally inlines some wrapper functions that were not previously inline, which seems to speed things up by a fair amount in some cases.

this led to up to 3x performance gains across the board for the `iter::` benches, with only a minor regression for `iter::bench_filter_sum`
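For readers unfamiliar with the term, here is a minimal sketch of what forwarding internal iteration through a mutable reference can look like. It is not the code from this PR, which works inside `core` and relies on specialization. `ByRefFold` and `sum_remaining` are hypothetical names used only for illustration, and the sketch stays on stable Rust by overriding `fold` on a wrapper type instead of touching `&mut I` itself.

```rust
use std::convert::Infallible;

/// Hypothetical stand-in for `&mut I`: borrows an iterator and forwards to it.
struct ByRefFold<'a, I>(&'a mut I);

impl<'a, I: Iterator> Iterator for ByRefFold<'a, I> {
    type Item = I::Item;

    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        // External iteration: one item per call.
        self.0.next()
    }

    #[inline]
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.0.size_hint()
    }

    #[inline]
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, Self::Item) -> B,
    {
        // Internal iteration: let the inner iterator run its own loop.
        // `try_fold` only needs `&mut I`, so the borrowed iterator can be
        // driven without taking ownership. The error type is uninhabited,
        // so the fold can never stop early.
        let result: Result<B, Infallible> =
            I::try_fold(self.0, init, |acc, item| Ok(f(acc, item)));
        match result {
            Ok(acc) => acc,
            Err(never) => match never {},
        }
    }
}

/// Hypothetical helper: sums whatever is left in a borrowed iterator.
fn sum_remaining<I: Iterator<Item = u32>>(iter: &mut I) -> u32 {
    ByRefFold(iter).fold(0, |acc, x| acc + x)
}
```

With an adapter such as `Chain`, the forwarded `fold` can use the adapter's own `try_fold` instead of going through `next()` once per item, which is the kind of gain the description above refers to.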

rust-highfive · 2022-08-05T16:21:04Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @scottmcm (or someone else) soon.

Please see the contribution instructions for more information.

scottmcm · 2022-08-05T18:22:35Z

@bors try @rust-timer queue

rust-timer · 2022-08-05T18:22:36Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-08-05T18:22:42Z

⌛ Trying commit cb7f7ee with merge 3e685715a7ece536b2ab653e3433c06c00454bdf...

the8472

This was tried several times before, most recently in #82185.
Perhaps the changes to function.rs make a difference this time.

the8472 · 2022-08-05T18:53:18Z

library/core/src/iter/traits/double_ended.rs

+
+impl<'a, I: DoubleEndedIterator + Sized> ByRefRFold for &'a mut I {
+    #[inline]
+    default fn try_rfold<B, F, R>(&mut self, init: B, f: F) -> R


The more specific impl shouldn't be marked `default`.

bors · 2022-08-05T20:00:27Z

☀️ Try build successful - checks-actions
Build commit: 3e685715a7ece536b2ab653e3433c06c00454bdf

rust-timer · 2022-08-05T20:00:29Z

Queued 3e685715a7ece536b2ab653e3433c06c00454bdf with parent d77da9d, future comparison URL.

rust-timer · 2022-08-05T22:10:28Z

Finished benchmarking commit (3e685715a7ece536b2ab653e3433c06c00454bdf): comparison url.

Instruction count

Primary benchmarks: mixed results
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	0.8%	25.3%	66
Regressions 😿 (secondary)	0.6%	1.9%	32
Improvements 🎉 (primary)	-0.5%	-1.5%	14
Improvements 🎉 (secondary)	-0.6%	-1.4%	22
All 😿🎉 (primary)	0.6%	25.3%	80

Max RSS (memory usage)

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	4.0%	4.0%	1
Improvements 🎉 (primary)	-2.3%	-2.3%	1
Improvements 🎉 (secondary)	-2.5%	-2.5%	1
All 😿🎉 (primary)	-2.3%	-2.3%	1

Cycles

Results

Primary benchmarks: 😿 relevant regressions found
Secondary benchmarks: 😿 relevant regressions found

	mean¹	max	count²
Regressions 😿 (primary)	14.7%	37.5%	3
Regressions 😿 (secondary)	2.8%	3.0%	2
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	14.7%	37.5%	3

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

scottmcm · 2022-08-05T22:18:53Z

Wow, this totally changes the pattern of LTO in clap:

sarah-quinones · 2022-08-05T22:42:40Z

well, these don't look like the most promising results ^^'

compiler-errors · 2022-08-06T00:01:24Z

Presumably this makes compilation slower, but the perf tests don't show the effect on the performance of the compiled code, right?

sarah-quinones · 2022-08-06T09:43:31Z

are there tests that do?

the8472 · 2022-08-06T10:43:25Z

Other than the std benches (which aren't great), we don't have anything automated to assess runtime performance. In the rustc-perf suite, check and doc builds are the closest, since they don't codegen, but they're probably not diverse enough.

You could try paring down the PR by splitting out some of the changes. E.g. some of the inlining in function.rs doesn't look relevant to iterators. You can also run rustc-perf locally and focus on that one benchmark, which should yield results more quickly (assuming you have a machine that can compile a stage1 rustc in a reasonable amount of time).

sarah-quinones · 2022-08-06T18:20:05Z

rustc-perf seems to take forever on my machine and i can't display the results after it's finished. so that doesn't seem like a good option for me :/

the8472 · 2022-08-06T18:33:34Z

It can be set to run a subset of the benchmarks, e.g. the serde ones. https://github.com/rust-lang/rustc-perf/tree/master/collector#benchmarking-options
Running the site locally should work as long as it uses the same DB as generated by the collector.

sarah-quinones · 2022-08-06T21:52:42Z

thanks for the tips! i managed to get it working thanks to your help. it seems that the biggest culprit was inlining the ops::function wrappers.
but even without it i still get a 1-2% regression on deeply-nested-multi

scottmcm · 2022-08-11T19:13:15Z

I'm going to send this over to

r? @m-ou-se

because I think this is going to be as much a policy decision (about compile-vs-runtime) as it is about the code itself.

the8472 · 2022-08-11T19:19:23Z

Some changes were reverted, let's get new perf results.

@bors try @rust-timer queue

rust-timer · 2022-08-11T19:19:25Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-08-11T19:19:33Z

⌛ Trying commit f6a3462 with merge c20ee6d211784a78b94e26d37cce4e66acea976a...

bors · 2022-08-11T20:56:37Z

☀️ Try build successful - checks-actions
Build commit: c20ee6d211784a78b94e26d37cce4e66acea976a

rust-timer · 2022-08-11T20:56:38Z

Queued c20ee6d211784a78b94e26d37cce4e66acea976a with parent aeb5067, future comparison URL.

rust-timer · 2022-08-11T23:22:53Z

Finished benchmarking commit (c20ee6d211784a78b94e26d37cce4e66acea976a): comparison url.

Instruction count

Primary benchmarks: mixed results
Secondary benchmarks: ❌ relevant regressions found

	mean¹	max	count²
Regressions ❌ (primary)	0.2%	0.3%	14
Regressions ❌ (secondary)	0.7%	2.0%	16
Improvements ✅ (primary)	-0.4%	-0.7%	7
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.0%	-0.7%	21

Max RSS (memory usage)

Results

Primary benchmarks: no relevant changes found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.4%	4.5%	8
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.4%	-4.2%	3
All ❌✅ (primary)	-	-	0

Cycles

Results

Primary benchmarks: ✅ relevant improvement found
Secondary benchmarks: ✅ relevant improvement found

	mean¹	max	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.3%	-2.3%	1
Improvements ✅ (secondary)	-4.1%	-4.1%	1
All ❌✅ (primary)	-2.3%	-2.3%	1

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

m-ou-se · 2022-12-30T13:11:04Z

r? @the8472

the8472 · 2022-12-30T17:32:40Z

The compile-time perf numbers are slightly negative, but less so than the previous attempt to do this.

But we need some runtime benchmark numbers to verify that it brings the expected benefits. There are some core::iter benchmarks that I'd expect to show some speedup.

@rustbot author
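As a rough illustration of the kind of runtime comparison being asked for, the sketch below times external iteration (`next()` in a loop) against internal iteration (`fold`) through a `&mut I`. It is not one of the `core::iter` benchmarks; `sum_external`, `sum_internal`, `time_it`, `make_iter` and `compare` are hypothetical names, and single wall-clock measurements like this are noisy, so a real benchmark harness would be needed before drawing conclusions.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// External iteration: drive the iterator with repeated `next()` calls.
fn sum_external<I: Iterator<Item = u64>>(iter: &mut I) -> u64 {
    let mut total = 0u64;
    while let Some(x) = iter.next() {
        total = total.wrapping_add(x);
    }
    total
}

/// Internal iteration: call `fold` on the `&mut I` returned by `by_ref()`.
fn sum_internal<I: Iterator<Item = u64>>(iter: &mut I) -> u64 {
    iter.by_ref().fold(0u64, |acc, x| acc.wrapping_add(x))
}

/// Hypothetical helper: run `f` once and report its result and elapsed time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    // `black_box` keeps the optimizer from discarding the computed value.
    let out = black_box(f());
    (out, start.elapsed())
}

/// An adapter chain (assumed workload) where per-item `next()` has extra
/// branching compared to the adapters' own folding loops.
fn make_iter(n: u64) -> impl Iterator<Item = u64> {
    (0..n).chain(n..2 * n).filter(|x| x % 3 != 0)
}

/// Times both styles on identical input and checks that they agree.
fn compare(n: u64) -> (Duration, Duration) {
    let mut a = make_iter(n);
    let mut b = make_iter(n);
    let (external, t_external) = time_it(|| sum_external(&mut a));
    let (internal, t_internal) = time_it(|| sum_internal(&mut b));
    assert_eq!(external, internal);
    (t_external, t_internal)
}
```

Calling `compare` with a large `n` in a release build gives a first impression of whether the two styles differ on a given toolchain.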

Dylan-DPC · 2023-01-23T12:33:05Z

@sarah-ek any updates on this?

Dylan-DPC · 2023-05-16T12:37:02Z

Closing this as inactive. Feel free to reöpen this pr or create a new pr if you get the time to work on this. Thanks

feat: impl internal iteration for &mut I

cb7f7ee

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Aug 5, 2022

rust-highfive assigned scottmcm Aug 5, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 5, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 5, 2022

the8472 reviewed Aug 5, 2022

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Aug 5, 2022

undo changes to function.rs

f6a3462

rust-highfive assigned m-ou-se and unassigned scottmcm Aug 11, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 11, 2022

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 11, 2022

JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 8, 2022

rustbot assigned the8472 and unassigned m-ou-se Dec 30, 2022

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 30, 2022

the8472 mentioned this pull request May 4, 2023

Optimize Iterator implementation for &mut impl Iterator + Sized #111200

Merged

Dylan-DPC closed this May 16, 2023

Dylan-DPC added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 16, 2023
