[Experiment] Rewrite exhaustiveness in one pass #116042
@bors try @rust-timer queue
[Experiment] Rewrite exhaustiveness in one pass
Arm reachability checking does a quadratic amount of work: for each arm we check if it is reachable given the arms above it. This feels wasteful since we often end up re-exploring the same cases when we check for exhaustiveness.
This PR is an attempt to check reachability at the same time as exhaustiveness. This opens the door to a bunch of code simplifications I'm very excited about. The main question is whether I can get actual performance gains out of this.
I had started the experiment in rust-lang#111720 but I can't reopen it.
r? `@ghost`
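To make the description above concrete, here is a minimal sketch of the two strategies on a toy model. It assumes a scrutinee with a flat set of constructors numbered `0..n_ctors` and patterns that are either a wildcard or a single constructor. `Pat`, `Report`, `check_per_arm` and `check_one_pass` are hypothetical names made up for this illustration; none of this is the compiler's actual representation or algorithm, which deals with nested patterns.

```rust
/// Toy pattern over a scrutinee with constructors numbered `0..n_ctors`.
/// Hypothetical type for illustration only, not rustc's representation.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Pat {
    Wild,
    Ctor(usize),
}

impl Pat {
    fn matches(self, ctor: usize) -> bool {
        match self {
            Pat::Wild => true,
            Pat::Ctor(c) => c == ctor,
        }
    }
}

/// What both strategies compute: which arms can be reached,
/// and which constructors no arm covers.
#[derive(Debug, PartialEq)]
struct Report {
    reachable: Vec<bool>,
    missing: Vec<usize>,
}

/// Per-arm strategy: every arm is checked against all the arms above it,
/// then exhaustiveness walks over the same cases once more.
fn check_per_arm(arms: &[Pat], n_ctors: usize) -> Report {
    let reachable: Vec<bool> = (0..arms.len())
        .map(|i| {
            // Arm `i` is reachable if some constructor matches it
            // and matches none of the arms before it.
            (0..n_ctors)
                .any(|c| arms[i].matches(c) && !arms[..i].iter().any(|p| p.matches(c)))
        })
        .collect();
    let missing: Vec<usize> = (0..n_ctors)
        .filter(|&c| !arms.iter().any(|p| p.matches(c)))
        .collect();
    Report { reachable, missing }
}

/// One-pass strategy: visit each case once. The first arm that matches
/// is reachable; a case that no arm matches is a missing pattern.
fn check_one_pass(arms: &[Pat], n_ctors: usize) -> Report {
    let mut reachable = vec![false; arms.len()];
    let mut missing = Vec::new();
    for ctor in 0..n_ctors {
        match arms.iter().position(|p| p.matches(ctor)) {
            Some(i) => reachable[i] = true,
            None => missing.push(ctor),
        }
    }
    Report { reachable, missing }
}
```

With arms `[Pat::Ctor(0), Pat::Wild, Pat::Ctor(1)]` over three constructors, both functions report the third arm as unreachable and nothing missing. The one-pass version gets there without re-scanning the earlier arms once per arm, which is the shape of the saving the description is after.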
☀️ Try build successful - checks-actions
Finished benchmarking commit (aa12554): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 632.84s -> 633.392s (0.09%)
Is this good enough? Should I clean it up and submit?
On perf only (I haven't looked at the code yet), the improvements and regressions roughly balance out, which is good. I'm a bit worried about the 6% regression on the stress test, though. Do you have an explanation for that one? Or should it be weighed against the algorithmic improvements?
The last run of the previous PR had +/- 5% cycle wins on the stress test. Maybe we should start where the last one ended? :)
Another prior run also looked great, with no icounts regressions and even better cycles: #111720 (comment)
Unfortunately that commit was on a WIP that didn't correctly handle the omitted_patterns lint, if I recall correctly.
I wish I did; locally I measure a 10% improvement :'( This benchmark baffles me.
Good point, I'll look into it.
Force-pushed from b5728f1 to 1efc54c.
This single commit causes a regression on match-stress-enum locally, which makes no sense at all. Let's check that; at this point I think this benchmark is haunted on my machine.
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
Finished benchmarking commit (dde0381): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 631.219s -> 629.891s (-0.21%)
Alright well, at least it's consistent with my local measurements, and I've explained the regression I had on the 40-something commits earlier. It's this: This regression however doesn't make any sense to me. This commit reduces the size of |
This is the cachegrind diff of the latest perf. run
600M more instructions in |
I confirm it's purely a layout issue, and specifically I suspect a padding issue. It turns out that for my librarification plans I have other layout changes planned that will make the size work, so I'll do those first.
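For readers unfamiliar with how padding can change a type's size, here is a small sketch. The type involved in this PR isn't named in the thread as captured here, so `Interleaved` and `Grouped` below are hypothetical structs, and `#[repr(C)]` is used only to pin the field order so the effect is visible (the default Rust layout is free to reorder fields). The sizes in the comments assume a target where `u32` has 4-byte alignment.

```rust
use std::mem::size_of;

/// Hypothetical struct: a `u32` sandwiched between two `u8` fields.
/// With the field order fixed, 3 bytes of padding follow `tag` and
/// 3 more follow `flag`.
#[repr(C)]
struct Interleaved {
    tag: u8,
    payload: u32,
    flag: u8,
}

/// Same fields, with the small ones grouped after the `u32`:
/// only 2 bytes of trailing padding are needed.
#[repr(C)]
struct Grouped {
    payload: u32,
    tag: u8,
    flag: u8,
}

/// Returns `(size_of::<Interleaved>(), size_of::<Grouped>())`.
fn sizes() -> (usize, usize) {
    (size_of::<Interleaved>(), size_of::<Grouped>())
}
```

Under the stated assumption, `sizes()` gives `(12, 8)`: the same three fields take 50% more space in the first ordering. A change that looks like it shrinks a type can therefore leave its size unchanged, or shift where the padding falls, depending on what sits around the modified field.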
Finished benchmarking commit (214eab1): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This benchmark run did not return any relevant results for this metric.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 630.107s -> 630.022s (-0.01%)
Force-pushed from 1d158a8 to 5883018.
Coming back to this after much distraction from never-patterns and friends.
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
Finished benchmarking commit (6ce98c7): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 635.352s -> 635.43s (0.01%)
Pretty great on cycles :3
Force-pushed from fc12224 to 9bfb7d9.
Force-pushed from 9bfb7d9 to d4f08e7.
Attempting a for-real PR for this here: #117611
Closing this now that #117611 has been created.