-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add fast path for match checking #76918
Conversation
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 7c98f6f with merge 52836408ae482f87159a2473c3e5475348b1f255... |
☀️ Try build successful - checks-actions, checks-azure |
Queued 52836408ae482f87159a2473c3e5475348b1f255 with parent fd702d2, future comparison URL. |
Finished benchmarking try commit (52836408ae482f87159a2473c3e5475348b1f255): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Seems that all perf results are nearly identical. I guess that match checking wasn't dominating the compile time then. It seems that this PR only helps in pathological cases like that generated by the script below. Some crate in the ecosystem might be doing the same kind of code generation, though, and this can be become useful in such cases. Python script: print("""#![allow(warnings)]
enum A {""")
for i in range(8192):
print(f" Var{i},")
print("""}
fn main() {
match A::Var0 {""")
for i in range(8192):
print(f" A::Var{i} => {{}}")
print(""" }
}""") Old rustc takes around 2.7s, this build takes around 1.7s. |
Btw. Addressing your concerns about FxHashMap overhead - #72412 added a MiniMap for similar reasons (it's SSO map backed by ArrayVec). You can try using that here, seems like the map will always hold small number of elements, too, if I understand it correctly. |
It seems that |
I added some more comments explaining the code per request. |
Thanks, that helps a lot! One thing that we could do is to add your pathological case to the perf test suite, then, once that's in, this PR will show an improvement on the perf bot and we thus guard against future regressions. Another thing is to |
I added the debug assertions, it has a clone that is not really necessary but I thought the code would be easier to maintain this way. As for perf, I have mixed feelings since the case I mentioned above is only one kind of match that is slow; I think we could add it once we have support for integer cases like #7462 (comment). |
bbafd0d
to
01a771a
Compare
You can add perf tests for both the integer case and your case within one test, then we are at least tracking them, and any changes to it will show up. If the tests just take a few seconds, that's perfect for perf, as it won't really slow down the perf test suite |
Implementation wise this PR lgtm now, so we could merge it and add the perf test later if you prefer. We'd lose the visible improvement in the perf tests, but I'm not sure how much value that has. |
Perf PR is up at rust-lang/rustc-perf#769. |
The |
@bors r+ rollup=never |
📌 Commit 01a771a has been approved by |
☀️ Test successful - checks-actions, checks-azure |
Also removes the ugly caching that was introduced in rust-lang#76918. It was bolted on without deeper knowledge of the workings of the algorithm. This commit manages to be more performant without any of the complexity. It should be better on representative workloads too.
…ly2, r=varkor Clarify main code paths in exhaustiveness checking This PR massively clarifies the main code paths of exhaustiveness checking, by using the `Constructor` enum to a fuller extent. I've been itching to write it for more than a year, but the complexity of matching consts had prevented me. Behold a massive simplification :D. This in particular removes a fair amount of duplication between various parts, localizes code into methods of relevant types when applicable, makes some implicit assumptions explicit, and overall improves legibility a lot (or so I hope). Additionally, after my changes undoing rust-lang#76918 turned out to be a noticeable perf gain. As usual I tried my best to make the commits self-contained and easy to follow. I've also tried to keep the code well-commented, but I tend to forget how complex this file is; I'm happy to clarify things as needed. My measurements show good perf improvements on the two match-heavy benchmarks (-18.0% on `unicode_normalization-check`! :D); I'd like a perf run to check the overall impact. r? `@varkor` `@rustbot` modify labels: +A-exhaustiveness-checking
exhaustiveness: Rework constructor splitting `SplitWildcard` was pretty opaque. I replaced it with a more legible abstraction: `ConstructorSet` represents the set of constructors for patterns of a given type. This clarifies responsibilities: `ConstructorSet` handles one clear task, and diagnostic-related shenanigans can be done separately. I'm quite excited, I had has this in mind for years but could never quite introduce it. This opens up possibilities, including type-specific optimisations (like using a `FxHashSet` to collect enum variants, which had been [hackily attempted some years ago](rust-lang#76918)), my one-pass rewrite (rust-lang#116042), and future librarification.
exhaustiveness: Rework constructor splitting `SplitWildcard` was pretty opaque. I replaced it with a more legible abstraction: `ConstructorSet` represents the set of constructors for patterns of a given type. This clarifies responsibilities: `ConstructorSet` handles one clear task, and diagnostic-related shenanigans can be done separately. I'm quite excited, I had has this in mind for years but could never quite introduce it. This opens up possibilities, including type-specific optimisations (like using a `FxHashSet` to collect enum variants, which had been [hackily attempted some years ago](rust-lang#76918)), my one-pass rewrite (rust-lang#116042), and future librarification.
exhaustiveness: Rework constructor splitting `SplitWildcard` was pretty opaque. I replaced it with a more legible abstraction: `ConstructorSet` represents the set of constructors for patterns of a given type. This clarifies responsibilities: `ConstructorSet` handles one clear task, and diagnostic-related shenanigans can be done separately. I'm quite excited, I had has this in mind for years but could never quite introduce it. This opens up possibilities, including type-specific optimisations (like using a `FxHashSet` to collect enum variants, which had been [hackily attempted some years ago](rust-lang/rust#76918)), my one-pass rewrite (rust-lang/rust#116042), and future librarification.
This adds a fast path that would reduce the complexity to linear on matches consisting of only variant patterns (i.e. enum matches). (Also see: #7462) Unfortunately, I was too lazy to add a similar fast path for constants (mostly for integer matches), ideally that could be added another day.
TBH, I'm not confident with the performance claims due to the fact that enums tends to be small and FxHashMap could add a lot of overhead.
r? @Mark-Simulacrum
needs perf