Change several HashMaps to IndexMap to improve incremental hashing performance #90253
r? @davidtwco (rust-highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 8587167dbd6bd39a9e6d8a5ca42d1192b5244c0e with merge 4438132761da5ec0c1b2b3e74658dd8454849393... |
☀️ Try build successful - checks-actions |
Queued 4438132761da5ec0c1b2b3e74658dd8454849393 with parent 56694b0, future comparison URL. |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 8587167dbd6bd39a9e6d8a5ca42d1192b5244c0e with merge 732539524eee94bddaed2893c2b83bc6bfa49d57... |
☀️ Try build successful - checks-actions |
Queued 732539524eee94bddaed2893c2b83bc6bfa49d57 with parent 84c2a85, future comparison URL. |
Finished benchmarking commit (732539524eee94bddaed2893c2b83bc6bfa49d57): comparison url. Summary: This change led to very large relevant mixed results 🤷 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never |
LGTM, don't think we need to be too concerned about deterministic insertion order for these maps, so unless you want to add a comment, r=me.
@Mark-Simulacrum told me that we should tread lightly here. @Mark-Simulacrum do you want to add something? :) |
I was under the impression that StableHash needed to be deterministically produced in different builds with the "same" inputs to avoid bugs (or ICEs). This PR changes a bunch of maps from Fx to the indexset variants, which removes the sort by stable hash key when stable-hashing the maps. I think that means we need to ensure stability some other way, through a stable insertion order, right? Cc @rust-lang/wg-incr-comp
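To make the trade-off concrete, here is a minimal sketch using only the standard library. It is not the compiler's `StableHasher` or its `HashStable` implementations; `hash_sorted` and `hash_in_order` are hypothetical helpers, and `DefaultHasher` merely stands in for a stable hasher. The first helper sorts entries before hashing, so insertion order cannot affect the result. The second hashes entries in the order given, which is roughly what an insertion-ordered map would do.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Order-independent: copy the entries out, sort them by key, then hash.
/// (Hypothetical helper; pays for a sort on every hash.)
fn hash_sorted(map: &HashMap<&str, u32>) -> u64 {
    let mut entries: Vec<(&str, u32)> = map.iter().map(|(k, v)| (*k, *v)).collect();
    entries.sort();
    let mut hasher = DefaultHasher::new();
    entries.hash(&mut hasher);
    hasher.finish()
}

/// Order-dependent: hash the entries exactly in the order they are stored.
/// (Hypothetical helper; no sort, but the order becomes part of the hash.)
fn hash_in_order(entries: &[(&str, u32)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    entries.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // The same entries, inserted in two different orders.
    let a = [("foo", 1), ("bar", 2), ("baz", 3)];
    let b = [("baz", 3), ("foo", 1), ("bar", 2)];

    let map_a: HashMap<&str, u32> = a.iter().copied().collect();
    let map_b: HashMap<&str, u32> = b.iter().copied().collect();

    // Sorting first hides the insertion order.
    assert_eq!(hash_sorted(&map_a), hash_sorted(&map_b));

    // Hashing in stored order exposes it.
    assert_ne!(hash_in_order(&a), hash_in_order(&b));

    println!("sorted: {:x}", hash_sorted(&map_a));
    println!("in order: {:x} vs {:x}", hash_in_order(&a), hash_in_order(&b));
}
```

Skipping the sort is where the speedup would come from, and it is also why the insertion order has to be deterministic across builds.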
I was asking myself something similar earlier, and I have similar reservations. This PR will start tracking the order in which objects are processed. This order is saved in the dep-graph, but does not enter the computation of the query fingerprint for non-anonymous queries. For instance, in certain cases, like iterations on [...] As a consequence, we may introduce spurious query invalidations (best case) or hash verification ICEs (worst case). (I take the example of LocalDefId ordering because it is known to have caused a few bugs, and because it is easily available information that is unstable under incr. comp.) I will think a bit more about it to try to find a solution...
r? @cjgillot |
@cjgillot There is one more thing that I'm not sure how it works. I benchmarked the stable hashing of hash maps, and about half of the cost was invoking [...] Now, by transforming [...] I wonder, if the key implements [...]
@Kobzol: the issue is about information flow into the stable hash. When iterating over a collection, be it a Vec, a HashMap or an IndexMap, the order of items influences the value of the resulting hash: [...] Meanwhile, there is some information we do not want to track. This is the case of the value of [...] My concern is about controlling this information flow. In order to do that [...] Using [...] There are two ways to resolve this concern: [...]
In the long run, I think the second solution is the right way forward, but it is trickier to reach. For instance, we do not really have a complete list of all untracked information. In conclusion, I think we should eventually merge this PR, but only after some progress on point 2 above; otherwise we may get incremental hash validation ICEs. However, I may be overly conservative here, since ICEs are still very unlikely. @Aaron1011 do you agree with this analysis?
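The information-flow concern can be illustrated with a small sketch, again using only the standard library and not the compiler's actual types. `SessionId` is a hypothetical stand-in for a session-local id such as `LocalDefId`, and both helper functions are assumptions made for illustration. The ids are never fed to the hasher, yet they still influence the result when they decide the iteration order.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an id that may differ between compilation sessions.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SessionId(u32);

/// Hash only the names, but visit them in id order.
/// The untracked ids leak into the hash through the ordering.
fn hash_names_in_id_order(items: &[(SessionId, &str)]) -> u64 {
    let mut sorted = items.to_vec();
    sorted.sort_by_key(|(id, _)| *id);
    let mut hasher = DefaultHasher::new();
    for (_, name) in &sorted {
        name.hash(&mut hasher);
    }
    hasher.finish()
}

/// Hash only the names, visited in an order derived from the names themselves.
/// Nothing session-local influences the result.
fn hash_names_in_name_order(items: &[(SessionId, &str)]) -> u64 {
    let mut names: Vec<&str> = items.iter().map(|(_, name)| *name).collect();
    names.sort();
    let mut hasher = DefaultHasher::new();
    for name in &names {
        name.hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    // The same two items, but the ids were assigned differently in each session.
    let session_1 = [(SessionId(0), "alpha"), (SessionId(1), "beta")];
    let session_2 = [(SessionId(1), "alpha"), (SessionId(0), "beta")];

    // Id-driven order: the hash changes although the hashed names did not.
    assert_ne!(
        hash_names_in_id_order(&session_1),
        hash_names_in_id_order(&session_2)
    );

    // Name-driven order: the hash is the same in both sessions.
    assert_eq!(
        hash_names_in_name_order(&session_1),
        hash_names_in_name_order(&session_2)
    );
}
```

In this toy model, a map filled while iterating in id order would carry that order into its hash, which is the kind of leak that could surface as a spurious invalidation or a hash verification failure.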
⌛ Testing commit e475a49 with merge c3d8803baf113e149eea23c0774e7744548fc1b9... |
💔 Test failed - checks-actions |
Hmm, not really sure what has happened. |
@bors retry |
⌛ Testing commit e475a49 with merge 1e8c1ed7b34eb2efa04bab53619c6298710b0366... |
💥 Test timed out |
@bors retry |
☀️ Test successful - checks-actions |
Finished benchmarking commit (c9b45e6): comparison url. Summary: This benchmark run shows 29 relevant improvements 🎉 but 31 relevant regressions 😿 to instruction counts.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression |
Given that the performance seen here largely mirrors what was seen when running perf previously (large improvements to clap-rs and otherwise an overall performance wash), I'll mark this as triaged. Let me know if you disagree with this assessment. @rustbot triage label: perf-regression-triaged |
Stable hashing of hash maps in incremental mode takes a lot of time, especially for some benchmarks like clap. As noted by @Mark-Simulacrum here, this cost could be reduced by replacing some hash maps with indexmaps. I gathered some statistics, found several hash maps that took a lot of time to hash, and replaced them with indexmaps. However, for this to work, we need to make sure that these indexmaps have a deterministic insertion order. As far as I can see, these three are used only in visitors, which seems deterministic. Can we enforce this somehow? Or should an explanatory comment be added for these maps?