Implement unwrap_unchecked using transmutes when niche-optimizations are in play #102151
r? @thomcc (rust-highfive has picked a reviewer for you, use r? to override) |
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit b852d54f74955dd93d08f60cb1d5f31fc17d1104 with merge a4bad50fb7be69e1d11952a70824ba9949d48ad7... |
013a7f9 to b3ca318
@bors try |
⌛ Trying commit b3ca318d62aa09f0e6ce1e234b4932a98d53d724 with merge 3c5bba5b33e8613c13885b42641ada47186c9186... |
☀️ Try build successful - checks-actions |
Queued 3c5bba5b33e8613c13885b42641ada47186c9186 with parent e7119a0, future comparison URL. |
Finished benchmarking commit (3c5bba5b33e8613c13885b42641ada47186c9186): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment. |
Hm, very surprising that this would have any overhead. I kind of suspect it's an artifact of the compilation process, but who knows. |
Maybe it's just that the MIR for it is already pretty good. Essentially it's just
That memcpy is roughly the same to LLVM as the transmute-copy, for a niched layout, and noticing that it doesn't need the condition at all and dropping unnecessary code is already pretty cheap for LLVM. And Rust ends up emitting all the discriminant calculation LLVM-IR anyway, since folding the Maybe that could be improved by making it |
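For readers unfamiliar with the term, a "niched layout" can be observed from ordinary code by comparing sizes. The following is a minimal sketch, not taken from the PR; `option_is_niched` and `demo` are hypothetical names introduced only for illustration. It relies on the documented guarantee that `Option<NonZeroU32>`, `Option<Box<T>>` and `Option<&T>` have the same size as their payload types.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical helper: reports whether `Option<T>` is no larger than `T`,
/// which means `None` is stored in an otherwise invalid bit pattern of `T`
/// (a niche) rather than in a separate discriminant field.
pub fn option_is_niched<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

/// Hypothetical usage: returns the result for three payload types.
pub fn demo() -> (bool, bool, bool) {
    (
        // Zero is not a valid `NonZeroU32`, so `None` can use that pattern.
        option_is_niched::<NonZeroU32>(),
        // A `Box` is never null, so `None` can use the null pointer.
        option_is_niched::<Box<i32>>(),
        // Every bit pattern of `u32` is valid, so a discriminant is needed.
        option_is_niched::<u32>(),
    )
}
```

In the niched cases there is no separate tag to strip, which is why a plain copy of the bytes and a match on the discriminant can end up looking alike to the optimizer.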
library/core/src/option.rs
Outdated
// SAFETY: Size equality implies niches are involved. And with niches
// transmutes are ok because they don't change bits, only make use of invalid values
unsafe {
let val = mem::transmute_copy(&self);
YMMV: if you want to save the separate `forget` call, I think you can write this as
- let val = mem::transmute_copy(&self);
+ return mem::transmute_copy(&ManuallyDrop::new(self));
(Since `forget` is just putting it in a `ManuallyDrop` and ignoring it these days anyway.)
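The approach under review can be sketched outside of `core` roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `unwrap_unchecked_sketch` and `demo` are hypothetical names, and the fast path assumes that equal sizes mean `Some(v)` has the same bits as `v`. That assumption is documented only for a few types (such as `NonZeroU32`, `Box<T>` and `&T`), so user code should not rely on it for arbitrary `T`.

```rust
use std::mem::{self, ManuallyDrop};
use std::num::NonZeroU32;

/// Hypothetical stand-alone version of the idea discussed in this thread.
///
/// # Safety
/// `opt` must be `Some`.
pub unsafe fn unwrap_unchecked_sketch<T>(opt: Option<T>) -> T {
    if mem::size_of::<Option<T>>() == mem::size_of::<T>() {
        // Assumption: equal sizes mean `Some(v)` and `v` share one bit pattern.
        // Wrapping in `ManuallyDrop` keeps the original from being dropped
        // after its bits are copied out, so no separate `forget` is needed.
        let opt = ManuallyDrop::new(opt);
        // SAFETY: the caller promises `opt` is `Some`; the sizes are equal.
        unsafe { mem::transmute_copy::<ManuallyDrop<Option<T>>, T>(&opt) }
    } else {
        match opt {
            Some(value) => value,
            // SAFETY: the caller promises `opt` is never `None`.
            None => unsafe { std::hint::unreachable_unchecked() },
        }
    }
}

/// Hypothetical usage covering both the niched and the non-niched path.
pub fn demo() -> (u32, i32, u32) {
    // SAFETY: every argument below is `Some`.
    unsafe {
        (
            unwrap_unchecked_sketch(NonZeroU32::new(7)).get(),
            *unwrap_unchecked_sketch(Some(Box::new(-3i32))),
            unwrap_unchecked_sketch(Some(5u32)),
        )
    }
}
```

The size comparison is between compile-time constants, so one of the two branches should be removed during optimization for each concrete `T`.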
b3ca318 to 7157cfe
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 7157cfe with merge 24860e60db7c91164ed67469532d69c3ab700541... |
But the LLVM-IR contains an |
☀️ Try build successful - checks-actions |
Queued 24860e60db7c91164ed67469532d69c3ab700541 with parent 9a963e3, future comparison URL. |
Finished benchmarking commit (24860e60db7c91164ed67469532d69c3ab700541): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment. |
Compile-time losses are consistent with the previous run, but so are binary-size changes (especially in opt-full builds), and it is spending more time in LLVM. So it's having an effect on the optimizer, just not the one expected. I'll take a look at the generated assembly, maybe it's diffable. |
No description provided.