-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up ptr::replace
#98197
Speed up ptr::replace
#98197
Conversation
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit bc0788b with merge fca4bf64b483deb3f9c962caf2ddf63c79124b81... |
☀️ Try build successful - checks-actions |
Queued fca4bf64b483deb3f9c962caf2ddf63c79124b81 with parent 0887113, future comparison URL. |
Finished benchmarking commit (fca4bf64b483deb3f9c962caf2ddf63c79124b81): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Footnotes |
Looks good to me, but I'm a bit surprised with the benchmarking results. |
library/core/src/ptr/mod.rs
Outdated
// and cannot overlap `src` since `dst` must point to a distinct | ||
// allocated object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// and cannot overlap `src` since `dst` must point to a distinct | |
// allocated object. | |
// and cannot overlap `src` since `src` is a freshly stack-allocated local variable. |
Not that we even still need this part of the argument with mem::replace
.
It probably doesn't improve performance in rustc a lot (at all), because it barely seems to be used in the Rust repository. All uses of rust/library/std/src/sync/mpsc/oneshot.rs Line 182 in 7036449
rust/library/std/src/sync/mpsc/oneshot.rs Line 216 in 7036449
rust/library/std/src/sync/mpsc/oneshot.rs Line 295 in 7036449
rust/library/std/src/sync/mpsc/shared.rs Line 260 in 7036449
rust/library/std/src/sync/mpsc/stream.rs Line 164 in 7036449
Additionally I found one instance of rust/library/std/src/thread/local.rs Lines 812 to 831 in 7036449
|
@@ -811,7 +811,7 @@ mod lazy { | |||
|
|||
// SAFETY: | |||
// | |||
// note that this can in theory just be `*ptr = Some(value)`, but due to | |||
// Note that this can in theory just be `*ptr = Some(value)`, but due to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to what?
ping from triage: FYI: when a PR is ready for review, send a message containing @rustbot author |
Since this doesn't appear to increase performance at all, I'll just close this. |
There was a PR to improve it, but it was not merged for some reason. rust-lang/rust#98197
I noticed, that
ptr::replace
andmem::replace
are implemented differently, even though they should probably be the same.Looking at the goldbolt output, it seems that
mem::replace
does only 2memcpy
s andptr::replace
does 2memcpy
s with an inlined copy in between.I ran some benchmarks on my machine, in which
mem::replace
was more than 2x faster thanptr::replace
.Benchmark
Code
Output
In this PR i have replaced the implementation of
ptr::replace
to callmem::replace
instead ofmem::swap
.