Improve codegen of String::retain method #96605
Using unwrap_unchecked helps the optimizer avoid generating a panicking path that will never be taken for a valid UTF-8 string. Using encode_utf8 saves us a call to memcpy: the optimizer is unable to realize that ch_len <= 4, so the existing copy is emitted as a memcpy call, whereas with encode_utf8 it can generate much better assembly code. https://rust.godbolt.org/z/z73ohenfc
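For readers without the diff at hand, here is a minimal user-space sketch of the loop shape described above. It is an illustration under stated assumptions, not the standard library's code: the free function `retain_sketch` is a hypothetical name, and it works on a byte vector taken out of the `String` rather than on the string's internals.

```rust
/// Hypothetical free-function sketch of a `retain`-style loop (not the std implementation).
fn retain_sketch<F: FnMut(char) -> bool>(s: &mut String, mut f: F) {
    // Take the bytes out so that a panic inside `f` leaves `s` empty
    // instead of exposing partially shifted, possibly invalid UTF-8.
    let mut bytes = std::mem::take(s).into_bytes();
    let len = bytes.len();
    let mut idx = 0; // read position, always on a char boundary
    let mut del_bytes = 0; // bytes dropped so far

    while idx < len {
        // SAFETY: `bytes[idx..len]` is an untouched, non-empty tail of the
        // original string starting on a char boundary, so it is valid UTF-8
        // and `chars().next()` cannot be `None`. `unwrap_unchecked` tells the
        // optimizer this, so no panicking path is needed.
        let ch = unsafe {
            std::str::from_utf8_unchecked(bytes.get_unchecked(idx..len))
                .chars()
                .next()
                .unwrap_unchecked()
        };
        let ch_len = ch.len_utf8();

        if !f(ch) {
            del_bytes += ch_len;
        } else if del_bytes > 0 {
            // Re-encode the char at its shifted position. `encode_utf8`
            // writes at most 4 bytes, so no general-purpose copy is required.
            let dst = idx - del_bytes;
            ch.encode_utf8(&mut bytes[dst..dst + ch_len]);
        }
        idx += ch_len;
    }

    bytes.truncate(len - del_bytes);
    // SAFETY: the kept prefix consists of whole chars re-encoded in order.
    *s = unsafe { String::from_utf8_unchecked(bytes) };
}

fn main() {
    let mut s = String::from("f_o_ob_ar");
    retain_sketch(&mut s, |c| c != '_');
    println!("{s}");
}
```

The sketch uses bounds-checked slicing for the destination to keep the unsafe surface small; whether the optimizer removes those checks is not something this example demonstrates.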
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override)
Wouldn't it be better to optimize for runs of retained/dropped chars so larger chunks could be memcopied? At least I assume real-world workloads rarely discard every second character.
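The run-based alternative suggested here could look roughly like the sketch below. This is only an illustration of the idea, not code from the PR: `retain_runs_sketch` is a hypothetical name, and no claim is made about how it performs relative to the per-character version.

```rust
/// Hypothetical sketch: move whole runs of retained chars with one copy each.
fn retain_runs_sketch<F: FnMut(char) -> bool>(s: &mut String, mut f: F) {
    // Take the bytes out so a panic in `f` leaves `s` empty, not invalid.
    let mut bytes = std::mem::take(s).into_bytes();
    let len = bytes.len();
    let mut idx = 0; // read position
    let mut write = 0; // end of the compacted output
    let mut run_start = 0; // start of the current run of retained chars

    while idx < len {
        // SAFETY: every copy below writes strictly before `idx`, so
        // `bytes[idx..len]` is still an untouched, non-empty, valid UTF-8 tail.
        let ch = unsafe {
            std::str::from_utf8_unchecked(&bytes[idx..len])
                .chars()
                .next()
                .unwrap_unchecked()
        };
        let ch_len = ch.len_utf8();

        if !f(ch) {
            // A dropped char ends the current run: flush it in one move.
            bytes.copy_within(run_start..idx, write);
            write += idx - run_start;
            run_start = idx + ch_len;
        }
        idx += ch_len;
    }

    // Flush the final run (possibly empty).
    bytes.copy_within(run_start..len, write);
    write += len - run_start;
    bytes.truncate(write);
    // SAFETY: the output is a concatenation of whole chars from the input.
    *s = unsafe { String::from_utf8_unchecked(bytes) };
}

fn main() {
    let mut s = String::from("keep these words, drop 12345 digits");
    retain_runs_sketch(&mut s, |c| !c.is_ascii_digit());
    println!("{s}");
}
```

With long retained runs this issues one `copy_within` per run instead of one write per character; when every second character is dropped it degenerates to one small copy per character.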
Maybe, but the cost of doing a |
I'll take over review for this since it's been a while since the last update. r? @thomcc Have you taken any benchmarks?
Thanks. If you feel up to it you can also take #94647 which hasn't had any activity at all. 😁
Yes, I run them on
Previously it scaled with character counts and now it's flat, is that expected or do the benchmarks get optimized away entirely?
It's a bit surprising to me that this would make much of a perf difference, but it would also surprise me if it hurt. There are benchmark numbers, and the logic that this avoids a memcpy call for something that is always <= 4 bytes makes sense.
This also has the benefit of improving the safety documentation of these blocks, and actually reducing the amount of unsafe needed as well (sort of), so it seems like a positive change all around.
@bors r+
📌 Commit a98abe8 has been approved by |
This is a fair point. Hm. I actually think this is a reasonable change regardless, but it's worth seeing more accurate benchmark numbers. @bors r-
Sorry.
Here is the same benchmark except that now the
You're fine. The issue should have more traction now anyway.
This looks reasonable to me, and since I was already inclined to accept the PR, @bors r+
📌 Commit a98abe8 has been approved by |
☀️ Test successful - checks-actions
Finished benchmarking commit (4a86c79): comparison url.
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): [results table not preserved]
Cycles: [results table not preserved]
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
@rustbot label: -perf-regression