refactor WorkerLocal for parallel compiler #109478
r? @cjgillot (rustbot has picked a reviewer for you, use r? to override)
@bors try @rust-timer queue
⌛ Trying commit 477e410 with merge fe80553693db52675c166e07a4e745050855ba41...
// Safety: `inner` would never be accessed when multiple threads
WorkerLocal {
    single_thread: false,
    inner: unsafe { MaybeUninit::uninit().assume_init() },
This seems dangerous...
I am not entirely sure whether the semantics of this have been settled yet, but reading https://github.com/rust-lang/unsafe-code-guidelines/blob/master/active_discussion/validity.md, it seems this would be UB under those rules, since the assignment does a typed copy at type T, which does not allow uninit.
Either way, this is certainly a discouraged pattern, and it seems like the compiler should set an example here...
It seems much better to use a union between inner and mt_inner here, since the code is guaranteed to only access the right field. (Or even better, an enum, since single_thread then functions as a discriminant... which basically makes it a homegrown enum anyway.)
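A minimal sketch of the enum suggestion above, not the PR's actual code. The type name `WorkerLocalSketch`, the variant names, and the explicit `worker` index parameter are all assumptions made for illustration; the real type's worker lookup is not shown in this thread.

```rust
/// Hypothetical stand-in for the PR's `WorkerLocal`; every name here is illustrative.
/// The enum discriminant takes over the role of the `single_thread` flag.
pub enum WorkerLocalSketch<T> {
    /// Single-threaded mode: exactly one value.
    Single(T),
    /// Multi-threaded mode: one slot per worker thread.
    Multi(Vec<T>),
}

impl<T> WorkerLocalSketch<T> {
    /// Builds the single-threaded variant.
    pub fn new_single(value: T) -> Self {
        Self::Single(value)
    }

    /// Builds one slot per worker by calling `init` with each worker index.
    pub fn new_multi(workers: usize, init: impl FnMut(usize) -> T) -> Self {
        Self::Multi((0..workers).map(init).collect())
    }

    /// Returns the value for `worker`; the index is ignored in single-threaded mode.
    /// Panics if `worker` is out of range in multi-threaded mode.
    pub fn get(&self, worker: usize) -> &T {
        match self {
            Self::Single(value) => value,
            Self::Multi(slots) => &slots[worker],
        }
    }
}
```

With this shape, no field ever has to hold an uninitialized `T`, because each variant stores only the data that exists in its mode.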
Yeah, this is very unsound because rustc emits noundef for almost all types. So this is immediately UB from LLVM's PoV, and under the current thinking for the Rust rules (with no real thought of it not being UB in the future).
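For comparison, here is a sketch of how a flag-plus-slot layout can avoid producing an uninitialized `T` at all: the field keeps the type `MaybeUninit<T>`, and `assume_init_ref` is only called when the flag says the slot was written. This is not something proposed in the thread; the type `FlaggedSlot` and its methods are hypothetical.

```rust
use std::mem::MaybeUninit;

/// Hypothetical layout that keeps a `single_thread` flag like the diff does,
/// but stores the possibly-unused slot as `MaybeUninit<T>` instead of `T`.
pub struct FlaggedSlot<T> {
    single_thread: bool,
    inner: MaybeUninit<T>,
}

impl<T> FlaggedSlot<T> {
    /// Single-threaded mode: the slot is initialized.
    pub fn single(value: T) -> Self {
        Self { single_thread: true, inner: MaybeUninit::new(value) }
    }

    /// Multi-threaded mode: the slot stays uninitialized and is never read.
    pub fn multi() -> Self {
        Self { single_thread: false, inner: MaybeUninit::uninit() }
    }

    /// Returns the value only when the slot is known to be initialized.
    pub fn get(&self) -> Option<&T> {
        if self.single_thread {
            // SAFETY: `inner` is initialized whenever `single_thread` is true (see `single`).
            Some(unsafe { self.inner.assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T> Drop for FlaggedSlot<T> {
    fn drop(&mut self) {
        if self.single_thread {
            // SAFETY: the slot was initialized in `single` and is dropped exactly once here.
            unsafe { self.inner.assume_init_drop() }
        }
    }
}
```

The difference from the diff is that `MaybeUninit::uninit()` is stored as-is; `assume_init()` is never called on memory that was not written.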
It makes sense. I am a little worried about the efficiency of using an enum or a union; maybe it is better to use inner: Option<T> here.
I don't quite understand what the efficiency problem of an enum or union would be. Checking the discriminant of an enum should be basically the same as checking if self.single_thread {, right?
(Additionally, using an Option would increase the size by adding the Option's discriminant in addition to single_thread.)
I have opened #109528 to test the performance of my suggestion.
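The size point can be checked with `std::mem::size_of`. The sketch below uses two hypothetical layouts with a `u64` payload (the real field types are not shown in this thread), and exact sizes depend on the target and compiler version, so it only compares them rather than asserting specific numbers.

```rust
use std::mem::size_of;

/// Hypothetical "flag plus Option" layout, loosely following the fields named in the thread.
#[allow(dead_code)]
pub struct FlagPlusOption {
    single_thread: bool,
    inner: Option<u64>,
    mt_inner: Vec<u64>,
}

/// Hypothetical enum layout where the discriminant replaces `single_thread`.
#[allow(dead_code)]
pub enum TwoVariants {
    Single(u64),
    Multi(Vec<u64>),
}

/// Returns `(size of FlagPlusOption, size of TwoVariants)` in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<FlagPlusOption>(), size_of::<TwoVariants>())
}
```

Since `u64` has no spare bit patterns, `Option<u64>` needs extra space for its discriminant, which the struct then pays for on top of the `bool` flag.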
Using an enum will prevent LLVM from making the best optimizations in many cases. For example, see the perf result of this commit: #101566 (comment)
And using a union will cause the compiler to add a lot of code that triggers unwinding on union access errors, which will also reduce the optimization effect.
But Option is also an enum, so it should have the same effect?
We can use unwrap, which is a const function, when single_thread, so I guess it is relatively more efficient.
☀️ Try build successful - checks-actions
Finished benchmarking commit (fe80553693db52675c166e07a4e745050855ba41): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Keep in mind that this needs to be an improvement with #107782 applied.
I think we will finish this work this month.
Closing this as it has already landed.
Part of #101566.
This PR refactors WorkerLocal for the parallel compiler, to facilitate code review and perf testing.
P.S. The refactored WorkerLocal is not Send or Sync. It depends on #107586 for thread safety.