-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite mpsc::shared #42883
Rewrite mpsc::shared #42883
Conversation
Previous version had too much state: `cnt`, `steals`, `port_dropped` fields, and it's too hard to consistently update all of them during upgrade. I tried to fix issue rust-lang#39364, but there are too many corner cases. In this version all of these fields removed, and shared state is basically managed by two fields: `queue` and `to_wait`. `to_wake` field now stores both `SignalToken` and "disconnected" flag. All tests still pass, and bug rust-lang#39364 no longer reproduces. Patch includes a couple of stress tests. This version should have the same performance characteristics, because roughly the same number of atomics and wait/notify operations involved.
(rust_highfive has picked a reviewer for you, use r? to override) |
I've pinged @alexcrichton, but I suspect he may not get to this until next week. |
Thanks for the ping, I'd definitely like to look at this but I won't be able to get to this until next week at the earliest, sorry for the delay! |
Sorry this is still in my queue, I haven't forgotten about this. I predict this will take a significant amount of time and energy to review, so I'm trying to make sure I'm ready for that. If others would like to review in the meantime that's also of course ok! |
friendly ping to save up some time and energy, @alexcrichton ! |
Unfortunately still haven't had the time to look into this. I'd welcome any others from @rust-lang/libs to help take a look if possible though. @stepancheg if you can also help review by detailing and changes and/or providing an architectural overview that would also be helpful. |
r? @aturon @stepancheg I wasn't quite able to tell from your PR description -- are you saying this fixes #39364, or that you were trying to fix it and it no longer reproduces, but you're unsure? In general, as @alexcrichton mentioned reviewing this kind of code is extremely expensive, so we've tended not to take changes unless they are strongly motivated. |
@aturon yes it fixes the issue #39364. The logic is different now, so that problem is very likely gone now, but I'm not 100% sure that I haven't introduced another bugs. As I said, before this patch code seems to be too complicated, and I believe there's no simple fix for that issue (I understood what was the problem, but couldn't code the fix; to be more precise I fixed it in my local copy, but I couldn't convince myself that it was correct, and I reverted it and instead made this patch). I'll try to write more more explanations of how patch works (a little later). And if anyone's have any specific questions about the patch (what does it do, and why some code is written how it is written), or point to missing docs in code, I'll gladly explain. |
@aturon do you wish to wait for the further explanation from @stepancheg before doing your review? Otherwise, this is your friendly 3-day ping for review. |
Yes, I'd appreciate getting more of an overview. |
@stepancheg have you had a chance to work on the explanation to help guide review yet? Just checking in! |
🕸 This PR had been inactive for 2 weeks. Closing it to keep our queue clean. Feel free to reopen. 🕸 |
is this going to make a comeback ? #39364 makes |
`concurrent_recv_timeout_and_upgrade` reproduces a problem 100% times on my MacBook with command: ``` ./x.py test --stage 0 ./src/test/run-pass/mpsc_stress.rs ``` Thus it is commented out. Other tests cases were useful for catching another test cases which may arise during the fix. This diff is a part of my previous rewrite attempt: rust-lang#42883 CC rust-lang#39364
Stress test for MPSC `concurrent_recv_timeout_and_upgrade` reproduces a problem 100% times on my MacBook with command: ``` ./x.py test --stage 0 ./src/test/run-pass/mpsc_stress.rs ``` Thus it is commented out. Other tests cases were useful for catching another test cases which may arise during the fix. This diff is a part of my previous rewrite attempt: rust-lang#42883 CC rust-lang#39364
Stress test for MPSC `concurrent_recv_timeout_and_upgrade` reproduces a problem 100% times on my MacBook with command: ``` ./x.py test --stage 0 ./src/test/run-pass/mpsc_stress.rs ``` Thus it is commented out. Other tests cases were useful for catching another test cases which may arise during the fix. This diff is a part of my previous rewrite attempt: rust-lang#42883 CC rust-lang#39364
Stress test for MPSC `concurrent_recv_timeout_and_upgrade` reproduces a problem 100% times on my MacBook with command: ``` ./x.py test --stage 0 ./src/test/run-pass/mpsc_stress.rs ``` Thus it is commented out. Other tests cases were useful for catching another test cases which may arise during the fix. This diff is a part of my previous rewrite attempt: rust-lang#42883 CC rust-lang#39364
Stress test for MPSC `concurrent_recv_timeout_and_upgrade` reproduces a problem 100% times on my MacBook with command: ``` ./x.py test --stage 0 ./src/test/run-pass/mpsc_stress.rs ``` Thus it is commented out. Other tests cases were useful for catching another test cases which may arise during the fix. This diff is a part of my previous rewrite attempt: rust-lang#42883 CC rust-lang#39364
Previous version had too much state:
cnt
,steals
,port_dropped
fields, and it's too hard to consistently update all of them during
upgrade. I tried to fix issue #39364, but there are too many corner
cases. In this version all of these fields removed, and shared
state is basically managed by two fields:
queue
andto_wait
.to_wake
field now stores bothSignalToken
and "disconnected"flag.
All tests still pass, and bug #39364 no longer reproduces.
Patch includes a couple of stress tests.
This version should have the same performance characteristics,
because roughly the same number of atomics and wait/notify operations
involved.