
polling pipes instead of blocking reads is inefficient on linux #30

Closed
the8472 opened this issue Mar 1, 2021 · 2 comments · Fixed by #31

Comments

@the8472 (Member) commented Mar 1, 2021

The approach outlined in this comment

https://github.com/alexcrichton/jobserver-rs/blob/9d5e6da2157a3db4e60e276185292d4f65cdaf0d/src/unix.rs#L118-L133

probably is inefficient on Linux. The kernel recently gained an optimization where a write to a pipe only wakes up one reader, as long as that reader empties the pipe: torvalds/linux@0ddad21 (note the wake_next_reader flag).

Waking up a polling process never empties the pipe, since the process first has to return to userspace before it can issue a read syscall. So the kernel ends up waking more readers than necessary, which leads to lots of unnecessary context switches and wasted CPU cycles.

It would be better to simply attempt a read from the fd first and only start polling when it returns EWOULDBLOCK.
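
For illustration, a minimal sketch of that read-first strategy (not the crate's actual code), assuming `fd` is the non-blocking read end of the jobserver pipe and calling the `libc` crate directly; the function name `acquire_token` is made up for this example:

```rust
use std::io;
use std::os::unix::io::RawFd;

/// Try to grab one jobserver token; only fall back to poll() once the
/// read actually reports an empty pipe.
fn acquire_token(fd: RawFd) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    loop {
        // Attempt the read first. If a token is available this consumes it
        // without ever parking the process on the pipe's poll waitqueue.
        let n = unsafe { libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, 1) };
        if n == 1 {
            return Ok(buf[0]);
        }
        if n == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "jobserver pipe closed",
            ));
        }
        let err = io::Error::last_os_error();
        let errno = err.raw_os_error();
        if errno == Some(libc::EWOULDBLOCK) || errno == Some(libc::EAGAIN) {
            // The pipe is empty: now it's worth sleeping in poll() until it
            // becomes readable, then looping around to retry the read.
            let mut pfd = libc::pollfd {
                fd,
                events: libc::POLLIN,
                revents: 0,
            };
            if unsafe { libc::poll(&mut pfd, 1, -1) } == -1 {
                let poll_err = io::Error::last_os_error();
                if poll_err.kind() != io::ErrorKind::Interrupted {
                    return Err(poll_err);
                }
            }
        } else if errno != Some(libc::EINTR) {
            return Err(err);
        }
    }
}
```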

It might even be beneficial to dup() the file descriptor and explicitly set the copy to blocking mode if the current process has no need for non-blocking operation.
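
And a sketch of the dup()-to-blocking idea, again using `libc` and a made-up helper name (`dup_as_blocking`). One caveat: O_NONBLOCK is a file status flag stored on the open file description, which dup()ed descriptors share, so clearing it here also affects other copies of the fd; hence the "only if the current process has no need for non-blocking operation" qualifier.

```rust
use std::io;
use std::os::unix::io::RawFd;

/// Duplicate the jobserver fd and clear O_NONBLOCK so a plain read()
/// sleeps in the kernel until a token arrives.
fn dup_as_blocking(fd: RawFd) -> io::Result<RawFd> {
    // Duplicate the descriptor (as suggested above) so this process has
    // its own copy to operate on.
    let copy = unsafe { libc::fcntl(fd, libc::F_DUPFD_CLOEXEC, 0) };
    if copy == -1 {
        return Err(io::Error::last_os_error());
    }
    // Fetch the current status flags and clear O_NONBLOCK. Note that the
    // status flags live on the shared open file description, so this also
    // affects the original fd and any other duplicates of it.
    let flags = unsafe { libc::fcntl(copy, libc::F_GETFL) };
    if flags == -1 {
        let err = io::Error::last_os_error();
        unsafe { libc::close(copy) };
        return Err(err);
    }
    if unsafe { libc::fcntl(copy, libc::F_SETFL, flags & !libc::O_NONBLOCK) } == -1 {
        let err = io::Error::last_os_error();
        unsafe { libc::close(copy) };
        return Err(err);
    }
    // A read() on `copy` now blocks in the kernel, which is exactly the
    // case the wake_next_reader optimization is designed for.
    Ok(copy)
}
```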

Edit: I included a lot of hedge words, but here's the word of god saying that poll-based wakeups indeed do not benefit from that optimization: https://lkml.org/lkml/2020/2/12/1019

@the8472 changed the title from "pipe read strategy may be inefficient" to "polling pipes instead of blocking reads is inefficient on linux" on Mar 1, 2021
@alexcrichton (Member) commented

Yeah, the thundering herd problem is one we've seen as a performance issue in the past with parallel rustc. Unfortunately, fixing this for Linux won't fix the issue for macOS or other Unix-like platforms, which exhibit the same poor performance. This is why we ended up concluding that Cargo would create a separate jobserver pipe for communicating with all sub-spawned rustc's instead of sharing the same jobserver amongst Cargo and rustc.

If you're curious to use this outside of Cargo/rustc, however, it seems trivial to at least fix this to read first. I don't think it will help all that much if you want cross-platform support, though, since the fix is Linux-specific.

@the8472 (Member, Author) commented Mar 2, 2021

I'm trying to parallelize the rustc bootstrap by running multiple of its steps under a jobserver. But it's far from ready, so improving jobserver performance is not high on my list yet; I just wanted to bring it up early.
