-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove std::io::lazy::Lazy in favour of SyncOnceCell #77154
Conversation
r? @dtolnay (rust_highfive has picked a reviewer for you, use r? to override) |
The (internal) std::io::lazy::Lazy was used to lazily initialize the stdout and stdin buffers (and mutexes). It uses atexit() to register a destructor to flush the streams on exit, and mark the streams as 'closed'. Using the stream afterwards would result in a panic. Stdout uses a LineWriter which contains a BufWriter that will flush the buffer on drop. This one is important to be executed during shutdown, to make sure no buffered output is lost. It also forbids access to stdout afterwards, since the buffer is already flushed and gone. Stdin uses a BufReader, which does not implement Drop. It simply forgets any previously read data that was not read from the buffer yet. This means that in the case of stdin, the atexit() function's only effect is making stdin inaccessible to the program, such that later accesses result in a panic. This is uncessary, as it'd have been safe to access stdin during shutdown of the program. --- This change removes the entire io::lazy module in favour of SyncOnceCell. SyncOnceCell's fast path is much faster (a single atomic operation) than locking a sys_common::Mutex on every access like Lazy did. However, SyncOnceCell does not use atexit() to drop the contained object during shutdown. As noted above, this is not a problem for stdin. It simply means stdin is now usable during shutdown. The atexit() call for stdout is moved to the stdio module. Unlike the now-removed Lazy struct, SyncOnceCell does not have a 'gone and unusable' state that panics. Instead of adding this again, this simply replaces the buffer with one with zero capacity. This effectively flushes the old buffer *and* makes any writes afterwards pass through directly without touching a buffer, making print!() available during shutdown without panicking.
Two interesting minor differences before/after: 1: use std::{thread, io, mem, time::Duration};
fn main() {
print!("hello");
thread::spawn(|| mem::forget(io::stdout()));
thread::sleep(Duration::from_secs(1));
} Before this change, this outputs nothing: The Arc count does not drop back to 0, the stdout buffer is not dropped/flushed. 2: use std::{thread, io, mem, time::Duration};
fn main() {
print!("hello");
thread::spawn(|| mem::forget(io::stdout().lock()));
thread::sleep(Duration::from_secs(1));
} Before this change, this outputs (Adding a |
@rustbot modify labels: +T-libs +A-io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks amazing to me. Thank you @m-ou-se!
To get some eyes on the externally observable behavior changes in #77154 (comment): |
Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
IMO all the externally visible differences are bug fixes. This is great! |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
@bors r+ |
📌 Commit 6f9c132 has been approved by |
Maybe add comments explaining why this is ignored? |
Done. It might be the case that adding this in the test would also solve it: #![cfg_attr(target_os = "emscripten", feature(link_args))]
#[cfg(target_os = "emscripten")]
#[allow(unused_attributes)]
#[link_args = "-s EXIT_RUNTIME=1"]
extern "C" {} But I'm unable to test this locally. And I don't think this test really matters on emscripten. Edit: Tested it. Kind of works, but still fails: with EXIT_RUNTIME enabled Emscripten appends a newline by itself, failing the test. So I'll just leave the |
@bors r=dtolnay rollup=iffy |
📌 Commit 6b8b9c4 has been approved by |
☀️ Test successful - checks-actions, checks-azure |
…mutex-attr-cleanup, r=Mark-Simulacrum Remove unnecessary rustc_const_stable attributes. These attributes were added in rust-lang#74033 (comment) because of [std::io::lazy::Lazy::new](https://github.com/rust-lang/rust/blob/0c03aee8b81185d65b5821518661c30ecdb42de5/src/libstd/io/lazy.rs#L21-L23). But [std::io::lazy::Lazy is gone now](rust-lang#77154), so this can be cleaned up. @rustbot modify labels: +T-libs +C-cleanup
…mutex-attr-cleanup, r=Mark-Simulacrum Remove unnecessary rustc_const_stable attributes. These attributes were added in rust-lang#74033 (comment) because of [std::io::lazy::Lazy::new](https://github.com/rust-lang/rust/blob/0c03aee8b81185d65b5821518661c30ecdb42de5/src/libstd/io/lazy.rs#L21-L23). But [std::io::lazy::Lazy is gone now](rust-lang#77154), so this can be cleaned up. @rustbot modify labels: +T-libs +C-cleanup
…mutex-attr-cleanup, r=Mark-Simulacrum Remove unnecessary rustc_const_stable attributes. These attributes were added in rust-lang#74033 (comment) because of [std::io::lazy::Lazy::new](https://github.com/rust-lang/rust/blob/0c03aee8b81185d65b5821518661c30ecdb42de5/src/libstd/io/lazy.rs#L21-L23). But [std::io::lazy::Lazy is gone now](rust-lang#77154), so this can be cleaned up. @rustbot modify labels: +T-libs +C-cleanup
…=Mark-Simulacrum Enforce no-move rule of ReentrantMutex using Pin and fix UB in stdio A `sys_common::ReentrantMutex` may not be moved after initializing it with `.init()`. This was not enforced, but only stated as a requirement in the comments on the unsafe functions. This change enforces this no-moving rule using `Pin`, by changing `&self` to a `Pin` in the `init()` and `lock()` functions. This uncovered a bug I introduced in rust-lang#77154: stdio.rs (the only user of ReentrantMutex) called `init()` on its ReentrantMutexes while constructing them in the intializer of `SyncOnceCell::get_or_init`, which would move them afterwards. Interestingly, the ReentrantMutex unit tests already had the same bug, so this invalid usage has been tested on all (CI-tested) platforms for a long time. Apparently this doesn't break badly on any of the major platforms, but it does break the rules.\* To be able to keep using SyncOnceCell, this adds a `SyncOnceCell::get_or_init_pin` function, which makes it possible to work with pinned values inside a (pinned) SyncOnceCell. Whether this function should be public or not and what its exact behaviour and interface should be if it would be public is something I'd like to leave for a separate issue or PR. In this PR, this function is internal-only and marked with `pub(crate)`. \* Note: That bug is now included in 1.48, while this patch can only make it to ~~1.49~~ 1.50. We should consider the implications of 1.48 shipping with a wrong usage of `pthread_mutex_t` / `CRITICAL_SECTION` / .. which technically invokes UB according to their specification. The risk is very low, considering the objects are not 'used' (locked) before the move, and the ReentrantMutex unit tests have verified this works fine in practice. Edit: This has been backported and included in 1.48. And soon 1.49 too. --- In future changes, I want to push this usage of Pin further inside `sys` instead of only `sys_common`, and apply it to all 'unmovable' objects there (`Mutex`, `Condvar`, `RwLock`). Also, while `sys_common`'s mutexes and condvars are already taken care of by rust-lang#77147 and rust-lang#77648, its `RwLock` should still be made movable or get pinned.
The (internal) std::io::lazy::Lazy was used to lazily initialize the stdout and stdin buffers (and mutexes). It uses atexit() to register a destructor to flush the streams on exit, and mark the streams as 'closed'. Using the stream afterwards would result in a panic.
Stdout uses a LineWriter which contains a BufWriter that will flush the buffer on drop. This one is important to be executed during shutdown, to make sure no buffered output is lost. It also forbids access to stdout afterwards, since the buffer is already flushed and gone.
Stdin uses a BufReader, which does not implement Drop. It simply forgets any previously read data that was not read from the buffer yet. This means that in the case of stdin, the atexit() function's only effect is making stdin inaccessible to the program, such that later accesses result in a panic. This is uncessary, as it'd have been safe to access stdin during shutdown of the program.
This change removes the entire io::lazy module in favour of SyncOnceCell. SyncOnceCell's fast path is much faster (a single atomic operation) than locking a sys_common::Mutex on every access like Lazy did.
However, SyncOnceCell does not use atexit() to drop the contained object during shutdown.
As noted above, this is not a problem for stdin. It simply means stdin is now usable during shutdown.
The atexit() call for stdout is moved to the stdio module. Unlike the now-removed Lazy struct, SyncOnceCell does not have a 'gone and unusable' state that panics. Instead of adding this again, this simply replaces the buffer with one with zero capacity. This effectively flushes the old buffer and makes any writes afterwards pass through directly without touching a buffer, making print!() available during shutdown without panicking.
In addition, because the contents of the SyncOnceCell are no longer dropped, we can now use
&'static
instead ofArc
inStdout
andStdin
. This also saves two levels of indirection instdin()
andstdout()
, since Lazy effectively stored aBox<Arc<T>>
, and SyncOnceCell stores theT
directly.