-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't ignore generator fields in miri #60889
Comments
In Miri and in layout code, and for the same reason. |
Well, it gives us a "more correct" layout for generators, for some notion of correctness. Also technically we wouldn't have to do this for every field, if you know which fields can be partially initialized for a particular yield point you can encode that. And that might give us layout optimizations. |
Marking as "deferred" from a stabilization perspective. |
Generator optimization: Overlap locals that never have storage live at the same time The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization. More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.** To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with. Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout. For the layout computation, we use a simplified approach. 1. Start with the set of locals assigned to only one variant. The rest are disqualified. 2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping. Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone". So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this: You can think of a generator as an enum, where some fields are shared between variants. e.g. ```rust enum Generator { Unresumed, Poisoned, Returned, Suspend0(A, B, X), Suspend1(B), Suspend2(A, Y, Z), } ``` where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them. Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time. --- Depends on: - [x] #59897 Multi-variant layouts for generators - [x] #60840 Preserve local scopes in generator MIR - [x] #61373 Emit StorageDead along unwind paths for generators Before merging: - [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889) - [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location) - [x] Clean up TODO - [x] Fix the layout code to enforce that the same field never moves around in the generator - [x] Add tests for async/await - [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity) - [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at #59897 (comment))
…ddyb Generator optimization: Overlap locals that never have storage live at the same time The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization. More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.** To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with. Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout. For the layout computation, we use a simplified approach. 1. Start with the set of locals assigned to only one variant. The rest are disqualified. 2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping. Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone". So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this: You can think of a generator as an enum, where some fields are shared between variants. e.g. ```rust enum Generator { Unresumed, Poisoned, Returned, Suspend0(A, B, X), Suspend1(B), Suspend2(A, Y, Z), } ``` where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them. Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time. --- Depends on: - [x] rust-lang#59897 Multi-variant layouts for generators - [x] rust-lang#60840 Preserve local scopes in generator MIR - [x] rust-lang#61373 Emit StorageDead along unwind paths for generators Before merging: - [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened rust-lang#60889) - [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location) - [x] Clean up TODO - [x] Fix the layout code to enforce that the same field never moves around in the generator - [x] Add tests for async/await - [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity) - [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at rust-lang#59897 (comment))
…ized, r=Centril Make use of possibly uninitialized data [E0381] a hard error This is one of the behaviors we no longer allow in NLL. Since it can lead to undefined behavior, I think it's definitely worth making it a hard error without waiting to turn off migration mode (rust-lang#58781). Closes rust-lang#60450. My ulterior motive here is making it impossible to leave variables partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so tests are included for that. cc rust-lang#54987 --- I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me. r? @Centril cc @cramertj @nikomatsakis @pnkfelix @RalfJung
…ized, r=Centril Make use of possibly uninitialized data [E0381] a hard error This is one of the behaviors we no longer allow in NLL. Since it can lead to undefined behavior, I think it's definitely worth making it a hard error without waiting to turn off migration mode (rust-lang#58781). Closes rust-lang#60450. My ulterior motive here is making it impossible to leave variables partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so tests are included for that. cc rust-lang#54987 --- I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me. r? @Centril cc @cramertj @nikomatsakis @pnkfelix @RalfJung
…ized, r=Centril Make use of possibly uninitialized data [E0381] a hard error This is one of the behaviors we no longer allow in NLL. Since it can lead to undefined behavior, I think it's definitely worth making it a hard error without waiting to turn off migration mode (rust-lang#58781). Closes rust-lang#60450. My ulterior motive here is making it impossible to leave variables partially initialized across a yield (see rust-lang#60889, discussion at rust-lang#63035), so tests are included for that. cc rust-lang#54987 --- I'm not sure if bypassing the buffer is a good way of doing this. We could also make a `force_errors_buffer` or similar that gets recombined with all the errors as they are emitted. But this is simpler and seems fine to me. r? @Centril cc @cramertj @nikomatsakis @pnkfelix @RalfJung
@RalfJung IIRC, you're basically relying on an implementation detail of how MIR gets built, which is that it doesn't leave partially-initialized aggregates in the generator prior to But if this changed, it would probably mean the layouts we report for generators are wrong. So it's probably best if this gets relied on and tested by miri (in addition to testing in rustc). Then if we change the way MIR builds aggregates, we'll have a test failure to let us know we have to wrap generator fields in Just to double check -- would this test fail if the aggregate were actually created? |
It would definitely fail if Miri's layout-guided recursive descend would ever actually hit a field of
So, in that test case, the generator does not preserve a |
I would not call this an implementation detail. If we were going to change this, we'd have to have some other modification to MIR to "signal" when the aggregate is fully constructed (it's relevant to borrow check, for example, and it affects visible drop order as well). Not to say we can't change it, but we can't do it lightly and without affecting other systems. |
Correct.
Okay, that makes me feel better then :) |
I suppose this means we can close this issue? I also extended the Miri test to try harder to catch invalid generator states by doing typed (and thus validated) copies of the entire generator after each yield (and then discarding them again, so this does not constitute "moving" the generator). |
Yep! That looks great, closing now. |
Right now, saved locals in generators are special-cased in miri, because they might not always be initialized.
With #59897 and #60187, we have the framework to fix this: fields are now only put in generator variants where they are live. The problem remains with aggregate fields, however, e.g.
In this case
x.1
is never initialized. This still presents a special case, because we cannot have a partially-unitialized field in a regular struct or enum type.One way to remove the special-casing in miri is to wrap the type of every generator field in
MaybeUninitialized
when we return it fromrustc::ty::layout::field()
. This required makingMaybeUnitialized
a lang item, though, and doesn't actually give us any new features.Eventually, I think we'd like to handle aggregate fields explicitly, so we can sanitize UB in generators.
The text was updated successfully, but these errors were encountered: