Box<dyn FnOnce> doesn't respect self alignment #68304
Comments
This is a soundness hole. |
So... I guess the problem is that the If this is correct, the issue should be renamed to something like "unsized locals do not respect alignment". |
A way to fix this would be to change the ABI for dynamic trait calls that take |
This can also be fixed by not allowing dynamically-sized local variables with an alignment more than some maximum value and having all dynamically-sized locals use that maximum alignment unless optimizations can prove that a smaller alignment suffices. |
Messing with the ABI wouldn't help with unsized locals in general (assuming @RalfJung is correct), since e.g.
Given only a |
Can we call |
In clang there's a function |
int main(void) {
int alignment = 16;
__builtin_alloca_with_align(16, alignment);
} |
I think that overallocating for unsized locals of unknown alignment may make sense. You can Sure, it increases stack requirements, but it seems acceptable for |
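The over-allocation idea can be sketched in safe Rust as follows. This only illustrates the arithmetic (reserve `size + align - 1` bytes, then round the start address up); it is not how rustc or LLVM lowers unsized locals, and the helper names `aligned_offset` and `carve_aligned` are made up for this example.

```rust
/// Offset into `buf` of the first byte whose address is a multiple of `align`.
/// `align` must be a power of two (hypothetical helper for this sketch).
fn aligned_offset(buf: &[u8], align: usize) -> usize {
    assert!(align.is_power_of_two());
    let addr = buf.as_ptr() as usize;
    // Round the address up to the next multiple of `align`.
    let aligned = (addr + align - 1) & !(align - 1);
    aligned - addr
}

/// Carves a `size`-byte, `align`-aligned slot out of an over-allocated buffer.
/// The buffer must hold at least `size + align - 1` bytes.
fn carve_aligned(buf: &mut [u8], size: usize, align: usize) -> &mut [u8] {
    let off = aligned_offset(buf, align);
    &mut buf[off..off + size]
}

fn main() {
    // Size and alignment are only known at run time, as with a `dyn` value.
    let (size, align) = (24, 256);
    // Worst case wastes `align - 1` bytes of padding in front of the slot.
    let mut storage = vec![0u8; size + align - 1];
    let slot = carve_aligned(&mut storage, size, align);
    assert_eq!(slot.as_ptr() as usize % align, 0);
    assert_eq!(slot.len(), size);
}
```

The cost is up to `align - 1` extra bytes per allocation, which is the stack-usage trade-off mentioned above.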
if we keep setting |
My idea was that rustc would prohibit coercing overaligned types to a dyn trait that takes self by value:
#[repr(align(1048576))] // 1MiB
struct Overaligned(u8);

#[repr(align(16))]
struct Normal(u8);

trait TakeSelfByValue {
    fn f(self) {
        todo!()
    }
}

impl TakeSelfByValue for Overaligned {}
impl TakeSelfByValue for Normal {}

fn make_overaligned() -> Box<dyn TakeSelfByValue> {
    Box::new(Overaligned(0)) // errors due to too strict alignment
}

fn make_normal() -> Box<dyn TakeSelfByValue> {
    Box::new(Normal(0)) // doesn't error -- alignment is under limit
} |
The alignment is dynamically determined though, not sure if saving 15 bytes of stack space is worth the conditional branch? |
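To make "dynamically determined" concrete, here is a small sketch using `std::mem::align_of_val`, which for a trait object reports the alignment of the underlying concrete type at run time. The `Marker` trait and the two structs are placeholders invented for this example, not code from the compiler.

```rust
use std::mem::align_of_val;

#[repr(align(256))]
struct Overaligned(u8);

#[repr(align(16))]
struct Normal(u8);

// Placeholder trait, only used to build trait objects.
trait Marker {}
impl Marker for Overaligned {}
impl Marker for Normal {}

/// Alignment of whatever concrete type sits behind the trait object.
fn dyn_align(value: &dyn Marker) -> usize {
    align_of_val(value)
}

fn main() {
    let boxes: Vec<Box<dyn Marker>> =
        vec![Box::new(Overaligned(0)), Box::new(Normal(0))];
    for b in &boxes {
        let align = dyn_align(&**b);
        let addr = &**b as *const dyn Marker as *const () as usize;
        // The heap allocation made by `Box::new` honours the alignment.
        assert_eq!(addr % align, 0);
        println!("dynamic alignment: {align}");
    }
}
```

Code that only sees `&dyn Marker` cannot know this number statically, so a stack copy of the value would have to align itself at run time.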
Oh right. Yea that's probably not helpful. |
Since |
I was under the impression that calling |
Correctness shouldn't rely on optimizations to fire... |
@RalfJung Sure, I was just hoping we weren't relying on "unsized locals", only unsized parameters (i.e. I wonder if it would even be possible to restrict "unsized locals" to the subset where we can handle it entirely through ABI, and not actually have dynamic Ironically, if that works, it's also the subset we could've implemented years earlier than we did, since most of the complexity, unknowns (and apparently bugs) are around dynamic |
That should be possible. I think everything that is necessary is performing the deref of the box as argument ( |
@bjorn3 Hmm, but that's not trivially sound because it could lead to MIR calls where argument and return places overlap, which is UB. We'd potentially have to special-case it to Maybe we could introduce a temporary for the return value instead, for these kinds of virtual calls? That should remove any potential overlap while allowing the call. (Unrelatedly, I just realize we use a copying-out shim for vtable entries that take |
That would result in (I meant for this change to be done at mir construction level, not as an optimization pass, so borrowck should catch any soundness issues) |
Hmm, that's a good point, I think you can do that. Sigh. I knew that seemed too easy. I remember we used to try and prevent that, but we never did it fully, and in the move to NLL we loosened those rules back up, so this compiles:
fn foo() {
    let mut x = Box::new(vec![22]);
    let y = take(*x);
    *x = vec![44];
}
fn take<T>(x: T) -> T { x }
Still, you could do my proposal for unsized types, though it would introduce an incongruity. UPDATE: Oh, wait, I remember now. For unsized types, reinitialization is not possible, because you can't do |
To be clear, I wasn't proposing changing anything apart from MIR building either, I don't think. IIUC, what you proposed is that
let tmp1 = ..;
..
let tmpN = ..;
foo(*x, tmp1 .. tmpN)
whereas under my proposal (limited to apply to unsized parameters, instead of moved parameters) would yield:
let tmp0 = x;
...
let tmpN = ...;
foo(*tmp0, tmp1..tmpN)
I think I mis-stated the problem earlier. It's not so much that your proposal will introduce compilation errors, it's that I think it changes the semantics of code like this in what I see as a surprising way:
x.foo({ x = Box::new(...); ... })
I believe that this code would compile to the following under your proposal:
which means that Under my proposal, the code would still compile, but it would behave in what I see as the expected fashion. Specifically, it would generate:
|
Isn't there a way to "lock" that value through the borrow-checker, so further accesses would be disallowed, until the call returns? I'd rather be stricter here.
My point was that if your proposal affects sized moves, you're breaking stable reinitialization.
|
Let me just chime in here and say that this is a pain in Miri. Unsized locals are pretty different from sized locals, I had to introduce a bunch of special hacks to make them work. MIR locals even needed a new state just for them: rust/src/librustc_mir/interpret/eval_context.rs Lines 123 to 127 in 6050e52
Lazily allocating locals seems odd to me, but at least we can consistently do that for all locals. Elsewhere however we needed some special treatment to make this lazy initialization actually work out and catch mistakes where an unsized local is "overwritten" with a value of different size: rust/src/librustc_mir/interpret/place.rs Lines 891 to 901 in 6050e52
This all feels really unprincipled, and it would be great if at some point we could come up with a coherent semantics for unsized locals in MIR. |
I don't think we have anything like this today, but presumably we could add such a thing, but I'm not quite sure what you have in mind. When does this prohibition on further accesses take effect? In particular, the IR for
But the access to
Well, it depends on your perspective. It's true that, given the need to permit reinitialization, the check is now specific to unsized values -- i.e., MIR construction will be different depending on whether we know the type to be sized or not. That's unfortunate. However, I think we still maintain a degree of congruence, from an end-user perspective. In particular, users could not have reinitialized a However, under my proposal, users can write things like x.foo({x = ...; ...}) and the code works as expected (versus either having an unexpected result or getting an error). In any case, it seems like we're sort of "narrowing in" on a proposal here:
Does that sound right so far? |
If it's easier to do the more flexible approach (by moving the pointer) than the "locking" approach (which is a bit like borrowing the pointer until all arguments are evaluated, then releasing the borrow and doing a move instead), I don't mind it, I was only arguing with the "uniformity" aspect.
Something important here: your approach is stabilizable, while mine is meant more for that one |
Yes, I was kind of shooting for something that we could conceivably stabilize. |
Removing nomination, this was already discussed in triage meeting and it's under control. |
So should we maybe try out the approach I proposed and see how well it works? |
@nikomatsakis yes, going to try that out. |
For the record, the "new plan" is this.
Change to MIR construction
When generating the arguments to a call, we ordinarily will make a temporary
But we will modify this such that, for arguments that meet the following conditions:
then we assign the place P (instead of This means that given
but now we generate
Rationale
We are now moving the box itself, which is slightly different than the semantics for values known to be sized. In particular, it does not permit "re-initialization" of the box after having moved its contents. But that is not supported for unsized types anyway. The change does permit modifications to the callee
x.foo({ x = Box::new(...); ... })
we will find that we (a) save the old value of |
Update: There is also some discussion of implementation details on Zulip |
This needs MIR datastructure changes, right? It is not currently representable I think as fn arguments are |
|
Oh sorry, somehow I thought operands could only refer directly to locals, not also dereference them. I think I am beginning to see the pieces here -- together with @eddyb-style On the other hand this places a severe restriction on how unsized locals may be used, basically taking back large parts of rust-lang/rfcs#1909. Is there some long-term plan to bring that back (probably requires |
I would keep the more general feature but use two different feature-gates (if we can). And just like with a few other feature-gates ( We could also add runtime asserts that the (dynamic) alignment is respected, once we know |
I was wondering about that too. It seems like it'd be good to review the more general feature in any case. This is partly why I brought up the idea of wanting a guaranteed "no alloca" path -- this was something I remember us discussing long ago, and so in some sense I feel like the work we're doing here isn't just working around LLVM's lack of dynamic alignment support (though it is doing that, too) but also working to complete the feature as originally envisioned. |
I added two notes to the tracking issue #48055 so we don't overlook this in the future. |
This is probably expected, but the fix failed to actually make |
The following code fails on stable:
When compiled in release mode, the "optimized out" assert is optimized out by LLVM even though it checks the exact same condition as the other assert, as can be verified by testing in debug mode.
Note that it's theoretically possible for the stack to be aligned correctly such that the bug is suppressed, but that's not likely. The 256's can be replaced with larger alignments if that happens.
The bug is caused by the impl of Box<dyn FnOnce> copying self to the stack with the default alignment of 16.
(Playground)