Do #[repr(Rust)] unions have internal padding? #354
Comments
It's also an LLVM lowering question, we might have to do some work to ensure that LLVM does not consider any |
Is there a scenario where it's an advantage to have internal padding within a union? |
It can provide niches. |
Padding can't provide niches because it's subject to clobbering. |
Technically one advantage is that padding allows calling convention optimizations. I also think it would be a bit weird and possibly unintuitive if overaligned unions with tail padding didn't treat it as proper padding? |
The extra work to tell LLVM there is no padding is "simple enough"; lower the union as if it had a variant which contains no padding. If we do allow interior padding in I'm torn on whether
|
That's what we already do, based on the discussion in rust-lang/rust#97712 (comment) |
That implementation is interesting, as IIUC it's relying on The actual behavior of the pointer copy APIs is an orthogonal decision issue from whether unions have padding, though. Silly libs patch to avoid relying on non-padding of union in
|
@scottmcm yeah I don't see how that commit relates to how we lower unions. Currently we don't seem to be doing this right, this
pub union U {
    f: (u8, u16),
    g: (),
}
pub fn u(u: U) {}
becomes |
Ironically, making it |
No, it is actually correct for |
Oh I see. But anyway, for |
use std::mem::MaybeUninit;
unsafe fn print(x: MaybeUninit::<(u16, u8)>) {
    // Reinterpret all 4 bytes, including the tuple's padding byte, as a u32.
    let value: u32 = std::mem::transmute(x);
    if value != 0xaabbccdd {
        panic!("got {value:x}");
    }
}
fn main() {
    let mut x = MaybeUninit::<(u16, u8)>::uninit();
    unsafe {
        // Write all 4 bytes, padding included, then pass the value by value.
        x.as_mut_ptr().cast::<u32>().write_unaligned(0xaabbccdd);
        print(x);
    }
}
Funky, I didn't know that failed :) I'm assuming this isn't UB, and that it is indeed illegal for Rust to zero the padding here, seeing as Miri runs it perfectly fine. Is it worth me making an issue on rust-lang/rust to track this? I can't seem to find an existing one. |
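For reference, the example above depends on `(u16, u8)` occupying more bytes than its fields need. The following is a minimal sketch, not taken from the thread, that measures this. The helper name `tuple_padding_bytes` is made up for illustration, and the numbers depend on the target: on common targets where `u16` has alignment 2, the tuple is 4 bytes with 1 byte of padding, though the layout of tuples is not formally guaranteed.

```rust
use std::mem::{size_of, MaybeUninit};

/// Number of bytes in `(u16, u8)` that belong to neither field.
/// (Hypothetical helper, for illustration only.)
fn tuple_padding_bytes() -> usize {
    size_of::<(u16, u8)>() - (size_of::<u16>() + size_of::<u8>())
}

fn main() {
    // MaybeUninit<T> is guaranteed to have the same size as T, which is
    // why transmuting MaybeUninit<(u16, u8)> to u32 compiles when the
    // tuple is 4 bytes wide.
    assert_eq!(
        size_of::<MaybeUninit<(u16, u8)>>(),
        size_of::<(u16, u8)>()
    );
    println!(
        "size = {}, padding bytes = {}",
        size_of::<(u16, u8)>(),
        tuple_padding_bytes()
    );
}
```

The byte counted as padding here is the one whose value the `print` example tries to observe after the by-value call.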
Just a note to make sure it's noted in the discussion: And (Tuples are However, Miri IIRC doesn't do any padding deinit at the moment, so Miri accepting it isn't necessarily an indicator that it's intended to be supported. |
Ah yes, I had entirely forgotten about that. :( And we need that for performance at least in some cases, like SIMD. But Rust pairs don't have a defined ABI, so we could maybe? use a single |
I think it would be a violation of the contract of repr(transparent) unless we changed the ABI of the pair. And I'm not a fan of that at all. |
Union |
IOW, it's (almost certainly) considered valid to link It could be allowed — roughly, |
|
The same effect can be achieved using |
I should note one of the reasons for |
@joshtriplett's (implicitly) made the case on #368 that whichever repr has preserve-all-bits semantics should preserve tail padding bits, not just interior padding bits. However, the preservation of tail padding has significant consequences on unions with highly aligned types. Consider |
I agree with Josh, I would find it rather surprising that tail padding is lost in a repr advertised as "preserving all bits". Copying 1 byte of a cacheline vs copying the entire cacheline is unlikely to make a big difference in practice, is it? Also if you want the speed benefit of dropping some of the bits, then maybe you shouldn't be requesting a preserving-all-bits representation? |
On Mon, Oct 31, 2022 at 14:32 Ralf Jung wrote:
> I agree with Josh, I would find it rather surprising that tail padding is lost in a repr advertised as "preserving all bits".
> Copying 1 byte of a cacheline vs copying the entire cacheline is unlikely to make a big difference in practice, is it?

Not much aside from the icache waste of extra instructions, but you aren't copying just 1 cacheline of mostly padding; you're copying a whole extra cacheline of entirely padding.

> Also if you want the speed benefit of dropping some of the bits, then maybe you shouldn't be requesting a preserving-all-bits representation?
|
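To put numbers on the cacheline exchange above, here is a small sketch with a hypothetical over-aligned type. It is not the example elided from the earlier comment; the names `OverAligned` and `Slot` are assumptions, and the union is marked `#[repr(C)]` only so that its size is fully specified.

```rust
use std::mem::{align_of, size_of};

/// One byte of data, forced to 64-byte (cacheline-sized) alignment.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C, align(64))]
struct OverAligned(u8);

/// A union whose largest variant is mostly tail padding.
#[allow(dead_code)]
#[repr(C)]
union Slot {
    value: OverAligned,
    unit: (),
}

/// Bytes of `Slot` that follow the single data byte.
fn tail_padding() -> usize {
    size_of::<Slot>() - size_of::<u8>()
}

fn main() {
    println!(
        "size = {}, align = {}, tail padding = {}",
        size_of::<Slot>(),
        align_of::<Slot>(),
        tail_padding()
    );
}
```

Whether a copy of such a value must carry those trailing bytes along is the point the two sides above disagree on.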
The reason I brought it up is that This has an impact on this discussion, because it means that FWIW, while "preserve all bits up to trailing padding" is more complicated than "preserve all bits," I don't think it's necessarily impractically more complicated. This is in contrast to internal padding, which is much harder to define if/when it is preserved. It's perhaps also worth noting that eventually it'll be possible to layer "preserve all bits" semantics on top:
#![feature(generic_const_exprs)]
use std::mem::{size_of, ManuallyDrop, MaybeUninit};
pub union PreserveAllBits<T>
where
    [(); size_of::<T>()]:,
{
    pub inner: ManuallyDrop<T>,
    pub bytes: [MaybeUninit<u8>; size_of::<T>()],
}
|
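The generic version above needs a nightly feature. As a rough sketch of the same idea on stable Rust, the union can be written for one concrete type, since `size_of` of a concrete type is usable as an array length. The names `Pair`, `PairBits`, `read_back` and `round_trip` are hypothetical. The round trip below passes with current compilers as far as I know, but whether the language guarantees it is exactly what this thread is discussing, so treat it as an illustration rather than a promise.

```rust
use std::mem::{size_of, ManuallyDrop, MaybeUninit};

type Pair = (u16, u8);

/// Concrete stand-in for the generic union: the `bytes` variant has
/// no padding anywhere, so every byte is data in at least one variant.
#[allow(dead_code)]
union PairBits {
    inner: ManuallyDrop<Pair>,
    bytes: [MaybeUninit<u8>; size_of::<Pair>()],
}

/// Takes the union by value and reads every byte back out.
fn read_back(p: PairBits) -> Vec<u8> {
    // SAFETY: `round_trip` initialized every element of `bytes`.
    let bytes = unsafe { p.bytes };
    bytes.iter().map(|b| unsafe { b.assume_init() }).collect()
}

/// Fills each byte with its own index, then moves the union into a call.
fn round_trip() -> Vec<u8> {
    let p = PairBits {
        bytes: std::array::from_fn(|i| MaybeUninit::new(i as u8)),
    };
    read_back(p)
}

fn main() {
    println!("{:?}", round_trip());
}
```

Unlike the `MaybeUninit<(u16, u8)>` example earlier in the thread, this union is not `repr(transparent)` over the tuple, so it does not inherit the tuple's by-value passing convention.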
It's 1 byte of (potential) data and 63 bytes of padding. We can't omit that 1 byte copy anyway since there might be data in there. What's wrong with just using a |
I think we can have a recursive definition of what are the "padding bytes" of a type:
I wouldn't even make "trailing padding" a term that comes up in this specification. |
So here's an argument for This might even argue against preserving internal padding, depending on how the AM copy of a Rust Enum happens. A copy is necessarily a read, so there's nothing preventing the AM from only copying the "active variant" of a discriminated enum. (Disclaimer: I don't know what my answer to the question this poses is. However, it has made me at least more sympathetic to the option of |
For anyone curious, this is how |
Forked from #156. The question is specifically whether a union type has non-tail padding when every variant has padding at a particular byte, meaning that the padding could be clobbered. @RalfJung says no, because he would like unions to simply be [byte; N]. Nobody has made a case for any other answer, as far as I'm aware. But making a separate issue since it's a separate question about the #[repr(Rust)] ABI and what we want to guarantee.
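As an illustration of "every variant has padding at a particular byte", here is a sketch under stated assumptions: the types `A`, `B` and `U` are hypothetical, and they use `#[repr(C)]` so that field offsets are well defined (the issue itself is about `#[repr(Rust)]`, where offsets are unspecified). The helper names are also made up. It marks each byte that holds field data in some variant and reports the rest.

```rust
use std::mem::{offset_of, size_of};

#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C)]
struct A {
    x: u8,
    y: u16,
}

#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C)]
struct B {
    x: u8,
    y: u32,
}

#[allow(dead_code)]
#[repr(C)]
union U {
    a: A,
    b: B,
}

/// Marks the bytes covered by one field as data.
fn mark(data: &mut [bool], offset: usize, len: usize) {
    for byte in &mut data[offset..offset + len] {
        *byte = true;
    }
}

/// For each byte of `U`, reports whether no field of any variant covers it.
/// Both variants start at offset 0 because `U` is `repr(C)`.
fn padding_in_every_variant() -> Vec<bool> {
    let mut data = vec![false; size_of::<U>()];
    mark(&mut data, offset_of!(A, x), size_of::<u8>());
    mark(&mut data, offset_of!(A, y), size_of::<u16>());
    mark(&mut data, offset_of!(B, x), size_of::<u8>());
    mark(&mut data, offset_of!(B, y), size_of::<u32>());
    data.iter().map(|&is_data| !is_data).collect()
}

fn main() {
    for (i, is_padding) in padding_in_every_variant().iter().enumerate() {
        let label = if *is_padding {
            "padding in every variant"
        } else {
            "data in some variant"
        };
        println!("byte {i}: {label}");
    }
}
```

On targets where `u16` and `u32` have alignments 2 and 4, only the byte between `x` and `y` is left unmarked. That byte is the kind of interior padding the question is about; under the "[byte; N]" view it would still be preserved by copies of the union.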