Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repr(C) does not always match the current target's C toolchain (when that target is windows-msvc) #521

Open
RalfJung opened this issue Aug 4, 2024 · 7 comments

Comments

@RalfJung
Copy link
Member

RalfJung commented Aug 4, 2024

There are some cases where the layout we generate for repr(C) types does not match the current target's C "default" toolchain. The ones I am aware of are all cases where we allow type declarations that go beyond standard C, and we match the behavior of GCC/clang, but that does not match MSVC:

In many cases part of the problem is that different toolchains behave differently, so in Rust we have to make the choice between "layout that is portable across many targets" and "layout that matches the current target's C toolchain". However, given that the name of the repr in C and given the crABI project to define a portable ABI, it seems most reasonable to say that repr(C) should match whatever C does on the given target. That is also what the lang team decided (but not FCP'd as far as I can see). But of course that would be a breaking change...

These issues have been stuck for a while, and I am not sure what is the best way to make progress. Maybe we should have a post-mono lint saying "hey you defined this repr(C) type but on the current target that doesn't actually behave the way C does, so behavior is planned change in the future"? (It has to be post-mono in general because with generics, it's often not possible to detect pre-mono whether a type falls into this category or not.)

@ChrisDenton
Copy link
Member

However, given that the name of the repr in C and given the crABI project to define a portable ABI, it seems most reasonable to say that repr(C) should match whatever C does on the given target.

Is it really possible to make progress here unless some form of crABI is stable? Without that what are the options? Breaking changes without an alternative? We obviously can't have one ABI that's both portable and target specific.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 4, 2024

Without that what are the options? Breaking changes without an alternative?

A start might be to at least lint about these cases (and then cratering to see how common they are).
But yeah, if we want to change behavior for MSVC targets here there wouldn't be a way to get back the old behavior.

You know the MSVC ecosystem a lot better than me -- how much do these problems affect real code today? What do programmers targeting MSVC do to deal with this? How likely is it that any code running on MSVC actually wants the current behavior, i.e., would any of the code being "broken" actually consider this breakage?

@ChrisDenton
Copy link
Member

I mean, to put it in perspective the linked issues are edge cases. MSVC does emit warnings for zero sized arrays and overflowing enums (which are treated similarly to overflowing ints):

  • C4200 nonstandard extension used: zero-sized array in struct/union
  • C4309 'conversion' : truncation of constant value
  • C4369 'enumerator' : enumerator value 'value' cannot be represented as 'type', value is 'new_value'

However, a warning isn't emitted for empty structs and also SIMD types like __m128, when used in packed structs, override the packing and force the struct to be 16-byte aligned.

Tbh, I don't think I've seen empty structs being used in C interfaces (though I could be forgetting something) and using SIMD types in packed structs is very rare. I think the biggest issue is when C code using zero sized arrays is ported from GCC to MSVC and happens to work despite the differences (it is at least internally consistent). Then a Rust program tries to interop and those differences suddenly matter.

All these issues can be worked around if you know about them. For example, the empty struct or zero sized array cases can use a [T; 1] array to emulate the correct layout.


Personally I think changing the ZST case in Rust is the most likely to be breaking (either arrays or structs). People using repr(C) just to have a defined Rust layout are not going to like ZST differences between platforms. The other issues I don't feel particularly concerned about.

@ChrisDenton
Copy link
Member

However, a warning isn't emitted for empty structs

Oh right, an empty struct is an error in MSVC C (C2016) but is allowed in MSVC C++. Tbh I do tend to compile code as C++ more often than just C, even if the code itself is C.

@steveklabnik
Copy link
Member

What do programmers targeting MSVC do to deal with this? How likely is it that any code running on MSVC actually wants the current behavior, i.e., would any of the code being "broken" actually consider this breakage?

I ran with the MSVC target as my primary target from like, 2015ish until a few months ago. In the ecosystem, in practice, I never ran into problems with this stuff, or at least, not to my knowledge. Of course, this mostly was for FFI stuff, and I am only one person, who wasn't doing a ton of C <-> Rust interop.

That doesn't mean that I think change shouldn't happen here, for whatever that's worth. In fact, I guess what I'm saying is, this stuff is edge-case enough that you could probably "break" it without anyone noticing.

@chorman0773
Copy link
Contributor

Oh right, an empty struct is an error in MSVC C (C2016) but is allowed in MSVC C++. Tbh I do tend to compile code as C++ more often than just C, even if the code itself is C.

It's notably valid in C++ in general (hense being allowed without warning), though it has a padding byte.

@ChrisDenton
Copy link
Member

To sum up my thoughts:

  • I'm in favour of breaking changes for C FFI to match the system C but
  • Breaking everyone using #[repr(C)] with a ZST that's not intended to be part of the ABI (e.g. PhantomData) is not a good option, unless perhaps there is an alternative (even then it may lead to a lot of churn)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants