Tracking issue for refactoring the way we represent function call ABIs #119183
Looks like fixing #117480 may be blocked on this. |
I've started work on cleaning up ABI code. The first few PRs are just going to be moving stuff around so that we can actually co-locate as much of the ABI code as possible:
I'll try to implement the MCP proper as I go along. |
Awesome. :-) If you want feedback on some sketches of what the ABI might look like, feel free to post them here. |
Rollup merge of rust-lang#132385 - workingjubilee:move-abi-to-rustc-abi, r=jieyouxu,compiler-errors

compiler: Move `rustc_target::spec::abi::Abi` to `rustc_abi::ExternAbi`

Lift `enum Abi` from its rather odd place in the middle of rustc_target, and make it available again from rustc_abi. You know, the crate where you would expect the enum that describes all the ABIs to be? The platform-neutral ones, at least. This will help further refactoring of how we handle ABIs in the near future[^0].

Rename `Abi` to `ExternAbi` because quite a lot of the compiler overloads the concept of "ABI" enough that the existing name is imprecise and it is often renamed _anyway_. Often this was to avoid conflicts with the *other* type formerly known as `Abi` (now named BackendRepr[^1]), but sometimes it is just for clarity, and this name seems more self-explanatory. It does get reexported, though, using its old name, to reduce the odds of merge-conflicting over the entire tree.

All of `ExternAbi`'s friends come along for the ride, which costs adding some optional dependencies to the rustc_abi crate. However, all of this also allows simply moving three crates entirely off rustc_target:

- rustc_hir_pretty
- rustc_lint_defs
- rustc_mir_build

This odd selection is mostly to demonstrate a secondary motivation: The majority of the front-end of the compiler should be as target-agnostic as possible, and it is easier to assure this if they simply don't depend on the crate that describes targets. Note that I didn't migrate crates that don't benefit from it in this way yet, and I didn't survey every last crate.

[^0]: This is being undertaken as part of rust-lang#119183
[^1]: rust-lang#132246
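The "move, rename, and re-export under the old name" step from that commit message can be sketched roughly as below. This is only an illustration: the module names `new_home` and `old_home`, the variant list, and both helper functions are hypothetical stand-ins, not the real layout of rustc_abi or rustc_target.

```rust
// Hypothetical stand-in for the crate that now owns the enum, under its
// new and more precise name.
mod new_home {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum ExternAbi {
        Rust,
        C,
        System,
    }
}

// Hypothetical stand-in for the old location. It only re-exports the type
// under its former name, so existing paths keep compiling.
mod old_home {
    pub use super::new_home::ExternAbi as Abi;
}

// A caller that still uses the old path and the old name.
fn old_style_default() -> old_home::Abi {
    old_home::Abi::Rust
}

// A caller that has already migrated to the new path and name.
fn new_style_default() -> new_home::ExternAbi {
    new_home::ExternAbi::Rust
}
```

Because the re-export is an alias rather than a new type, both functions return the same type and their results compare equal, which is what lets a tree-wide rename happen gradually.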
Looks like there is also some interest on the LLVM side to improve their ABI handling. Would be nice if we could benefit from that -- though we also have other backends, so maybe there's no way we can avoid having our own implementation of the C ABI... OTOH, our current |
cg_clif makes full use of it. In fact, PassMode is at pretty much exactly the right level of abstraction for what Cranelift needs: it doesn't support high-level types and requires you to decompose everything into primitive values; for struct arguments it requires a pointer argument with ArgumentPurpose::StructArg, and for complex return types it requires you to pass a return area pointer. PassMode makes all of that pretty easy to do. The only problems I have with it are that LLVM silently accepts things that IMO shouldn't be accepted like |
Yeah, IMO our main ask from LLVM should be "more hard errors instead of silently making up some weird shit, please", with probably more IR parameter attributes to enable "please make up some weird shit on purpose". |
Unfortunately LLVM is moving in the opposite direction, see e.g. the discussion in <llvm/llvm-project#111334>
|
When these two land I will have more or less passed the "mere cleanup" phase:
There will always be more improvements (I have another set of diffs already, actually...) but now I can actually turn to fixing the real problems. Some scattered thoughts:

There are higher-level and lower-level ways to represent the ABI: devolving to register and stack passing, for instance, versus high-level abstractions like "pass this like it would be passed via the C ABI".

The problem with using the C ABI as referent is that the C ABI often has arbitrary limitations. For instance, multi-register returns are functionally inconceivable in many C ABIs, but are a normal idea for Rust. And because some C ABIs do implement them, LLVM often does not have an actual problem with doing them; they're just a bit weird to express using its C-like syntax.

Going down to individual registers for all arguments might still be too low-level? Yet I definitely do not believe we should be reifying aggregates more in our ABI handling: thinking primarily in terms of them is almost inherently problematic. And we do think about registers a lot in our current handling.

If we did think about this in the lower-level form, we would need to account for how the translation of a set of arguments to an ABI handling would be inherently ordered: at some point you run out of registers and start putting things on the stack. I don't think that's completely unacceptable, but it does point to rethinking what we're doing fairly extensively.

We already have, or at least should have, a vague idea of what belongs in these two sets:
That's kind of what all the linting about target features points to, anyways. |
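To make the "inherently ordered" point above concrete, here is a toy sketch of a register-level assignment. Everything in it is an assumption made up for illustration (the `Location` type, the `assign` function, the 8-byte word size, the rule for spilling); it does not describe any real target's calling convention or anything rustc does today.

```rust
/// Where a hypothetical low-level ABI would place one argument.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Location {
    /// Occupies `count` consecutive registers starting at index `first`.
    Registers { first: usize, count: usize },
    /// Spilled to the stack at the given byte offset.
    Stack { offset: usize },
}

/// Assigns locations to arguments in declaration order.
/// `sizes` holds each argument's size in register-sized words and
/// `num_regs` is the made-up number of argument registers.
/// An argument that does not fit in the remaining registers goes to the
/// stack; in this toy model a later, smaller argument may still claim
/// leftover registers.
fn assign(sizes: &[usize], num_regs: usize) -> Vec<Location> {
    const WORD_BYTES: usize = 8; // assumed word size
    let mut next_reg = 0;
    let mut next_offset = 0;
    sizes
        .iter()
        .map(|&words| {
            if next_reg + words <= num_regs {
                let loc = Location::Registers { first: next_reg, count: words };
                next_reg += words;
                loc
            } else {
                let loc = Location::Stack { offset: next_offset };
                next_offset += words * WORD_BYTES;
                loc
            }
        })
        .collect()
}
```

With four registers, `assign(&[2, 2, 1], 4)` puts the final one-word argument on the stack, while in `assign(&[1, 2, 2], 4)` it is the final two-word argument that spills. The placement of each argument depends on everything before it, which is the ordering problem described above.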
Some constraints we know about:
|
Doesn't that already sometimes happen, like soft float vs hard float? |
Nah, that doesn't happen because people compile code for targets and targets are not allowed to vary whether they are hard float or soft float. 😌 |
I think this is only somewhat true - there must be a "function pointer" ABI, but that ABI may want to differ from the ABI we choose to use for regular (non function pointer) calls, by (for example) inserting a shim when casting a function to a pointer. Effectively, we could model this as every function having a generic of whether it ever gets erased or is always called with knowledge of the specific function. That might be useful for example to leverage PGO or other information to optimize Result into passing the more common variant for a particular function through registers vs. not. That's commonly achieved through inlining, but I think it would be nice to avoid excluding it from being done without inlining too. |
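A source-level sketch of the shim idea, under loud assumptions: Rust does not let us pick calling conventions like this by hand, so the "specialised convention" is only modelled here as an out-parameter for the rarer error case, and all three function names are hypothetical.

```rust
// Direct-call form. Imagine the compiler picked a specialised convention for
// callers that know exactly which function they call; it is modelled here,
// purely for illustration, as an out-parameter for the rarer error case.
fn half_direct(x: u32, err_out: &mut Option<String>) -> u32 {
    if x % 2 == 0 {
        x / 2
    } else {
        *err_out = Some(format!("{} is odd", x));
        0
    }
}

// Shim with the uniform signature that every erased caller expects.
fn half_shim(x: u32) -> Result<u32, String> {
    let mut err = None;
    let value = half_direct(x, &mut err);
    match err {
        Some(e) => Err(e),
        None => Ok(value),
    }
}

// "Casting to a function pointer" hands out the shim, not the direct form.
fn half_as_pointer() -> fn(u32) -> Result<u32, String> {
    half_shim
}
```

Callers that only hold the pointer see one fixed signature, while direct callers are free to use whatever form was chosen for that particular function.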
I am aware of that but it seems functionally identical to generating a new

A related concern that has been pointed out to me in the past is that sometimes, even with inlining, the codegen backend can have problems erasing the prologue and epilogue. Ideally, we would find a way to make "handling the ABI of this call" semantically separable from the actual motion of "making this call", so that we can delay addressing the need for a prologue and epilogue as long as possible. Notionally the function call ABI, after all, should be a non-event, as otherwise these functions aren't very functional. |
We currently talk a lot about registers in our ABI handling, but that's kind of a lie. All it means is "represent this to LLVM as a scalar / array of scalars" and then LLVM decides whether to put that into registers or on the stack depending on the target's conventions. (Well, there's a flag for forcing things to be in a register -- mixed up with other flags that are irrelevant for the ABI. I don't know the exact semantics of that, it's probably target-specific.)

I don't think we should go any more low-level than this on the Rust side. I'd rather not have us be in the business of counting how many arguments go in registers vs on the stack. That's a lot of work, very hard to test, and AFAIK none of our codegen backends actually give us that level of control anyway.

The main goal of the original MCP was to make writing our target-specific foreign ABI adjustments easier and less fragile. For that code, the C ABI is exactly the right abstraction, as that's the level the ABI is specified at: we need to tell that code what the given type "looks like as a C type", and then the code needs to compute a corresponding
That kind of work needs to start in the backends though; LLVM / cranelift would require a higher-level way to express the ABI so that they don't require a bunch of instructions that must later be optimized away again (e.g. when inlining). This also points towards a rather higher-level than lower-level abstraction, i.e., not "register vs stack". |
LLVM does not implement the C ABI for us. |
Correct. We need to carry a function for each supported architecture (sometimes more than one per arch) that is given "the C view of the type" and computes the

Regarding the point about multi-register returns, I don't see why we couldn't extend |
I think you are talking about a completely different layer than what I am talking about in the MCP and the issue description. We need two "ABI representations":
Currently we effectively have a function

I don't know where you went on a different track than my line of thoughts here, but clearly we're not talking about the same thing; maybe this helps. :) |
It simply does not make any sense to me to talk about things in terms of C types if they do not have to follow the C ABI, so I think about these elements in a more decomposed way? |
I think I will simply talk less about these things because it is not very useful for me to try to record my thoughts in forms that are likely to be misunderstood. I understand that your primary hope was to make the input forms more sensible, and I do intend to address that. I am just thinking about this from the origin of the demands: "we want programs that do FFI to successfully execute... which means certain things have to go into certain registers... which means that they have to..." etc. |
We need some sort of high-level language of types (well, higher-level than
That would make sense if we were emitting the assembly ourselves, but we are not. We are targeting the language of our backends in terms of how they represent ABI. |
Maybe I once again fell in the trap of "call ABI" meaning like a dozen different things to different people. For the purposes of this discussion, by "call ABI" I mean "the target-independent information that is necessary and sufficient to compute how arguments and return values are passed between caller and callee". The perfect end state (that we may or may not ever reach) for this would be to say that two types are ABI compatible if and only if the computed call ABI information for them is the same -- that would be very nice for the spec and for MiniRust, anyway.

The process of mapping that information to the concrete things you talk about (registers vs stack etc) is obviously highly target- and ABI-specific. But the input to that process seems to me to be expressible in a nice, reasonably high-level, target-independent way. |
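The "ABI compatible if and only if the computed call ABI information is the same" end state could look roughly like this in miniature. The `Ty` and `CallAbi` types and the rule that a transparent wrapper simply inherits its field's call ABI are assumptions made up for this sketch; no design has been settled in this thread, and none of these names exist in rustc.

```rust
/// A toy type language, standing in for real Rust types.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Ty {
    Int { bits: u32 },
    Float { bits: u32 },
    /// A `repr(transparent)`-style wrapper around a single field.
    Transparent(Box<Ty>),
    Struct(Vec<Ty>),
}

/// Hypothetical target-independent call ABI information.
#[derive(Debug, Clone, PartialEq, Eq)]
enum CallAbi {
    Int { bits: u32 },
    Float { bits: u32 },
    Aggregate(Vec<CallAbi>),
}

/// Computes the call ABI of a type. Transparent wrappers vanish here, so
/// nothing downstream ever needs to know about them.
fn call_abi(ty: &Ty) -> CallAbi {
    match ty {
        Ty::Int { bits } => CallAbi::Int { bits: *bits },
        Ty::Float { bits } => CallAbi::Float { bits: *bits },
        Ty::Transparent(inner) => call_abi(inner),
        Ty::Struct(fields) => CallAbi::Aggregate(fields.iter().map(call_abi).collect()),
    }
}

/// Two types are compatible exactly when their call ABI is equal.
fn abi_compatible(a: &Ty, b: &Ty) -> bool {
    call_abi(a) == call_abi(b)
}
```

In this toy model a transparent wrapper around a 32-bit integer is compatible with the bare integer, while a 32-bit integer and a 32-bit float are not, even though they have the same size.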
In fact we can already represent 4 register returns on 64bit and 8 register returns on 32bit through
For lowering the C ABI we very much have to first map Rust types into C types as the C ABI is specified as a mapping from C types to registers and stack locations. Some of the ABI issues we have are precisely because the ABI implementation for each architecture does an implicit ad-hoc mapping from Rust to C types rather than sharing this such that we only have to get it right once. The Rust ABI follows a different code path and as such doesn't necessarily have to use this lowering to C types. It could keep directly operating on Rust types if we want. |
And we also have targets with two C-like ABIs. |
> The Rust ABI follows a different code path and as such doesn't necessarily have to use this lowering to C types. It could keep directly operating on Rust types if we want.
Then we still have to get the ABI compat right twice, not the best plan IMO.
|
> And we also have targets with two C-like ABIs.
That's just two different functions from AbiRepr to PassMode.
|
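A minimal sketch of "two different functions from AbiRepr to PassMode", assuming heavily simplified versions of both types. The real `PassMode` in rustc has more variants and carries attributes, `AbiRepr` is only a name used in this discussion, and the 16-byte threshold and both lowering rules are invented for the example.

```rust
/// Simplified, hypothetical target-independent description of a value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AbiRepr {
    Scalar { bytes: u32 },
    Aggregate { bytes: u32 },
}

/// Simplified stand-in for rustc's `PassMode`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PassMode {
    Direct,
    Indirect,
}

/// First made-up C-like ABI: small aggregates are passed directly.
fn lower_abi_one(repr: AbiRepr) -> PassMode {
    match repr {
        AbiRepr::Scalar { .. } => PassMode::Direct,
        AbiRepr::Aggregate { bytes } if bytes <= 16 => PassMode::Direct,
        AbiRepr::Aggregate { .. } => PassMode::Indirect,
    }
}

/// Second made-up C-like ABI on the same target: every aggregate is
/// passed indirectly, regardless of size.
fn lower_abi_two(repr: AbiRepr) -> PassMode {
    match repr {
        AbiRepr::Scalar { .. } => PassMode::Direct,
        AbiRepr::Aggregate { .. } => PassMode::Indirect,
    }
}
```

Both functions consume the same input description, so the mapping from Rust types to that description only has to be written, and gotten right, once.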
This is the issue tracking implementation of rust-lang/compiler-team#672. Note that we do not have a final design yet; the best way to represent call ABI, and to disentangle it from the "storage kind" of a type (which is what the `Abi` type currently largely represents) is yet to be determined.

Note that for the purposes of this, by "call ABI" I mean "the target-independent information that is necessary and sufficient to compute how arguments and return values are passed between caller and callee". The perfect end state (that we may or may not ever reach) for this would be to say that two types are ABI compatible if and only if the computed call ABI information for them is the same -- that would be very nice for the spec and for MiniRust, anyway.

I do not mean "the target-specific information saying which arguments are passed in which register / on the stack, which are copied and which are passed indirectly". This already exists to some extent as a concept in rustc, called `PassMode`. It may need some reforming, but that would be a separate discussion.

[…] `repr(transparent)` in essentially every case. Then every target architecture must be ported to it:

[…] `rustc_abi::Abi` to `BackendRepr` #132246

See in particular this comment.
Implementation history

- […] `rustc_target::abi` #131424
- `{TyAnd,}Layout` comes home #131473
- […] `rustc_abi::Abi` to `BackendRepr` #132246
- compiler: Move `rustc_target::spec::abi::Abi` to `rustc_abi::ExternAbi` #132385