Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The sparc64, riscv64, loongarch64 extern "C" fn ABIs are all wrong when aligned/packed structs are involved #115609

Open
RalfJung opened this issue Sep 6, 2023 · 24 comments
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-FFI Area: Foreign function interface (FFI) I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-loongarch Target: LoongArch (LA32R, LA32S, LA64) O-riscv Target: RISC-V architecture O-SPARC Target: SPARC processors P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Sep 6, 2023

All of these ABIs have in common that they are looking at layout.fields to compute the ABI for the function, and they are all getting it wrong, in various different ways. (I previously reported bugs against all of these because repr(transparent) is not properly respected; those bugs have the same cause -- relying on layout.fields -- but they are orthogonal.)

For instance, this type

#[repr(packed)]
struct P(i8, f32);

on riscv64 gets a PassMode::Cast of

                           prefix: [
                               Some(
                                   Reg {
                                       kind: Integer,
                                       size: Size(1 bytes),
                                   },
                               ),
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                           ],
                           rest: Uniform {
                               unit: Reg {
                                   kind: Float,
                                   size: Size(4 bytes),
                               },
                               total: Size(4 bytes),
                           },

which gets translated to the LLVM type { i8, f32 }, and that's a non-packed LLVM struct. On a call, P gets transmuted to that LLVM type, which obviously produces garbage since the fields are at different offsets.

On sparc64, the type translates into { i32, f32 } which is wrong as well.

As another example, this type

#[repr(align(8))]
struct Aligned8Int(i32);

#[repr(C)]
struct S2(Aligned8Int, f32);

on loongarch64 gets a PassMode::Cast of

                           prefix: [
                               Some(
                                   Reg {
                                       kind: Integer,
                                       size: Size(4 bytes),
                                   },
                               ),
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                           ],
                           rest: Uniform {
                               unit: Reg {
                                   kind: Float,
                                   size: Size(4 bytes),
                               },
                               total: Size(4 bytes),
                           },


which is the LLVM type { i32, f32 }, and that's obviously not the right type for S2.

(Caveat: I can't actually run code on these targets, so there's a small chance that I am completely misinterpreting what these PassMode mean. I hope that's not the case...)

I don't know what the C ABI on those platforms has to say about packed and aligned structs. Currently, we don't even provide ABIs a way to generate a packed LLVM struct for the arguments -- PassMode::Cast always uses an un-packed LLVM struct.

@RalfJung RalfJung added O-SPARC Target: SPARC processors O-riscv Target: RISC-V architecture A-ABI Area: Concerning the application binary interface (ABI) O-loongarch Target: LoongArch (LA32R, LA32S, LA64) labels Sep 6, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 6, 2023
@bjorn3
Copy link
Member

bjorn3 commented Sep 6, 2023

Looks like clang will split

struct P {
    int8_t a;
    float b;
} __packed__;

into two arguments on rv64gc with i8 and float respectively as types. Also I think clang will tell LLVM that the struct is packed if necessary on various targets. I also found this function: https://github.com/llvm/llvm-project/blob/d1487670ee1caaf1762c341f9f96cfb07540b5c1/clang/include/clang/CodeGen/CGFunctionInfo.h#L247-L250

@RalfJung
Copy link
Member Author

RalfJung commented Sep 6, 2023

into two arguments

So that's <{ i8, f32 }>? Or really two separate arguments to the function -- like Rust's ScalarPair? Or does that not even make a difference for the ABI?

@bjorn3
Copy link
Member

bjorn3 commented Sep 6, 2023

In this case it seemed like two separate args were emitted. AFAIK LLVM will internally recursively split structs into separate arguments anyway though if byval is not used. At least that is what it does for Webassembly.

@heiher
Copy link
Contributor

heiher commented Sep 7, 2023

The #[repr(packed)] attribute explicitly defines and constrains the memory layout for types, while the extern "C" for functions requires the function's calling convention to be the same as in the C language. On some architectures with only 32-bit and 64-bit register widths, separating the passing of i8 and f32 can improve access performance. So, as long as parameter passing matches the C calling convention of the target architecture, it is correct. Isn't it?

@bjorn3
Copy link
Member

bjorn3 commented Sep 7, 2023

The issue is with how we turn the value as represented in memory into an argument that can be passed in registers. Currently we assume that the value as represented in memory and the llvm type we use for argument passing have the same layout, which is not the case for aligned/packed structs as the llvm type has a different amount of padding from the real type.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 7, 2023

The #[repr(packed)] attribute explicitly defines and constrains the memory layout for types, while the extern "C" for functions requires the function's calling convention to be the same as in the C language.

The issue is: when I declare a packed type in C, and a packed type in Rust, then they should be ABI-compatible. But currently they are not. If you take the type P from my example, declare an equivalent type in C, have a C function that takes a struct P as argument, and call it from Rust, then things will explode in the most amazing ways.

@heiher
Copy link
Contributor

heiher commented Sep 7, 2023

Thanks for your explanation. ❤️ I tried the test cases you mentioned, and indeed there is an issue when accessing packed structure fields through memory addresses. If I didn't make any mistakes, is x86_64 also wrong?

t.rs:

#[repr(packed, C)]
struct P {
    i: i8,
    f: f32,
}

#[no_mangle]
extern "C" fn test(p: P) {
    let ip = std::ptr::addr_of!(p.i);
    let fp = std::ptr::addr_of!(p.f);
    let i = unsafe { std::ptr::read_unaligned(ip) };
    let f = unsafe { std::ptr::read_unaligned(fp) };
    println!("{:p} {:p} {} {}", ip, fp, i, f);
}

t.c:

struct P
{
    char i;
    float f;
} __attribute__((packed));

extern void test(struct P p);

int
main (int argc, char *argv[])
{
    struct P p;
    
    p.i = 16;
    p.f = 128.0;

    test(p);

    return 0;
}

run:

rustc --crate-type=cdylib -o libt.so t.rs; gcc -o t t.c -L . -l t; LD_LIBRARY_PATH=`pwd` ./t

x86_64 with #[repr(packed)] and __attribute__((packed)):

0x7ffc31f02520 0x7ffc31f02521 16 128

x86_64 without #[repr(packed)] and __attribute__((packed)):

0x7ffc5a60eaa0 0x7ffc5a60eaa4 16 128

loongarch64 with #[repr(packed)] and __attribute__((packed)):

0x7ffffbf10720 0x7ffffbf10721 16 0.000000000000000000000000000000000000023509886

loongarch64 without #[repr(packed)] and __attribute__((packed)):

0x7ffffba12f90 0x7ffffba12f94 16 128

@RalfJung
Copy link
Member Author

RalfJung commented Sep 7, 2023

Interesting. x86_64 uses indirect pass mode for packed structs, so... that can't really be an ABI difference. Not sure what is happening. Do the field offsets match on both sides?

@heiher

This comment was marked as outdated.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 8, 2023

I can't read assembly so I can't quite figure out what that gdb snippet at the end tells me.
Could you open a new issue for the x86_64 problem? It's different in nature from this one. This issue leads to data loss even when passing packed values between two extern "C" Rust functions (the ABI implemented by Rust doesn't even work for pure Rust programs); the x86_64 problem is an incompatibility with the C ABI (that only surfaces when linking Rust and C code).

@heiher
Copy link
Contributor

heiher commented Sep 12, 2023

For this type:

#[repr(packed, C)]
struct P {
    i: i8,
    f: f32,
}

#[no_mangle]
extern "C" fn test(p: P) {
    let ip = std::ptr::addr_of!(p.i);
    let fp = std::ptr::addr_of!(p.f);
    println!("{:p} {:p}", ip, fp);
}

The callee creates a temporary variable on the stack with the same type and alignment as the parameter, called var1. On LoongArch, the argument passed in is not same as rust type, it is not packed. A temporary variable of the same rust type is also created, called var2. The value of the parameter will be written directly to temporary var1. Finally, use llvm.memcpy to copy var1 to var2.

The issue is that var1 and var2 have different field layout and total size. So it seems that for packed case we need to copy field by field?

fn store(
&self,
bx: &mut Builder<'_, 'll, 'tcx>,
val: &'ll Value,
dst: PlaceRef<'tcx, &'ll Value>,
) {
if self.is_ignore() {
return;
}
if self.is_sized_indirect() {
OperandValue::Ref(val, None, self.layout.align.abi).store(bx, dst)
} else if self.is_unsized_indirect() {
bug!("unsized `ArgAbi` must be handled through `store_fn_arg`");
} else if let PassMode::Cast(cast, _) = &self.mode {
// FIXME(eddyb): Figure out when the simpler Store is safe, clang
// uses it for i16 -> {i8, i8}, but not for i24 -> {i8, i8, i8}.
let can_store_through_cast_ptr = false;
if can_store_through_cast_ptr {
bx.store(val, dst.llval, self.layout.align.abi);
} else {
// The actual return type is a struct, but the ABI
// adaptation code has cast it into some scalar type. The
// code that follows is the only reliable way I have
// found to do a transform like i64 -> {i32,i32}.
// Basically we dump the data onto the stack then memcpy it.
//
// Other approaches I tried:
// - Casting rust ret pointer to the foreign type and using Store
// is (a) unsafe if size of foreign type > size of rust type and
// (b) runs afoul of strict aliasing rules, yielding invalid
// assembly under -O (specifically, the store gets removed).
// - Truncating foreign type to correct integral type and then
// bitcasting to the struct type yields invalid cast errors.
// We instead thus allocate some scratch space...
let scratch_size = cast.size(bx);
let scratch_align = cast.align(bx);
let llscratch = bx.alloca(cast.llvm_type(bx), scratch_align);
bx.lifetime_start(llscratch, scratch_size);
// ... where we first store the value...
bx.store(val, llscratch, scratch_align);
// ... and then memcpy it to the intended destination.
bx.memcpy(
dst.llval,
self.layout.align.abi,
llscratch,
scratch_align,
bx.const_usize(self.layout.size.bytes()),
MemFlags::empty(),
);
bx.lifetime_end(llscratch, scratch_size);
}
} else {
OperandValue::Immediate(val).store(bx, dst);
}
}

@RalfJung
Copy link
Member Author

The first question is, what even is the C ABI for packed types on this target, how do they get passed? Is it indirect, or are the fields spread across registers, and if yes how?

Only after answering that question then can we try to figure out how to encode that ABI in rustc + LLVM.

@heiher
Copy link
Contributor

heiher commented Sep 13, 2023

The first question is, what even is the C ABI for packed types on this target, how do they get passed?

@RalfJung That's a good point.

For the current 'packed' instance, which consists of two fields (one integer and the other floating-point), there is a unique case to consider. In the context of the calling convention of LoongArch, it's important to note that integers are passed via GAR (General Argument Register) while floating-point values are passed via FAR (Floating-point Argument Register).

I'm not sure how other ISAs handle this. From a professional Rust perspective, what approach do you think would be more suitable to start with?

@RalfJung
Copy link
Member Author

So, this passing in registers applies even for packed structs? Sadly that document you linked doesn't mention packed structs at all.

I think this means that we want to generate an LLVM function with type <{ i8, float }> (the angle brackets indicate a packed struct in LLVM). But this would then rely on the LLVM backend logic passing such an argument in registers for this target -- that's code I know absolutely nothing about. Cc @bjorn3 @nikic

If this hypothesis is correct, then we need to extend rustc CastTy to get a packed: bool field, which needs to be used when the CastTy is turned into an LLVM type (that part is easy), and which needs to be set as appropriate by ABI adjustments (that part is hard).

However, even with that we get into trouble for more complicated types, such as

#[repr(packed)]
struct P(i8, f32);

#[repr(C)]
struct Arg(i8, i32, P);

The document seems to say that this is passed the same as #[repr(C)] (i8, i32, i8, f32) in term of registers, but of course the padding is different. I have no idea how to express that in LLVM. Do we need nested types in CastTy, like { i8, i32, <{ i8, float }> }?

This is a concern even without packed; { i8, { i8, i16 } } is not the same as { i8, i8, i16 }. I'm not sure what the plan was with rustc's CastTy here, how that is supposed to be able to map such nested cases.

@nikic
Copy link
Contributor

nikic commented Sep 14, 2023

Is the problematic case https://godbolt.org/z/jWEoncaP6? Specifically the fact that the load is used on a non-packed struct? (Packed / non-packed shouldn't matter for the call itself.)

I think we should generally avoid passing struct arguments on the LLVM IR level (only use structs for pair returns) -- I believe these will always get scalarized anyway, and just directly passing these in scalarized form is more amenable to optimization and makes the ABI more obvious.

@RalfJung
Copy link
Member Author

I think we should generally avoid passing struct arguments on the LLVM IR level (only use structs for pair returns) -- I believe these will always get scalarized anyway, and just directly passing these in scalarized form is more amenable to optimization and makes the ABI more obvious.

So, you're saying if the ABI says "pass this struct in a float register and an int register", then we should generate two LLVM IR arguments, of type float and i32 respectively?

That makes a lot of sense. However, that would be a significant departure from our current argument handling, and requires a fundamental refactor of CastTy (and really the entire argument handling story, which is all quite entangled currently -- a lot of places assume they know how many LLVM arguments each Rust argument corresponds to).

@bjorn3
Copy link
Member

bjorn3 commented Sep 14, 2023

So, you're saying if the ABI says "pass this struct in a float register and an int register", then we should generate two LLVM IR arguments, of type float and i32 respectively?

Using {float, i32} as argument type should be enough for that, right?

@RalfJung
Copy link
Member Author

RalfJung commented Sep 14, 2023

It might be, but that's not helpful for this issue since we cannot transmute from the packed type to that struct type.

I was very surprised by CastTy since it contains a bunch of registers but doesn't say how to splat the argument value across those registers. (Also see #115654.) I think we need an alternative representation that's more like a list of "take N bytes at offset O and put this into a register of type T". Once we have that (maybe it should be PassMode::Spread/PassMode::Split instead of PassMode::Cast then), we have multiple options for how to encode that to LLVM:

  • as a single, non-nested struct type
  • as separate arguments

Those might be equivalent on the ABI side? I don't know. Either way we need to emit the appropriate code to deconstruct and reconstruct the Rust value on both ends.

For returns, I think we'll have to use struct types in LLVM function signatures anyway. So we might as well do it everywhere?

@heiher
Copy link
Contributor

heiher commented Sep 14, 2023

So, this passing in registers applies even for packed structs? Sadly that document you linked doesn't mention packed structs at

Sorry for late. The rules for structures also apply to "packed".

@Noratrieb Noratrieb removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 19, 2023
@RalfJung
Copy link
Member Author

@nikic is there any documentation of how the LLVM ABI works when LLVM functions take structs as arguments (and return type)?

Looks like attributes such as zeroext can only be applied to individual parameters, so if we need to set those for fields of structs passed in registers, we'll need to use separate arguments. But then what do we do with ScalarPair return values? There's a comment in the code currently that we do want to use LLVM pairs for that, I assume that leads to more efficient return value passing (via two registers, or so?).

@Jules-Bertholet
Copy link
Contributor

@rustbot label I-unsound

@rustbot rustbot added I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Oct 17, 2023
@apiraino
Copy link
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Oct 18, 2023
@apiraino apiraino added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 18, 2023
@workingjubilee
Copy link
Member

workingjubilee commented Oct 5, 2024

cc @kito-cheng @michaelmaitland @robin-randhawa-sifive Please triage this issue. It would be nice to confirm whether this is fixed or not for RISCV.

@workingjubilee workingjubilee added the A-FFI Area: Foreign function interface (FFI) label Nov 1, 2024
@SpriteOvO
Copy link
Contributor

SpriteOvO commented Nov 18, 2024

I can confirm that the code in #115609 (comment) (with #[repr(packed)] and __attribute__((packed))) still produces wrong output on RISC-V 64 with Rust v1.82.

0x3fd239f7f0 0x3fd239f7f1 16 0.000000000000000000000000000000000000004396431

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-FFI Area: Foreign function interface (FFI) I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-loongarch Target: LoongArch (LA32R, LA32S, LA64) O-riscv Target: RISC-V architecture O-SPARC Target: SPARC processors P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants