Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for abi-cafe failures #1525

Open
4 of 11 tasks
Tracked by #1543
bjorn3 opened this issue Aug 5, 2024 · 4 comments
Open
4 of 11 tasks
Tracked by #1543

Tracking issue for abi-cafe failures #1525

bjorn3 opened this issue Aug 5, 2024 · 4 comments
Labels
A-abi Area: ABI handling C-bug Category: This is a bug.

Comments

@bjorn3
Copy link
Member

bjorn3 commented Aug 5, 2024

https://github.com/rust-lang/rustc_codegen_cranelift/blob/master/patches/0002-abi-cafe-Disable-broken-tests.patch

@bjorn3 bjorn3 added C-bug Category: This is a bug. A-abi Area: ABI handling labels Aug 5, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 3, 2024
Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 10, 2024
Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 14, 2024
Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 17, 2024
…kic,jieyouxu

Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 19, 2024
…kic,jieyouxu

Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
@bjorn3
Copy link
Member Author

bjorn3 commented Oct 19, 2024

rust-lang/rust#131211 which will be in next nightly fixed the OptionU128 issues on at least x86_64-unknown-linux-gnu and aarch64-unknown-linux-gnu and likely on macOS and Windows too.

github-actions bot pushed a commit to rust-lang/miri that referenced this issue Oct 20, 2024
Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Oct 22, 2024
Return values larger than 2 registers using a return area pointer

LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.

While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.

In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.

This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.

Helps with rust-lang/rustc_codegen_cranelift#1525
cc bytecodealliance/wasmtime#9250
@bjorn3
Copy link
Member Author

bjorn3 commented Oct 24, 2024

On aarch64-unknown-linux-gnu

extern "C" {
    fn write_val(val: f32);
}

#[repr(C)]
#[derive(Copy, Clone)]
struct F32Array {
    field0: [f32; 4],
}

#[no_mangle]
pub unsafe extern "C" fn val_in_3(_: F32Array, _: F32Array, arg2: F32Array) {
    unsafe {
        write_val(arg2.field0[0]);
        write_val(arg2.field0[1]);
        write_val(arg2.field0[2]);
        write_val(arg2.field0[3]);
    }
}

produces the following LLVM ir:

define void @val_in_3([4 x float] %0, [4 x float] %1, [4 x float] %2) unnamed_addr #0 !dbg !6 {
  %.fca.0.extract9 = extractvalue [4 x float] %2, 0
  %.fca.1.extract10 = extractvalue [4 x float] %2, 1
  %.fca.2.extract11 = extractvalue [4 x float] %2, 2
  %.fca.3.extract12 = extractvalue [4 x float] %2, 3
  tail call void @write_val(float noundef %.fca.0.extract9) #1, !dbg !11
  tail call void @write_val(float noundef %.fca.1.extract10) #1, !dbg !12
  tail call void @write_val(float noundef %.fca.2.extract11) #1, !dbg !13
  tail call void @write_val(float noundef %.fca.3.extract12) #1, !dbg !14
  ret void, !dbg !15
}

declare void @write_val(float noundef) unnamed_addr #0

and the following clif ir:

function u0:0(f32, f32, f32, f32, f32, f32, f32, f32, f32, f32, f32, f32) system_v {
; kind  loc.idx   param    pass mode                            ty
; zst   _0    ()                                0b 1, 8              align=8,offset=
; ret   _0      -          Ignore                               ()
; arg   _1      = v0,v1,v2,v3 Cast { pad_i32: false, cast: CastTarget { prefix: [None, None, None, None, None, None, None, None], rest: Uniform { unit: Reg { kind: Float, size: Size(4 bytes) }, total: Size(16 bytes), is_consecutive: true }, attrs: ArgAttributes { regular: , arg_ext: None, pointee_size: Size(0 bytes), pointee_align: None } } } F32Array
; arg   _2      = v4,v5,v6,v7 Cast { pad_i32: false, cast: CastTarget { prefix: [None, None, None, None, None, None, None, None], rest: Uniform { unit: Reg { kind: Float, size: Size(4 bytes) }, total: Size(16 bytes), is_consecutive: true }, attrs: ArgAttributes { regular: , arg_ext: None, pointee_size: Size(0 bytes), pointee_align: None } } } F32Array
; arg   _3      = v8,v9,v10,v11 Cast { pad_i32: false, cast: CastTarget { prefix: [None, None, None, None, None, None, None, None], rest: Uniform { unit: Reg { kind: Float, size: Size(4 bytes) }, total: Size(16 bytes), is_consecutive: true }, attrs: ArgAttributes { regular: , arg_ext: None, pointee_size: Size(0 bytes), pointee_align: None } } } F32Array

; ........................
}

I'm pretty sure that the issue is that the [f32; 4] gets splatted into individual arguments which get a register or stack location assigned independently, while the ABI requires them to be assigned to either consecutive registers or to the stack as a whole.

Edit: Opened bytecodealliance/wasmtime#9509

@bjorn3
Copy link
Member Author

bjorn3 commented Oct 24, 2024

Opened bytecodealliance/wasmtime#9510 to suggest removing the broken implicit return area pointer support of Cranelift to avoid others hitting the same issue.

@bjorn3
Copy link
Member Author

bjorn3 commented Jan 9, 2025

rust-lang/compiler-builtins#735 + rust-lang/rust#134338 will make the abi of __rust_u/i128_mulo less of a special case with respect to other 128bit int intrinsics on Windows. And rust-lang/rust#134290 will make 128bit ints get returned in xmm0 once it lands. This may be switched back in the future if clang changes too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-abi Area: ABI handling C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

1 participant