Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite recursion in __extendhfsf2 and __truncsfhf2 on "no-f16-f128" platforms #651

Open
liushuyu opened this issue Aug 2, 2024 · 6 comments
Labels
llvm This issue needs to be fixed in LLVM

Comments

@liushuyu
Copy link

liushuyu commented Aug 2, 2024

I have encountered an infinite recursion when doing f16 arithmetics on "no-f16-f128" platforms due to questionable LLVM optimizations.

My initial investigation shows this might be due to LLVM being too clever and optimizing the soft-float absolute value conversion to a hard-float one:

define weak hidden noundef float @__extendhfsf2(half noundef %a) unnamed_addr #9 {
start:                                                                                                                                                                                       
%0 = tail call half @llvm.fabs.f16(half %a)                                                                                                                                                  
%_0.i7.i.i = bitcast half %0 to i16
%_0.i8.i.i = add nsw i16 %_0.i7.i.i, -1024
%_0.i5.i.i = icmp ult i16 %_0.i8.i.i, 30720
br i1 %_0.i5.i.i, label %bb8.i.i, label %bb14.i.i

; <snip...>
}

When lowering to the machine code, this causes the intrinsic function to call either itself or __truncsfhf2, and then __truncsfhf2 will call __extendhfsf2 again, forming an infinite recursion.

My current idea is to:

diff --git a/src/float/extend.rs b/src/float/extend.rs
index 5560489..10a0d61 100644
--- a/src/float/extend.rs
+++ b/src/float/extend.rs
@@ -32,7 +32,7 @@ where
 
     let sign_bits_delta = dst_sign_bits - src_sign_bits;
     let exp_bias_delta = dst_exp_bias - src_exp_bias;
-    let a_abs = a.repr() & src_abs_mask;
+    let a_abs = core::hint::black_box(a.repr()) & src_abs_mask;
     let mut abs_result = R::Int::ZERO;
 
     if a_abs.wrapping_sub(src_min_normal) < src_infinity.wrapping_sub(src_min_normal) {

... which will try to prevent LLVM from merging the absolute value masking into the @llvm.fabs.f16 LLVM intrinsic.
However, this idea introduces two extra memory operations (storing f16 and reading i16).

I did not open a pull request because I hope someone could come up with a much better idea.

@beetrees
Copy link
Contributor

beetrees commented Aug 2, 2024

Which target(s) have you encountered this on?

@beetrees
Copy link
Contributor

beetrees commented Aug 2, 2024

More specifically, I suspect this might be being caused by llvm/llvm-project#97981 (on several architectures LLVM currently passes f16s as f32s).

@liushuyu
Copy link
Author

liushuyu commented Aug 2, 2024

Which target(s) have you encountered this on?

I tried this on powerpc64le-unknown-linux-gnu

@tgross35
Copy link
Contributor

tgross35 commented Aug 2, 2024

I have encountered an infinite recursion when doing f16 arithmetics on "no-f16-f128" platforms due to questionable LLVM optimizations.

Did you discover this by trying to build compiler_builtins, or just by compiling a sample program in Rust? I don't think the latter is currently possible on ppc but just want to check.

@liushuyu
Copy link
Author

liushuyu commented Aug 2, 2024

I have encountered an infinite recursion when doing f16 arithmetics on "no-f16-f128" platforms due to questionable LLVM optimizations.

Did you discover this by trying to build compiler_builtins, or just by compiling a sample program in Rust? I don't think the latter is currently possible on ppc but just want to check.

I tried to build a sample program using the nightly Rust but with an updated version of compiler_builtins and LLVM 19

@tgross35
Copy link
Contributor

tgross35 commented Aug 2, 2024

Hm, I thought these symbols weren't available from Rust's compiler_builtins because of here https://github.com/rust-lang/rust/blob/53676730146e38e4697b6204c2ee61a9fd6b7e51/library/alloc/Cargo.toml#L15-L16

@tgross35 tgross35 added the llvm This issue needs to be fixed in LLVM label Aug 17, 2024
tgross35 added a commit to tgross35/compiler-builtins that referenced this issue Nov 5, 2024
CI in [1] seems to indicate that there are cases where the `f16`
infinite recursion bug ([2], [3]) can make its way into what gets called
during tests, even though this doesn't seem to be the usual case. In
order to make sure that we avoid these completely, just unset
`f16_enabled` on any platforms that have the recursion problem.

This also refactors the `match` statement to be more in line with
`library/std/build.rs`.

[1]: rust-lang#729
[2]: llvm/llvm-project#97981
[3]: rust-lang#651
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm This issue needs to be fixed in LLVM
Projects
None yet
Development

No branches or pull requests

3 participants