Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue: 32bit x86 targets without SSE2 have unsound floating point behavior #114479

Open
RalfJung opened this issue Aug 4, 2023 · 35 comments
Labels
A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-miscompile Issue: Correct Rust code lowers to incorrect machine code I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-x86_32 Target: x86 processors, 32 bit (like i686-*) P-medium Medium priority

Comments

@RalfJung
Copy link
Member

RalfJung commented Aug 4, 2023

On x86 (32bit) targets that cannot use SSE2 instructions (this includes the tier 1 i686 targets with flags that disable SSE2 support, such as -C target-cpu=pentium), floating-point operation can return results that are rounded in different ways than they should, and results can be "inconsistent": depending on whether const-propagation happened, the same computation can produce different results, leading to a program that seemingly contradicts itself. This is caused by using x87 instructions to perform floating-point arithmetic, which do not accurately implement IEEE floating-point semantics (not with the right precision, anyway). The test tests/ui/numbers-arithmetic/issue-105626.rs has an example of such a problem.

Worse, LLVM can use x87 register to store values it thinks are floats, which resets the signaling bit and thus alters the value -- leading to miscompilations.

This is an LLVM bug: rustc is generating LLVM IR with the intended semantics, but LLVM does not compile that code in the way that the LLVM LangRef describes. This is a known and long-standing problem, and very hard to fix. The affected targets are so niche these days that that is nobody's priority. The purpose of this issue mostly is to document its existence and to give it a URL that can be referenced.

Some ideas that have been floated for fixing this problem:

  • We could emit different instruction sequences for floating-point operations that emulate the expected rounding behavior using x87 instructions. This will likely require changes deep in LLVM's x86 backend. This is what Java does.
  • We could use softfloats.
  • We could set the FPU control register to 64bit precision for Rust programs, and require other code to set the register in that way before calling into a Rust library. this does not work

Related issues:

Prior issues:

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 4, 2023
@RalfJung RalfJung added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC O-x86 A-floating-point Area: Floating point numbers and arithmetic and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Aug 4, 2023
@DemiMarie
Copy link
Contributor

Can we just drop support for x86 without SSE2 or fall back to software floating point?

@RalfJung
Copy link
Member Author

RalfJung commented Aug 4, 2023

Can we just drop support for x86 without SSE2

We currently have the following targets in that category:

  • i586-pc-windows-msvc
  • i586-unknown-linux-gnu
  • i586-unknown-linux-musl

They are all tier 2. I assume people added them for a reason so I doubt they will be happy about having them dropped. Not sure to what extent floating point support is needed on those targets, but I don't think we have a concept of "target without FP support".

or fall back to software floating point?

In theory of course we can, not sure if that would be any easier than following the Java approach.

@programmerjake
Copy link
Member

programmerjake commented Aug 4, 2023

or fall back to software floating point?

In theory of course we can, not sure if that would be any easier than following the Java approach.

it should be much easier since LLVM already supports that, unlike the Java scheme (afaik)

https://gcc.godbolt.org/z/14WdnKhhP

@RalfJung
Copy link
Member Author

FWIW f32 on x86-32-noSSE should actually be fine, since double-rounding is okay as long as the precision gap between the two modes is big enough. Only f64 has a problem since it is "too close" to the 80bit-precision of the x87 FPU.

On another note, we even have code in the standard library that temporarily alters the x87 FPU control word to ensure exact 64bit precision...

@RalfJung RalfJung changed the title Tracking issue: 32bit x86 targets have non-compliant floating point behavior Tracking issue: 32bit x86 targets without SSE have non-compliant floating point behavior Sep 5, 2023
@RalfJung RalfJung changed the title Tracking issue: 32bit x86 targets without SSE have non-compliant floating point behavior Tracking issue: 32bit x86 targets without SSE2 have non-compliant floating point behavior Sep 5, 2023
@RalfJung
Copy link
Member Author

RalfJung commented Sep 5, 2023

I wonder if there's something that could be done about the fact that this also affects tier 1 targets with custom (stable) flags such as -C target-cpu=pentium. Do we still want to consider that a tier 1 target in itself? Is there a way that we can just reject flags that would disable SSE2 support, and tell people to use a different target instead?

@DemiMarie
Copy link
Contributor

@RalfJung What about forcing non-SSE2 targets to software floating point? That must be supported anyway because of kernel code.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 5, 2023

Yeah that's an option listed above, since you already proposed it before. I have no idea how feasible it is. People are very concerned about the softfloat support for f16/f128 leading to code bloat and whatnot, so the same concerns would likely also apply here.

AFAIK the kernel code just doesn't use floats, I don't think they have softfloats?

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 3, 2023
…bilee

add notes about non-compliant FP behavior on 32bit x86 targets

Based on ton of prior discussion (see all the issues linked from rust-lang/unsafe-code-guidelines#237), the consensus seems to be that these targets are simply cursed and we cannot implement the desired semantics for them. I hope I properly understood what exactly the extent of the curse is here, let's make sure people with more in-depth FP knowledge take a close look!

In particular for the tier 3 targets I have no clue which target is affected by which particular variant of the x86_32 FP curse. I assumed that `i686` meant SSE is used so the "floating point return value" is the only problem, while everything lower (`i586`, `i386`) meant x87 is used.

I opened rust-lang#114479 to concisely describe and track the issue.

Cc `@workingjubilee` `@thomcc` `@chorman0773`  `@rust-lang/opsem`
Fixes rust-lang#73288
Fixes rust-lang#72327
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Oct 3, 2023
Rollup merge of rust-lang#113053 - RalfJung:x86_32-float, r=workingjubilee

add notes about non-compliant FP behavior on 32bit x86 targets

Based on ton of prior discussion (see all the issues linked from rust-lang/unsafe-code-guidelines#237), the consensus seems to be that these targets are simply cursed and we cannot implement the desired semantics for them. I hope I properly understood what exactly the extent of the curse is here, let's make sure people with more in-depth FP knowledge take a close look!

In particular for the tier 3 targets I have no clue which target is affected by which particular variant of the x86_32 FP curse. I assumed that `i686` meant SSE is used so the "floating point return value" is the only problem, while everything lower (`i586`, `i386`) meant x87 is used.

I opened rust-lang#114479 to concisely describe and track the issue.

Cc `@workingjubilee` `@thomcc` `@chorman0773`  `@rust-lang/opsem`
Fixes rust-lang#73288
Fixes rust-lang#72327
@Noratrieb Noratrieb added O-x86_32 Target: x86 processors, 32 bit (like i686-*) and removed O-x86-all labels Oct 25, 2023
@keith-packard
Copy link

Does Rust expose the difference between representation and evaluation method? In C, FLT_EVAL_METHOD` is used by applications to detect platforms on which the computations are performed at a higher precision than the underlying value representations.

This doesn't completely solve the problem as 8087 ABI also does representation conversion across function call boundaries so passing (or returning) a sNaN will have that transformed into qNaN and an exception.

It's a stretch to claim that the 8087 FPU has 'unsound' floating point behavior; it's all quite compliant with the IEEE 754 specification, if you squint hard enough. Changing the ABI to pass float/double without a representation change would resolve the worst of the problems I've found, and that's not a hardware issue.

@beetrees
Copy link
Contributor

beetrees commented Jul 4, 2024

In Rust, all operations are individually evaluated at the precision of the type being used (which I believe is the equivalent of FLT_EVAL_METHOD 0) using the default round to nearest, ties to even rounding mode (except for methods which explicitly document a different rounding mode like .round()). The basic +/-/*// operators are guaranteed to give correctly rounded results, and methods which have an unspecified precision are documented as such.

The unsoundness is not just theoretical; the LLVM IR Rust compiles f32 and f64 operations to has the desired Rust semantics, whereas the LLVM non-SSE x86 backend compiles that IR to machine code that violates the semantics of the IR. This means e.g. LLVM optimisations that (correctly) operate presuming the LLVM IR semantics with regards to evaluation precision can cause the LLVM non-SSE x86 backend introduce out-of-bounds reads/writes in safe code (see this earlier comment for a code example). The NaN quietening issue also violates the semantics of LLVM IR and can cause the emitted binary to mutate the value of non-float types (see this earlier comment for the code sample, details in this comment).

Because this is a miscompilation at the LLVM IR -> machine code stage, as opposed to the Rust -> LLVM IR stage, miscompilations can occur in other programming languages that use LLVM as a codegen backend. For example, llvm/llvm-project#89885 contains an example of a miscompilation from C. Ultimately what matters are the semantics of the LLVM IR; not everything that is permitted by the IEEE 754 specification is permitted by LLVM IR (and vice versa).

The return value ABI issue is tracked separately in #115567, and affects all 32-bit x86 targets, not just those with SSE/SSE2 disabled. It is possible to manually load/store a f32/f64 signalling NaN to/from an x87 register without quietening it (see e.g. the code in #115567 (comment)), but currently neither LLVM nor Rust do so. The "Rust" ABI (which doesn't have any stability guarantees) is being changed to avoid x87 registers completely in #123351.

@keith-packard
Copy link

Yeah, it sounds like rust cannot support other values for FLT_EVAL_METHOD, and so doesn't support these kinds of corner cases in the IEEE 754 spec. I suspect Rust and the x87 FPU are just not going to get along. Even if you kludge around the sNaN->qNaN adventures, you're still going to suffer from double rounding as every computation is first rounded to 80 bits and then to 32 or 64.

Having Rust produce results that depend on the underlying hardware seems antithetical to some pretty fundamental language goals. But, that would mean abandoning the x87 FPU entirely, and that doesn't seem feasible even though sse2, which provides the necessary native 32/64 binary format support is nearly old enough to serve in the US house of representatives.

I'd encourage someone to figure out how to tell applications that the underlying hardware doesn't work the way they expect. Mirroring some version of FLT_EVAL_METHOD from C would at least be easy and sufficient for this hardware.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 4, 2024

It's a stretch to claim that the 8087 FPU has 'unsound' floating point behavior;

That's not the claim (so if it sounds like that is our claim we should fix that). The claim is that the behavior of LLVM's optimizer and backend in combination with that of the x87 FPU is unsound.

Having hardware-dependent behavior in an operation that is specified to be portable and deterministic would also still be unsound, but could conceivably be fixed by changing the spec. But an internally inconsistent backend is wrong for every reasonable spec.

(Whether we'd really want such a target-specific spec is a different question. Rust code is pretty portable by default. Do we really want to require all Rust code to have to deal with non-standard precision float arithmetic? Or should code have some explicit opt-in / opt-out for niche targets with non-standard behavior? For this thread the question is rather moot as reliably implementing consistent non-standard behavior would at best be a lot of work [probably involving using LLVM's strictfp operations everywhere, but I am not sure if even that would be enough] with little pay-off. That work is IMO better spent implementing the standard semantics on x87. A very similar question comes up with CHERI hardware; in that case, the intended approach for now is to experiment with a tier 3 target. In the future we might need crates to be able to indicate whether they support such targets.)

I'd encourage someone to figure out how to tell applications that the underlying hardware doesn't work the way they expect. Mirroring some version of FLT_EVAL_METHOD from C would at least be easy and sufficient for this hardware.

As @beetrees explained, it's not just that the underlying hardware works differently and that bleeds into language semantics. It's that Rust's primary backend, LLVM, assumes the underlying hardware to work the standard way -- the examples @beetrees referenced demonstrate that there is no reliable way to program against this hardware in any compiler that uses LLVM as its backend. (At least not if the compiler uses the standard LLVM float types and operations.)

To my knowledge, nobody on the LLVM side really cares about this. So until that changes it is unlikely that we'll be able to improve things in Rust here. Telling programmers "on this hardware your program may randomly explode for no fault of your own" is not very useful. (I mean, it is a useful errata of course, but it doesn't make sense to make this the spec.)

Even if you kludge around the sNaN->qNaN adventures, you're still going to suffer from double rounding as every computation is first rounded to 80 bits and then to 32 or 64.

For f32, double-rounding does not affect results. For f64, it's possible to get around that using the approach Java used, IIUC. It does cost some performance but it avoids the semantic inconsistencies.

We could also say that Rust code expects the FPU to be set to 64bit precision on such targets. That seems a lot easier...

@DemiMarie
Copy link
Contributor

There are legitimate use-cases for running with the FPU in flush-to-zero + denormals-are-zero mode, even on mainstream targets like x86-64. Processing denormals outside of FTZ+DAZ mode requires a microcode assist that causes ~100 cycles IIRC. In this application, denormal numbers correspond to sounds that are (IIUC) below the noise floor, so flushing them to zero is harmless. Failing to meet deadlines because of the microcode assist is a bug.

That’s separate but related to this issue.

@Muon
Copy link

Muon commented Jul 5, 2024

For f32, double-rounding does not affect results.

This is only true for some operations, and then only if a single step is performed. It is not true for addition/subtraction, and it is not true even for other operations if multiple operations are performed at 80-bit precision.

There are legitimate use-cases for running with the FPU in flush-to-zero + denormals-are-zero mode, even on mainstream targets like x86-64.

Yes, but LLVM's support for anything except standard IEEE floating point arithmetic is broken or absent, so Rust can't hope to support this until LLVM does.

@beetrees
Copy link
Contributor

beetrees commented Jul 5, 2024

It is not true for addition/subtraction

For f32, a single a + b always gives the same result as (a as f80 + b as f80) as f32 per this paper.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 5, 2024

There are legitimate use-cases for running with the FPU in flush-to-zero + denormals-are-zero mode, even on mainstream targets like x86-64.

Flushing denormals isn't even permitted by the IEEE standard. So while I am not doubting that that is something people may want to do, supporting such non-standards-compliant hardware requires inline assembly or new language features (and, depending on the constraints, work on the LLVM side). This is related to supporting floating point exception flags and non-default rounding modes, which is part of the standard but not supported in Rust. Please take that to another thread.

and it is not true even for other operations if multiple operations are performed at 80-bit precision.

Indeed the rounding down to f32 has to happen after each operation.

@apiraino
Copy link
Contributor

apiraino commented Sep 3, 2024

WG-prioritization assigning priority (see Zulip discussion for the related issue #129880).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Sep 3, 2024
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Sep 6, 2024
…ilee

enable const-float-classify test, and test_next_up/down on 32bit x86

The  test_next_up/down tests have been disabled on all 32bit x86 targets, which goes too far -- they should definitely work on our (tier 1) i686 target, it is only without SSE that we might run into trouble due to rust-lang#114479. However, I cannot reproduce that trouble any more -- maybe that got fixed by rust-lang#123351?

The  const-float-classify test relied on const traits "because we can", and got disabled when const traits got removed. That's an unfortunate reduction in test coverage of our float functionality, so let's restore the test in a way that does not rely on const traits.

The const-float tests are actually testing runtime behavior as well, and I don't think that runtime behavior is covered anywhere else. Probably they shouldn't be called "const-float", but we don't have a `tests/ui/float` folder... should I create one and move them there? Are there any other ui tests that should be moved there?

I also removed some FIXME referring to not use x87 for Rust-to-Rust-calls -- that has happened in rust-lang#123351 so this got fixed indeed. Does that mean we can simplify all that float code again? I am not sure how to test it. Is running the test suite with an i586 target enough?

Cc `@tgross35` `@workingjubilee`
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Sep 11, 2024
…ilee

enable const-float-classify test, and test_next_up/down on 32bit x86

The  test_next_up/down tests have been disabled on all 32bit x86 targets, which goes too far -- they should definitely work on our (tier 1) i686 target, it is only without SSE that we might run into trouble due to rust-lang#114479. However, I cannot reproduce that trouble any more -- maybe that got fixed by rust-lang#123351?

The  const-float-classify test relied on const traits "because we can", and got disabled when const traits got removed. That's an unfortunate reduction in test coverage of our float functionality, so let's restore the test in a way that does not rely on const traits.

The const-float tests are actually testing runtime behavior as well, and I don't think that runtime behavior is covered anywhere else. Probably they shouldn't be called "const-float", but we don't have a `tests/ui/float` folder... should I create one and move them there? Are there any other ui tests that should be moved there?

I also removed some FIXME referring to not use x87 for Rust-to-Rust-calls -- that has happened in rust-lang#123351 so this got fixed indeed. Does that mean we can simplify all that float code again? I am not sure how to test it. Is running the test suite with an i586 target enough?

Cc `@tgross35` `@workingjubilee`
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Sep 11, 2024
…ilee

enable const-float-classify test, and test_next_up/down on 32bit x86

The  test_next_up/down tests have been disabled on all 32bit x86 targets, which goes too far -- they should definitely work on our (tier 1) i686 target, it is only without SSE that we might run into trouble due to rust-lang#114479. However, I cannot reproduce that trouble any more -- maybe that got fixed by rust-lang#123351?

The  const-float-classify test relied on const traits "because we can", and got disabled when const traits got removed. That's an unfortunate reduction in test coverage of our float functionality, so let's restore the test in a way that does not rely on const traits.

The const-float tests are actually testing runtime behavior as well, and I don't think that runtime behavior is covered anywhere else. Probably they shouldn't be called "const-float", but we don't have a `tests/ui/float` folder... should I create one and move them there? Are there any other ui tests that should be moved there?

I also removed some FIXME referring to not use x87 for Rust-to-Rust-calls -- that has happened in rust-lang#123351 so this got fixed indeed. Does that mean we can simplify all that float code again? I am not sure how to test it. Is running the test suite with an i586 target enough?

Cc ``@tgross35`` ``@workingjubilee``
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Sep 11, 2024
…ilee

enable const-float-classify test, and test_next_up/down on 32bit x86

The  test_next_up/down tests have been disabled on all 32bit x86 targets, which goes too far -- they should definitely work on our (tier 1) i686 target, it is only without SSE that we might run into trouble due to rust-lang#114479. However, I cannot reproduce that trouble any more -- maybe that got fixed by rust-lang#123351?

The  const-float-classify test relied on const traits "because we can", and got disabled when const traits got removed. That's an unfortunate reduction in test coverage of our float functionality, so let's restore the test in a way that does not rely on const traits.

The const-float tests are actually testing runtime behavior as well, and I don't think that runtime behavior is covered anywhere else. Probably they shouldn't be called "const-float", but we don't have a `tests/ui/float` folder... should I create one and move them there? Are there any other ui tests that should be moved there?

I also removed some FIXME referring to not use x87 for Rust-to-Rust-calls -- that has happened in rust-lang#123351 so this got fixed indeed. Does that mean we can simplify all that float code again? I am not sure how to test it. Is running the test suite with an i586 target enough?

Cc ```@tgross35``` ```@workingjubilee```
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Sep 12, 2024
Rollup merge of rust-lang#129835 - RalfJung:float-tests, r=workingjubilee

enable const-float-classify test, and test_next_up/down on 32bit x86

The  test_next_up/down tests have been disabled on all 32bit x86 targets, which goes too far -- they should definitely work on our (tier 1) i686 target, it is only without SSE that we might run into trouble due to rust-lang#114479. However, I cannot reproduce that trouble any more -- maybe that got fixed by rust-lang#123351?

The  const-float-classify test relied on const traits "because we can", and got disabled when const traits got removed. That's an unfortunate reduction in test coverage of our float functionality, so let's restore the test in a way that does not rely on const traits.

The const-float tests are actually testing runtime behavior as well, and I don't think that runtime behavior is covered anywhere else. Probably they shouldn't be called "const-float", but we don't have a `tests/ui/float` folder... should I create one and move them there? Are there any other ui tests that should be moved there?

I also removed some FIXME referring to not use x87 for Rust-to-Rust-calls -- that has happened in rust-lang#123351 so this got fixed indeed. Does that mean we can simplify all that float code again? I am not sure how to test it. Is running the test suite with an i586 target enough?

Cc ```@tgross35``` ```@workingjubilee```
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Sep 12, 2024
enable const-float-classify test, and test_next_up/down on 32bit x86

The  test_next_up/down tests have been disabled on all 32bit x86 targets, which goes too far -- they should definitely work on our (tier 1) i686 target, it is only without SSE that we might run into trouble due to rust-lang/rust#114479. However, I cannot reproduce that trouble any more -- maybe that got fixed by rust-lang/rust#123351?

The  const-float-classify test relied on const traits "because we can", and got disabled when const traits got removed. That's an unfortunate reduction in test coverage of our float functionality, so let's restore the test in a way that does not rely on const traits.

The const-float tests are actually testing runtime behavior as well, and I don't think that runtime behavior is covered anywhere else. Probably they shouldn't be called "const-float", but we don't have a `tests/ui/float` folder... should I create one and move them there? Are there any other ui tests that should be moved there?

I also removed some FIXME referring to not use x87 for Rust-to-Rust-calls -- that has happened in #123351 so this got fixed indeed. Does that mean we can simplify all that float code again? I am not sure how to test it. Is running the test suite with an i586 target enough?

Cc ```@tgross35``` ```@workingjubilee```
@workingjubilee workingjubilee added the I-miscompile Issue: Correct Rust code lowers to incorrect machine code label Oct 2, 2024
freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this issue Nov 9, 2024
Upstream Rust always requires SSE2 for x86.  But back in 2017[^1][^2] we
patched lang/rust to disable SSE2 for i386.  At the time, it was
reported that some people were still using non-SSE2 capable hardware.
More recently, LLVM bugs have been discovered[^3][^4] that can result in
rounding bugs and reduced accuracy when using f64 on non-SSE hardware.
In weird cases, they can even cause wilder unpredictable behavior, like
segfaults.

Revert our change for the sake of Pentium 4 (and later) users.  But add
an SSE2 option.  Disabling it will allow the port to be used on Pentium
3 and older CPUs.

[^1]: d65b288
[^2]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223415
[^3]: rust-lang/rust#114479
[^4]: llvm/llvm-project#44218

Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D47227
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-miscompile Issue: Correct Rust code lowers to incorrect machine code I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-x86_32 Target: x86 processors, 32 bit (like i686-*) P-medium Medium priority
Projects
None yet
Development

No branches or pull requests