Additional float types #3451
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
Unlike f32 and f64, although there are platform independent implementation of supplementary intrinsics on these types, not every target support the two types natively, with regards to the ABI. Adding them will be a challenge for handling different cases. |
Not every platform supports `f32` and `f64` natively either. For example, RISC-V without the F or D extensions (e.g. an ISA string of `rv64i`). This should be mentioned.
Whatever emulation Rust already does to support `f32` and `f64` on systems without native support should similarly happen to emulate `f128` on systems without native quadruple-precision support.
For RISC-V without hardware float support there is a defined soft-float ABI. There is none for `f16`/`bf16`. Same for x86_64. Many other architectures likely don't have a defined soft-float ABI for `f128` either. And as I understand it, AArch64 doesn't have a soft-float ABI at all, as Neon support is mandatory and floats are even allowed inside the kernel, unlike e.g. x86_64, where floats are disabled in the kernel to avoid having to save them on syscalls.
I'm okay with this for the most part, except that I disagree that A good example of this is Additionally, while I don't feel too strongly about it, I don't like the idea of naming the x86 "long double" type |
|
I added some more detail in various places, and added `core::ffi::c_longdouble` (which I'm fairly certain we would want).
text/3451-additional-float-types.md
Outdated
# Summary | ||
[summary]: #summary | ||
|
||
This RFC proposes new floating point types `f16` and `f128` into core language and standard library. Also this RFC introduces `f80`, `doubledouble`, `bf16` into `core::arch` for inter-op with existing native code. |
This RFC proposes new floating point types `f16` and `f128` into core language and standard library. Also this RFC introduces `f80`, `doubledouble`, `bf16` into `core::arch` for inter-op with existing native code. | |
This RFC proposes new floating point types `f16` and `f128` into core language and standard | |
library. Also, this RFC introduces `f80`, `doubledouble`, and `bf16` into `core::arch` for | |
target-specific support, and `core::ffi::c_longdouble` for FFI interop. |
I suggest using these symbols, which all start with an `f` prefix for consistency:
- `f128`: C `_Float128`, LLVM `fp128`, GCC `__float128`
- `f16`: C `_Float16`, LLVM `half`, GCC `__fp16`
- `f16b`: C++ `std::bfloat16_t`, LLVM `bfloat`, GCC `__bf16`
- `f80e`: LLVM `x86_fp80`, GCC `__float80`
- `f64f64`: LLVM `ppc_fp128`, GCC `__ibm128`
Notes:
- `e` means extension, not standard IEEE.
- `f64f64` is shorter than `doubledouble`, or maybe `f64f64e` to indicate that it is not standard.
- @joshtriplett already proposed `f16b` a long time ago: https://github.com/joshtriplett/rfcs/blob/f16b/text/0000-f16b.md
These symbols can also be used as literal suffixes as is.
+1 for `f16b` in favor over `bf16` for consistency, I liked that about @joshtriplett's original proposal.
I don't think we should introduce something like `f128x` - `doubledouble` or something like the GCC or LLVM types would be better IMO. Reason being, it's kind of ambiguous and specific to one architecture - PowerPC is even moving away from it. Better to give it an unambiguous name since it will be used relatively rarely.
I think we should use `bf16` rather than `f16b`, since that is widely recognized whereas `f16b` isn't, and `f64_f64` instead of `f128x`, since it really is two `f64` values and could easily be emulated on any other architecture (do not use `f64x2`, since that's already used by `Simd<f64, 2>`). Also, `f<N>x` names are more or less defined by IEEE 754 to be wider than `N` bits, so e.g. `f64x` would be approximately any type wider than `f64` but narrower than `f128`, such as `f80`, and `f16x` could be the `f24` type used by some GPUs for depth buffers. So logically `f80x` would need to be more than 80 bits and `f128x` would need to be more than 128 bits.
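To illustrate the claim that a pair of `f64` values can be emulated on any architecture, here is a minimal sketch of double-double addition built on the classic error-free "two-sum" step. The names `DoubleDouble`, `two_sum`, `from_f64`, and `add` are hypothetical and not part of the RFC, and this simplified addition is not claimed to match the exact rounding behavior of PowerPC's `ppc_fp128`.

```rust
/// Hypothetical pair-of-f64 value: the represented number is `hi + lo`,
/// with `lo` holding the rounding error that did not fit into `hi`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct DoubleDouble {
    pub hi: f64,
    pub lo: f64,
}

/// Error-free transformation: returns `(s, err)` such that
/// `s` is the rounded sum `a + b` and `s + err` equals `a + b` exactly.
fn two_sum(a: f64, b: f64) -> (f64, f64) {
    let s = a + b;
    let bb = s - a;
    let err = (a - (s - bb)) + (b - bb);
    (s, err)
}

impl DoubleDouble {
    /// Widen a plain `f64`; the low part starts out as zero.
    pub fn from_f64(x: f64) -> Self {
        Self { hi: x, lo: 0.0 }
    }

    /// Simplified ("sloppy") double-double addition:
    /// sum the high parts exactly, fold in the low parts, then renormalize.
    pub fn add(self, other: Self) -> Self {
        let (s, e) = two_sum(self.hi, other.hi);
        let e = e + self.lo + other.lo;
        // Renormalize so that `hi` carries the leading bits again.
        let hi = s + e;
        let lo = e - (hi - s);
        Self { hi, lo }
    }
}
```

For example, `DoubleDouble::from_f64(1.0).add(DoubleDouble::from_f64(1e-30))` keeps the small term in `lo`, whereas the plain `f64` sum `1.0 + 1e-30` rounds back to `1.0`.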
> +1 for `f16b` in favor over `bf16` for consistency, I liked that about @joshtriplett's original proposal.
> I don't think we should introduce something like `f128x` - `doubledouble` or something like the GCC or LLVM types would be better IMO. Reason being, it's kind of ambiguous and specific to one architecture - PowerPC is even moving away from it. Better to give it an unambiguous name since it will be used relatively rarely.

`f64f64` is OK; there is no need for the underscore, and it looks like `doubledouble`. Use `f80e` instead of `f80x` if `f80x` is not acceptable.
I still vote for `f16b`. It's Rust-specific, and we can document the relationship between `bf16` and `f16b` in Rust; that won't be a burden.
What about `f64x2` to indicate it's two `f64` glued together?
do not use f64x2 since that's already used by Simd<f64, 2>
text/3451-additional-float-types.md
Outdated
|
||
All operators, constants and math functions defined for `f32` and `f64` in core, are also defined for `f16` and `f128`, and guarded by respective conditional guards. | ||
|
||
`f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation. |
`f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation. | |
The `f80` type is defined in `core::arch::{x86, x86_64}` as 80-bit extended precision. The `doubledouble` | |
type is defined in `core::arch::{powerpc, powerpc64}` and represents PowerPC's non-IEEE double-double | |
format (two `f64`s used to approximate `f128`). The `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}` and represents the "brain" float, a truncated `f32` with SIMD support on some hardware. These | |
types do not have literal representation. | |
When working with FFI, the `core::ffi::c_longdouble` type can be used to match whatever type | |
`long double` represents in C. |
Thanks, I did not add the mention of `longdouble` yet. More things need to be clarified:
- Is there always only one `long double` for each `(arch, abi, os)` tuple? For example, `powerpc64le-unknown-linux-gnu` can use either double, doubledouble, or IEEE binary128 as `long double` via `-mabi=(ieee|ibm)longdouble` and `-mlong-double-(64|128)`.
- Is the mangling of `long double` the same regardless of its underlying semantics?
- Some targets (also `powerpc64le`, for example) support `.gnu_attribute`, so that the linker can differentiate objects compiled with different long double ABIs. Should Rust programs using `c_longdouble` emit such an attribute?
text/3451-additional-float-types.md
Outdated
|
||
`f128` is available for on targets having (1) hardware instructions or software emulation for 128-bit float type; (2) backend support for `f128` type on the target; (3) essential target features enabled (if any). | ||
|
||
The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`. |
The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`. | |
The list of targets supporting `f128` type may change over time. Initially, it includes `powerpc64le-*`. | |
`x86_64-*` and `aarch64-*` |
I don't know for sure which targets support it, but we should aim to at least support the major 64-bit CPUs here at first.
There is also a RISC-V target per @aaronfranke here #2629 (comment), but I'm not sure how `rv64gQc` maps to our `riscv64gc`.
I agree with the comments about A good comment from the issue thread that summed this up: #2629 (comment) |
text/3451-additional-float-types.md
Outdated
|
||
All operators, constants and math functions defined for `f32` and `f64` in core, are also defined for `f16` and `f128`, and guarded by respective conditional guards. | ||
|
||
`f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation. |
`bf16` is supported on a wide range of newer architectures, such as PowerPC, x86, Arm, and (WIP) RISC-V. IMHO it should not be classified as architecture-specific but instead treated more like `f16`.
Yep, `bf16` should be emulated when the target architecture does not support it natively.
Note that supporting 80-bit floats (at all) might complicate the situation being discussed in rust-lang/rust#113053, which suggests that we might be able to manipulate the control word (to set a 53 bit mantissa) in order to get more consistent behavior for f64. |
Would it make sense to consider a subset? Especially the 16-bit floats are a growing field compared to the legacy types. |
I started trying to get together a better collection of what platforms & features support the different types to serve as a reference when implementing and help steer choices for this RFC (we can pull the contents in at some point). The doc is editable (if logged in), please add info & links if you know of any: https://hackmd.io/@8W5l8q6-Qyyn_vKh2-L0RQ/rJpZsmcdh/edit Also, zulip thread for this RFC: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/f16.20.26.20f128.20RFC |
note that PowerISA's BFP128 extension has a specific instruction to aid emulation of |
+1 — GPU programming may involve storing and copying |
I would not say that it complicates anything. It's just another setting of the control word. In any case, such concerns are to be addressed in LLVM, not rustc. If my suggested changes are implemented and accepted, rustc doesn't need to change anything. Also I wholeheartedly support the 80 bit support. It makes life a lot nicer working with the x87. Though I agree that the name should make clear that it's not an IEEE format. I also agree that f16 and f128 should be emulated if necessary. Consistent universal support is invaluable. |
@the8472 If it helped with acceptance, I think splitting |
Agreed, though I definitely think it should stay in By contrast, I think |
I agree that we should basically consider |
text/3451-additional-float-types.md
Outdated
|
||
All operators, constants and math functions defined for `f32` and `f64` in core, are also defined for `f16` and `f128`, and guarded by respective conditional guards. | ||
|
||
`f80` type is defined in `core::arch::{x86, x86_64}`. `doubledouble` type is defined in `core::arch::{powerpc, powerpc64}`. `bf16` type is defined in `core::arch::{arm, aarch64, x86, x86_64}`. They do not have literal representation. |
For people that have never heard of `bf16` or `doubledouble` (which I assume are 16 and 128 bits in size, respectively), it would be good to link to some sort of document explaining them, and how they differ from `f16` and `f128`, respectively.
Also, the RFC needs to say what their semantics are, if IEEE doesn't specify them.
Nobody seems to agree on `bf16` semantics:
- Arm has both round as normal with subnormals supported and round to odd with subnormals not supported.
- x86 has round to nearest with subnormals not supported.
- PowerPC has round as normal with subnormals supported.
- All ISAs have round towards zero with subnormals supported (just `f32::to_bits(v) >> 16`).
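The round-towards-zero conversion mentioned above can be sketched as follows. The function names are hypothetical, the `bf16` value is carried as raw `u16` bits because no `bf16` type exists in stable Rust, and this shows only the truncating variant, not any of the rounding modes the individual ISAs implement.

```rust
/// Convert an `f32` to bf16 bits by truncation (round towards zero).
/// bf16 keeps the sign bit, all 8 exponent bits, and the top 7 mantissa
/// bits of the `f32`, so dropping the low 16 bits is enough.
///
/// Caveat: a NaN whose payload lives only in the low 16 mantissa bits
/// would turn into an infinity; a careful implementation must special-case it.
pub fn f32_to_bf16_bits_truncate(v: f32) -> u16 {
    (v.to_bits() >> 16) as u16
}

/// Widen bf16 bits back to `f32`. This direction is always exact,
/// because every bf16 value is also representable as an `f32`.
pub fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}
```

Under this sketch, `1.0_f32` maps to the bits `0x3F80`, and `3.14159_f32` round-trips to `3.140625`, which shows how much precision the 7-bit mantissa retains.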
Since a lot of people are sort of dancing around it, here's my take on a precise proposal for how to modify the RFC:
For For It should be relatively easy to add in other non-standard float types as experimental in the meantime. My proposal is that these should work very similarly to how I present My suspicion for something like EDIT: Since it's worth adding this after the fact, I wanted to explain a little bit more on why I think these dialed-back semantics are important. Right now, the biggest issue with floating-point semantics is how values are affected by arithmetic operations, like whether NaN payloads are preserved or how rounding is done. However, the one thing we do know about these formats is how you interpret a single value in memory, and that's why we can easily convert to and from these formats to IEEE 754, either exactly or by truncating. This should let us still create a standard type that is ABI-compatible with C (± any nonstandard extensions for particularly weird formats) without actually taking a firm stance on the semantics quite yet. The other big question about semantics is how the exact values in memory are preserved between operations, particularly the NaN payloads, although that has largely been decided with the stabilisation of |
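As a sketch of the point that interpreting a single value in memory is well defined even where arithmetic semantics are not, the following decodes IEEE 754 binary16 bits into an `f32`. The function name is hypothetical, and NaN payloads are deliberately not preserved here, which is one of the open questions the comment raises.

```rust
/// Decode IEEE 754 binary16 bits (1 sign, 5 exponent, 10 fraction bits)
/// into an `f32`. The conversion is exact for every non-NaN input.
pub fn f16_bits_to_f32(h: u16) -> f32 {
    let negative = (h >> 15) & 1 == 1;
    let exp = ((h >> 10) & 0x1f) as u32;
    let frac = (h & 0x3ff) as u32;

    let magnitude = match exp {
        // Zero and subnormals: value = frac * 2^-24 (0x33800000 is 2^-24 as f32).
        0 => frac as f32 * f32::from_bits(103u32 << 23),
        // All-ones exponent: infinity or NaN (payload is dropped in this sketch).
        0x1f => {
            if frac == 0 {
                f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: rebias the exponent (15 -> 127) and widen the fraction.
        _ => f32::from_bits(((exp + 112) << 23) | (frac << 13)),
    };

    if negative {
        -magnitude
    } else {
        magnitude
    }
}
```

For instance, the bits `0x3C00` decode to `1.0` and `0x7BFF` decodes to `65504.0`, the largest finite binary16 value.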
Is there real demand for `f16`? Yes, they are just 2 bytes long, but `f16` values are not very precise. |
@Larsouille25 Another example, quantized mesh data, for some use cases of meshes you really don't care if a vertex is a fraction of a millimeter off, and meshes have a lot of vertices, so using |
Also adding onto what @aaronfranke said, in general, values between 0 and 1 are quite precise for f16s, since they have a 10-bit significand. These values not only are useful for colour, but loads of different things only need to store a float in this range and using up half the memory is quite a benefit. |
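To put a number on the precision claim, here is a small sketch that computes the gap between adjacent binary16 values around a given magnitude, assuming the IEEE 754 binary16 layout (10 fraction bits, minimum normal exponent of -14). The helper names `pow2` and `f16_spacing` are hypothetical.

```rust
/// Exact power of two as an `f32`, valid for normal exponents only.
fn pow2(k: i32) -> f32 {
    debug_assert!((-126..=127).contains(&k));
    f32::from_bits(((k + 127) as u32) << 23)
}

/// Gap between neighbouring binary16 values near `x`,
/// or `None` if `x` is NaN or beyond the finite binary16 range.
pub fn f16_spacing(x: f32) -> Option<f32> {
    let ax = x.abs();
    if ax.is_nan() || ax > 65504.0 {
        return None;
    }
    // Below 2^-14 binary16 is subnormal, so the exponent is pinned at -14.
    let e = if ax < pow2(-14) {
        -14
    } else {
        ((ax.to_bits() >> 23) & 0xff) as i32 - 127
    };
    // With 10 fraction bits, one unit in the last place is 2^(e - 10).
    Some(pow2(e - 10))
}
```

Just below `1.0` the spacing is `2^-11` (about 0.00049), which is finer than the `1/255` step of an 8-bit colour channel; at `1024.0` the spacing has grown to `1.0`.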
I opened #3453 which is mostly a subset of this RFC, a lot of text was copied from here. It only includes |
Prior art should also mention https://crates.io/crates/half, which is at ~2million downloads/month, showing there is a lot of demand for |
@Nemo157 There is already a link to https://github.com/starkat99/half-rs but a crate link may be better. |
Ah, under alternatives. I only checked the prior art section (and I think highlighting the wide usage in the RFC would be useful). |
With this in mind, it would probably be good to shave |
@tgross35 Your link is broken, I updated a new one https://hackmd.io/8Ptau058RmGqV0ZYJ49S9A |
Thank you, formatting fixed: https://hackmd.io/@8W5l8q6-Qyyn_vKh2-L0RQ/rJpZsmcdh |
I'm still doubting if it's feasible to support it on all platforms. From the view of backend (LLVM, more especially), (let F be either
|
don't worry, C++ already have both |
@ecnelises This is a good question, but I disagree about this point:
Yes, we can do something here. There is no reason we can't support our own emulation. Any calculation can be performed on a Turing-complete computer. The specific details may be a challenge to implementers in some cases, but it is not impossible to support these types everywhere, therefore they should be supported everywhere. |
- Rename doubledouble to f64f64
- Add architecture info about x86 and arm (as suggested)
- Add description to target related types (as suggested)
- Add link to IEEE-754 and LLVM LangRef
If we do plan to implement the types everywhere, the 4 becomes 5. Even no
I think they're optional? https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1467r4.html |
Overall, I like the idea of separating out the real IEEE types from the random stuff, and making
I don't think that's important for a niche type, though. It doesn't need a name any nicer than For that matter, what's the goal of having a type for x87's weird things? It's mentioned in neither the Motivation nor the Rationale sections. How much new Rust code is going to be written targeting those types specifically instead of normal
Why not? It seems odd to me that Unless "these types" is supposed to be after a newline so it refers to all the arch-specific ones, maybe? |
One possibility I could see, is that if Rust some day decides to accept the performance hit and make |
@ecnelises would you be able to update this RFC to be about We can work on both RFCs in parallel but readers need to be clear about what the relationship between the two is - it's currently ambiguous. |
This revision also contains comments addressed from reviewers in RFC rust-lang#3451.
Thanks for your and other's comments. I generally agree that a proposal for IEEE-754 compliant I'm not familiar with Rust community's RFC policy. So I'm in doubt whether kicking off new post directly from another people without previous commit/author information shows enough respect to the original author. |
There is no policy on this, but consulting before splitting off your PR would have been, well, nice. It would probably be good for the two RFC authors to come to consensus on Zulip about what the relationship among these RFCs should be, and how to harmonize them a bit (or even figure out coauthoring) to help give the community a better picture of what exactly is being proposed. There is an existing floats rfc topic that can be used for discussion https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/f16.20.26.20f128.20RFC |