-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN #78618
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
c12c4b6
to
d229327
Compare
ea20310
to
9844bdb
Compare
We briefly discussed this in the recent Libs meeting and felt that It turns out the question of always printing Since this turned out to be a bit of a bigger topic than we really had synchronous time for, I think we should try revisit it in more detail with a picture of what a roadmap towards compliance would look like for float |
Triage: So i created #79490 to tracking the controversal part, and i'll mark this as blocked on the decision of that issue. @workingjubilee I think you can go ahead and split out PRs for the non-controversal part of this ( |
To reply to the stability concerns:
My apologies, I know this PR has been open for a while without much remark, but that is because I am not interested in pursuing this line of inquiry further if this is in fact a reason to turn this PR down. Accordingly, I will not be splitting the changeset. From my perspective, either Rust has IEEE754 floats which behave in the expected way, in accord with the standard, or it does not. If it does not, then what the standard says about parsing NaN and such is equally moot. |
Well, #31407 (comment) seems to imply that non-NaN floats are intended to round-trip on using |
☔ The latest upstream changes (presumably #78259) made this pull request unmergeable. Please resolve the merge conflicts. |
We discussed this in the @rust-lang/libs meeting this week. We had consensus that changing floating-point printing to print the negative sign on -0 seems reasonable. We don't expect that to cause a problem in practice. I'm going to go ahead and start a libs FCP. Please go ahead and update the PR to apply without conflicts. @rfcbot merge |
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
One thing I don't see called out in the discussion above or any of the linked issues is the case lenient parsing in this PR, e.g. @rfcbot concern case insensitivity |
@dtolnay From IEEE 754-2008 (I don't have 2019 available), "5.12.1 External character sequences representing zeros, infinities, and NaNs":
|
☀️ Test successful - checks-actions |
This copies the standard library behavior fix of rust-lang/rust#78618
This copies the standard library behavior fix of <rust-lang/rust#78618>.
This copies some of the standard library behavior fix of <rust-lang/rust#78618>.
This copies some of the standard library fixes of <rust-lang/rust#78618>.
214: Ignore case for float parsing of text special values r=cuviper a=tspiteri This copies some of the standard library fixes of <rust-lang/rust#78618>. Co-authored-by: Trevor Spiteri <[email protected]>
In SQLite's SLT, floats are printed as integers by casting the float to an integer then printing the integer. Our SLT runner was printing floats rounding the float then printing the float with zero digits after the decimal point. This is almost the same thing, except that "0" is printed as "-0" since rust-lang/rust#78618. Future proof the code (and make it work the same way in the nightly coverage build) by following the SQLite approach. Supersedes MaterializeInc#7096.
In SQLite's SLT, floats are printed as integers by casting the float to an integer then printing the integer. Our SLT runner was printing floats rounding the float then printing the float with zero digits after the decimal point. This is almost the same thing, except that "0" is printed as "-0" since rust-lang/rust#78618. Future proof the code (and make it work the same way in the nightly coverage build) by following the SQLite approach. Supersedes MaterializeInc#7096.
Pkgsrc changes: * Bump bootstrap requirements to 1.52.1 * Adjust patches, adapt to upstream changes, adjust cargo checksums * If using an external llvm, require >= 10.0 Upsteream changes: Version 1.53.0 (2021-06-17) ============================ Language ----------------------- - [You can now use unicode for identifiers.][83799] This allows multilingual identifiers but still doesn't allow glyphs that are not considered characters such as `#` (diamond) or `<U+1F980>` (crab). More specifically you can now use any identifier that matches the UAX #31 "Unicode Identifier and Pattern Syntax" standard. This is the same standard as languages like Python, however Rust uses NFC normalization which may be different from other languages. - [You can now specify "or patterns" inside pattern matches.][79278] Previously you could only use `|` (OR) on complete patterns. E.g. ```rust let x = Some(2u8); // Before matches!(x, Some(1) | Some(2)); // Now matches!(x, Some(1 | 2)); ``` - [Added the `:pat_param` `macro_rules!` matcher.][83386] This matcher has the same semantics as the `:pat` matcher. This is to allow `:pat` to change semantics to being a pattern fragment in a future edition. Compiler ----------------------- - [Updated the minimum external LLVM version to LLVM 10.][83387] - [Added Tier 3\* support for the `wasm64-unknown-unknown` target.][80525] - [Improved debuginfo for closures and async functions on Windows MSVC.][83941] \* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries ----------------------- - [Abort messages will now forward to `android_set_abort_message` on Android platforms when available.][81469] - [`slice::IterMut<'_, T>` now implements `AsRef<[T]>`][82771] - [Arrays of any length now implement `IntoIterator`.][84147] Currently calling `.into_iter()` as a method on an array will return `impl Iterator<Item=&T>`, but this may change in a future edition to change `Item` to `T`. Calling `IntoIterator::into_iter` directly on arrays will provide `impl Iterator<Item=T>` as expected. - [`leading_zeros`, and `trailing_zeros` are now available on all `NonZero` integer types.][84082] - [`{f32, f64}::from_str` now parse and print special values (`NaN`, `-0`) according to IEEE RFC 754.][78618] - [You can now index into slices using `(Bound<usize>, Bound<usize>)`.][77704] - [Add the `BITS` associated constant to all numeric types.][82565] Stabilised APIs --------------- - [`AtomicBool::fetch_update`] - [`AtomicPtr::fetch_update`] - [`BTreeMap::retain`] - [`BTreeSet::retain`] - [`BufReader::seek_relative`] - [`DebugStruct::non_exhaustive`] - [`Duration::MAX`] - [`Duration::ZERO`] - [`Duration::is_zero`] - [`Duration::saturating_add`] - [`Duration::saturating_mul`] - [`Duration::saturating_sub`] - [`ErrorKind::Unsupported`] - [`Option::insert`] - [`Ordering::is_eq`] - [`Ordering::is_ge`] - [`Ordering::is_gt`] - [`Ordering::is_le`] - [`Ordering::is_lt`] - [`Ordering::is_ne`] - [`OsStr::is_ascii`] - [`OsStr::make_ascii_lowercase`] - [`OsStr::make_ascii_uppercase`] - [`OsStr::to_ascii_lowercase`] - [`OsStr::to_ascii_uppercase`] - [`Peekable::peek_mut`] - [`Rc::decrement_strong_count`] - [`Rc::increment_strong_count`] - [`Vec::extend_from_within`] - [`array::from_mut`] - [`array::from_ref`] - [`char::MAX`] - [`char::REPLACEMENT_CHARACTER`] - [`char::UNICODE_VERSION`] - [`char::decode_utf16`] - [`char::from_digit`] - [`char::from_u32_unchecked`] - [`char::from_u32`] - [`cmp::max_by_key`] - [`cmp::max_by`] - [`cmp::min_by_key`] - [`cmp::min_by`] - [`f32::is_subnormal`] - [`f64::is_subnormal`] Cargo ----------------------- - [Cargo now supports git repositories where the default `HEAD` branch is not "master".][cargo/9392] This also includes a switch to the version 3 `Cargo.lock` format which can handle default branches correctly. - [macOS targets now default to `unpacked` split-debuginfo.][cargo/9298] - [The `authors` field is no longer included in `Cargo.toml` for new projects.][cargo/9282] Rustdoc ----------------------- - [Added the `rustdoc::bare_urls` lint that warns when you have URLs without hyperlinks.][81764] Compatibility Notes ------------------- - [Implement token-based handling of attributes during expansion][82608] - [`Ipv4::from_str` will now reject octal format IP addresses in addition to rejecting hexadecimal IP addresses.][83652] The octal format can lead to confusion and potential security vulnerabilities and [is no longer recommended][ietf6943]. Internal Only ------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [Rework the `std::sys::windows::alloc` implementation.][83065] - [rustdoc: Don't enter an infer_ctxt in get_blanket_impls for impls that aren't blanket impls.][82864] - [rustdoc: Only look at blanket impls in `get_blanket_impls`][83681] - [Rework rustdoc const type][82873] [83386]: rust-lang/rust#83386 [82771]: rust-lang/rust#82771 [84147]: rust-lang/rust#84147 [84082]: rust-lang/rust#84082 [83799]: rust-lang/rust#83799 [83681]: rust-lang/rust#83681 [83652]: rust-lang/rust#83652 [83387]: rust-lang/rust#83387 [82873]: rust-lang/rust#82873 [82864]: rust-lang/rust#82864 [82608]: rust-lang/rust#82608 [82565]: rust-lang/rust#82565 [80525]: rust-lang/rust#80525 [79278]: rust-lang/rust#79278 [78618]: rust-lang/rust#78618 [77704]: rust-lang/rust#77704 [83941]: rust-lang/rust#83941 [83065]: rust-lang/rust#83065 [81764]: rust-lang/rust#81764 [81469]: rust-lang/rust#81469 [cargo/9298]: rust-lang/cargo#9298 [cargo/9282]: rust-lang/cargo#9282 [cargo/9392]: rust-lang/cargo#9392 [`char::MAX`]: https://doc.rust-lang.org/std/primitive.char.html#associatedconstant.MAX [`char::REPLACEMENT_CHARACTER`]: https://doc.rust-lang.org/std/primitive.char.html#associatedconstant.REPLACEMENT_CHARACTER [`char::UNICODE_VERSION`]: https://doc.rust-lang.org/std/primitive.char.html#associatedconstant.UNICODE_VERSION [`char::decode_utf16`]: https://doc.rust-lang.org/std/primitive.char.html#method.decode_utf16 [`char::from_u32`]: https://doc.rust-lang.org/std/primitive.char.html#method.from_u32 [`char::from_u32_unchecked`]: https://doc.rust-lang.org/std/primitive.char.html#method.from_u32_unchecked [`char::from_digit`]: https://doc.rust-lang.org/std/primitive.char.html#method.from_digit [`AtomicBool::fetch_update`]: https://doc.rust-lang.org/std/sync/atomic/struct.AtomicBool.html#method.fetch_update [`AtomicPtr::fetch_update`]: https://doc.rust-lang.org/std/sync/atomic/struct.AtomicPtr.html#method.fetch_update [`BTreeMap::retain`]: https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.retain [`BTreeSet::retain`]: https://doc.rust-lang.org/std/collections/struct.BTreeSet.html#method.retain [`BufReader::seek_relative`]: https://doc.rust-lang.org/std/io/struct.BufReader.html#method.seek_relative [`DebugStruct::non_exhaustive`]: https://doc.rust-lang.org/std/fmt/struct.DebugStruct.html#method.finish_non_exhaustive [`Duration::MAX`]: https://doc.rust-lang.org/std/time/struct.Duration.html#associatedconstant.MAX [`Duration::ZERO`]: https://doc.rust-lang.org/std/time/struct.Duration.html#associatedconstant.ZERO [`Duration::is_zero`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.is_zero [`Duration::saturating_add`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.saturating_add [`Duration::saturating_mul`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.saturating_mul [`Duration::saturating_sub`]: https://doc.rust-lang.org/std/time/struct.Duration.html#method.saturating_sub [`ErrorKind::Unsupported`]: https://doc.rust-lang.org/std/io/enum.ErrorKind.html#variant.Unsupported [`Option::insert`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.insert [`Ordering::is_eq`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_eq [`Ordering::is_ge`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_ge [`Ordering::is_gt`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_gt [`Ordering::is_le`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_le [`Ordering::is_lt`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_lt [`Ordering::is_ne`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.is_ne [`OsStr::eq_ignore_ascii_case`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.eq_ignore_ascii_case [`OsStr::is_ascii`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.is_ascii [`OsStr::make_ascii_lowercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.make_ascii_lowercase [`OsStr::make_ascii_uppercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.make_ascii_uppercase [`OsStr::to_ascii_lowercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.to_ascii_lowercase [`OsStr::to_ascii_uppercase`]: https://doc.rust-lang.org/std/ffi/struct.OsStr.html#method.to_ascii_uppercase [`Peekable::peek_mut`]: https://doc.rust-lang.org/std/iter/struct.Peekable.html#method.peek_mut [`Rc::decrement_strong_count`]: https://doc.rust-lang.org/std/rc/struct.Rc.html#method.increment_strong_count [`Rc::increment_strong_count`]: https://doc.rust-lang.org/std/rc/struct.Rc.html#method.increment_strong_count [`Vec::extend_from_within`]: https://doc.rust-lang.org/beta/std/vec/struct.Vec.html#method.extend_from_within [`array::from_mut`]: https://doc.rust-lang.org/beta/std/array/fn.from_mut.html [`array::from_ref`]: https://doc.rust-lang.org/beta/std/array/fn.from_ref.html [`cmp::max_by_key`]: https://doc.rust-lang.org/beta/std/cmp/fn.max_by_key.html [`cmp::max_by`]: https://doc.rust-lang.org/beta/std/cmp/fn.max_by.html [`cmp::min_by_key`]: https://doc.rust-lang.org/beta/std/cmp/fn.min_by_key.html [`cmp::min_by`]: https://doc.rust-lang.org/beta/std/cmp/fn.min_by.html [`f32::is_subnormal`]: https://doc.rust-lang.org/std/primitive.f64.html#method.is_subnormal [`f64::is_subnormal`]: https://doc.rust-lang.org/std/primitive.f64.html#method.is_subnormal [ietf6943]: https://datatracker.ietf.org/doc/html/rfc6943#section-3.1.1
In SQLite's SLT, floats are printed as integers by casting the float to an integer then printing the integer. Our SLT runner was printing floats rounding the float then printing the float with zero digits after the decimal point. This is almost the same thing, except that "0" is printed as "-0" since rust-lang/rust#78618. Future proof the code (and make it work the same way in the nightly coverage build) by following the SQLite approach. Supersedes MaterializeInc#7096.
In SQLite's SLT, floats are printed as integers by casting the float to an integer then printing the integer. Our SLT runner was printing floats rounding the float then printing the float with zero digits after the decimal point. This is almost the same thing, except that "0" is printed as "-0" since rust-lang/rust#78618. Future proof the code (and make it work the same way in the nightly coverage build) by following the SQLite approach. Supersedes MaterializeInc#7096.
Is there any fmt option to remove |
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in rust-lang/rfcs#1074, and closes #24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output
-0
but output-0.0
here.While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the
+
fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.