
.mod_euc() produces invalid output for small negative floats due to rounding errors #50179

Closed
clamydo opened this issue Apr 23, 2018 · 19 comments
Labels
C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@clamydo
Contributor

clamydo commented Apr 23, 2018

The newly added method std::f64::mod_euc() can produce values equal to the modulus when applied to a small negative float and a sufficiently big modulus. AFAIK this should never happen.

I tried this code on nightly rustc 1.27.0-nightly (ac3c2288f 2018-04-18) and the version on playground:

#![feature(euclidean_division)]

fn main() {
    let f = -std::f64::EPSILON;
    let m = 3.0f64;

    assert!(f.mod_euc(m) < m);
}

https://play.rust-lang.org/?gist=01a506080183f14067553856de6763ab&version=nightly
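For readers on current Rust: mod_euc was later stabilized under the name rem_euclid, and the misbehaviour reproduces without any nightly feature (a small verification sketch, not part of the original report):

```rust
fn main() {
    let f = -f64::EPSILON;
    let m = 3.0f64;

    // A Euclidean remainder should satisfy 0.0 <= r < m, but rounding
    // pushes the result all the way up to the modulus itself.
    let r = f.rem_euclid(m);
    assert_eq!(r, m); // r == 3.0, violating r < m
}
```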

I expected to see this happen:

The operation should only produce values in the range 0.0 <= x < modulus.

Instead, this happened:

But it produced a value equal to the modulus, in this case 3.0. This happens for any modulus that is sufficiently big relative to the machine epsilon.

Discussion

It is likely a bug in the implementation ignoring some floating point arithmetic oddities. I haven't looked deeper yet.

@clamydo
Contributor Author

clamydo commented Apr 23, 2018

It is currently implemented as

pub fn mod_euc(self, rhs: f64) -> f64 {
    let r = self % rhs;
    if r < 0.0 {
        r + rhs.abs()
    } else {
        r
    }
}

https://github.com/rust-lang/rust/blob/master/src/libstd/f64.rs#L236

What happens is that, because -EPSILON is negative, the modulus is added to it. This still results in the modulus when it is big enough, because the exact sum is not representable in the given precision and rounds back to the modulus.
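The rounding step can be seen in isolation (a sketch using the stabilized f64::EPSILON constant): EPSILON is 2^-52, while the spacing of floats around 3.0 is 2^-51, so the exact sum 3.0 - 2^-52 lies halfway between 3.0 and its lower neighbour and round-to-even snaps it back to 3.0:

```rust
fn main() {
    // The exact value 3.0 - 2^-52 is not representable;
    // it rounds (to even) back to 3.0.
    assert_eq!(-f64::EPSILON + 3.0, 3.0);
}
```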

A more stable implementation would, I believe, be something like

pub fn mod_euc(self, rhs: f64) -> f64 {
    ((self % rhs) + rhs.abs()) % rhs
}
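Pulled out as a free function so it can be checked on stable Rust (mod_euc_stable is an illustrative name, not an API proposal):

```rust
// Sketch of the proposed double-`%` variant; the trailing `% rhs`
// folds a result that rounded up to `rhs.abs()` back to 0.0.
fn mod_euc_stable(a: f64, rhs: f64) -> f64 {
    ((a % rhs) + rhs.abs()) % rhs
}

fn main() {
    // The problematic input from the report now lands on 0.0 ...
    assert_eq!(mod_euc_stable(-f64::EPSILON, 3.0), 0.0);
    // ... and an ordinary input is unaffected.
    assert_eq!(mod_euc_stable(7.0, 3.0), 1.0);
}
```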

@clamydo clamydo changed the title Bug, .mod_euc() produces invalid output for small negative floats Bug in libstd, (unstable) .mod_euc() produces invalid output for small negative floats Apr 23, 2018
@varkor
Member

varkor commented Apr 23, 2018

Thanks @fkjogu for the investigation, and good catch!
I wonder whether:

pub fn mod_euc(self, rhs: f64) -> f64 {
    let mut r = self % rhs;
    if r < 0.0 {
        r += rhs.abs();
        if r == rhs {
            return 0.0;
        }
    }
    r
}

would be a slightly better implementation, as it avoids the extra unnecessary division from the second %.

Correspondingly, I imagine div_euc is probably incorrect. The floating-point implementations were added later in the RFC, so they probably didn't get quite as much examination as the integer implementations...

@ExpHP
Contributor

ExpHP commented Apr 23, 2018

This is also how it behaves in Python:

>>> -1e-18 % 1
1.0

and in Haskell:

λ> import Data.Fixed
λ> (-1e-18) `mod'` 1.0
1.0

For better or worse, it seems that virtually all possible implementations of floored or Euclidean modulus of floating point numbers naturally have this problem unless you explicitly account for this nasty edge case.

@ExpHP
Contributor

ExpHP commented Apr 23, 2018

To play devil's advocate though, a colleague of mine has shared his story of the worst bug he ever had to debug, which was basically caused by having exactly this happen in his own handwritten div_floor function (causing some binning algorithm to try to write some data into the "negative first" bin).

But that was in C++, where UB reigns supreme.

@varkor
Member

varkor commented Apr 24, 2018

For better or worse, it seems that virtually all possible implementations of floored or Euclidean modulus of floating point numbers naturally have this problem unless you explicitly account for this nasty edge case.

That's very interesting; I wasn't aware of that.

Regardless, this bug should definitely be fixed. I'd feel more comfortable addressing it along with a fix to div_euc, so we don't end up replacing one bad behaviour with another.

@kennytm kennytm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Apr 24, 2018
@clamydo
Contributor Author

clamydo commented Apr 24, 2018

This is also how it behaves in Python and in Haskell

True, one could argue that this is just how floating point arithmetic works. Floats certainly break many axioms, for example the uniqueness of the neutral element of the additive group: EPSILON + 10.0 == 10.0. But in a numerical context the non-uniqueness example feels more like an imprecision, whereas here the modulus is simply not an element of the residue class it generates. To me, this feels 'more' wrong, and I suspect it would introduce more subtle and fundamental bugs.

I'll admit that I do not give a strong argument. Maybe I just got used to only a subset of floating point oddities 😉.

Edit: Or to argue from another side: I'm okay with an operator like % behaving like that, but if there is a special method that 'promises' to deliver a Euclidean modulo operation, it should do the right thing, IMHO.

@scottmcm
Member

a special method that 'promises' to deliver a Euclidean modulo operation should do the right thing

My thoughts exactly. The whole point of the method is to be the "good" version of %, so it should meet the euclidean definition strictly, even if that's more expensive.

@ExpHP
Contributor

ExpHP commented Apr 24, 2018

I think that setting this precedent is something I can get behind; just be aware there is another closely-related edge case at -1e-18 % -1.0:

 pub fn mod_euc(self, rhs: f64) -> f64 {
     let mut r = self % rhs;
     if r < 0.0 {
         r += rhs.abs();
-        if r == rhs {
+        if r == rhs.abs() {
             return 0.0;
         }
     }
     r
 }
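With the diff applied, a standalone sketch (mod_euc_fixed is an illustrative name) covers both edge cases:

```rust
// varkor's branchy implementation with the `rhs.abs()` comparison fix.
fn mod_euc_fixed(a: f64, rhs: f64) -> f64 {
    let mut r = a % rhs;
    if r < 0.0 {
        r += rhs.abs();
        // Catch the case where the addition rounded up to the modulus.
        if r == rhs.abs() {
            return 0.0;
        }
    }
    r
}

fn main() {
    // Positive modulus: rounding would otherwise yield 3.0.
    assert_eq!(mod_euc_fixed(-f64::EPSILON, 3.0), 0.0);
    // Negative modulus: comparing against `rhs` instead of `rhs.abs()`
    // would let this case slip through and return 1.0 == rhs.abs().
    assert_eq!(mod_euc_fixed(-1e-18, -1.0), 0.0);
}
```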

P.S. bear in mind the ultimate property of any div/mod function pair:

    a == a.div_euc(b) * b + a.mod_euc(b)

I just tested Python's (flooring) division on floats: it generally satisfies this property for pairs of numbers drawn uniformly at random from (-1, 1), but a counterexample turns up quickly when the numbers are drawn at wildly different magnitudes.

import sys
import random

def uniform(amplitude):
    return amplitude * (2 * random.random() - 1)

amp_1 = float(sys.argv[1])
amp_2 = float(sys.argv[2])

while True:
    a = uniform(amp_1)
    b = uniform(amp_2)
    if a != (a // b) * b + (a % b):
        print(a, b)
        sys.exit(1)
$ python3 find-counterexample.py 1.0 1.0
(CPU spins up)
(time passes.........)
(okay, I'm bored)
^C
$ python3 find-counterexample.py 1.0 1e15
-0.7833003032084154 281330340541134.03
$ python3 find-counterexample.py 1e15 1.0
556555443188291.3 -0.047383938082182775

I wonder how well the corrected definitions fare at this? (my guess: probably still not well, and it might not even be possible to produce outputs satisfying this property for certain inputs. Not sure.)
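For what it's worth, the same spot check can be ported to stable Rust via the later-stabilized div_euclid/rem_euclid methods (holds is an illustrative helper; the failing pair is the first counterexample from the Python run above):

```rust
// Check the div/mod round-trip identity for one pair of floats.
fn holds(a: f64, b: f64) -> bool {
    a == a.div_euclid(b) * b + a.rem_euclid(b)
}

fn main() {
    // Comparable magnitudes: the identity round-trips exactly.
    assert!(holds(7.5, 2.0));
    // Wildly different magnitudes: rounding in `a.rem_euclid(b)` loses
    // the low bits of `a`, and the identity fails.
    assert!(!holds(-0.7833003032084154, 281330340541134.03));
}
```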

@fanzier
Contributor

fanzier commented Apr 24, 2018

I'd like to contribute a different perspective: I consider the current behavior to be reasonable and correct but think we should document that the return value of mod_euc may equal the divisor due to precision problems.

Let me explain why I think so: normally, the way floating point operations work (at least the basic operations: addition, subtraction, multiplication and division) is that they produce the floating point number that is closest to the mathematically correct result. For example, 3.0 - std::f64::EPSILON == 3.0 because 3.0 is the closest to the (unrepresentable) exact result.

Applying this to the example (-std::f64::EPSILON).mod_euc(3.0), we see that the mathematically exact result is 3.0 - std::f64::EPSILON which should be rounded – as before – to 3.0. If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.

In addition, if we used the alternate implementation and wanted to preserve the (desirable) property that a.div_euc(b) * b + a.mod_euc(b) is close to a, we would have to implement div_euc in such a way that (-std::f64::EPSILON).div_euc(3.0) == 0.0. But this is very confusing, since the Euclidean quotient should always be negative if the dividend is negative and the divisor is positive.

But my main point is that deviating from the principle "floating point results should be as close as possible to the mathematically exact result" requires a really good reason. I don't see why the requirement a.mod_euc(b) != b should be such a reason; in my opinion, it is not that important, as long as we have 0.0 <= a.mod_euc(b) <= b.abs(). The way other languages handle this case seems to support my view.

a special method that 'promises' to deliver a Euclidean modulo operation should do the right thing

My thoughts exactly. The whole point of the method is to be the "good" version of %, so it should meet the euclidean definition strictly, even if that's more expensive.

I agree. I just think that the current version is the "best" version of % given that floating point is imprecise.

@ExpHP
Contributor

ExpHP commented Apr 24, 2018

Applying this to the example (-std::f64::EPSILON).mod_euc(3.0), we see that the mathematically exact result is 3.0 - std::f64::EPSILON which should be rounded – as before – to 3.0. If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.

I was about to argue against this, but after thinking about it, you're right.

Basically, I was going to say that the distance to the correct result should be measured in "periodic space;" in this sense, 0.0 is "just as correct" as 3.0, because in the quotient space Reals // 3.0 these two points are one and the same.

But then I realized the flaw in this argument. We can't measure error in periodic space, because otherwise the best implementations would clearly be the following:

fn div_euc(_a: f64, _b: f64) -> f64 { 0.0 }
fn mod_euc(a: f64, _b: f64) -> f64 { a }

so our definition of error needs to be one that can differentiate between 0 and b.

I am just sharing this in case somebody else tries to make the same argument.

@varkor
Member

varkor commented Apr 24, 2018

If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.

But then I realized the flaw in this argument. We can't measure error in periodic space, because otherwise the best implementations would clearly be the following:

I disagree: we should measure the error in the periodic range [0, 3.0), because that's the codomain of the operation. It's precisely for this reason that 3.0 is absolutely incorrect: it's the closest floating point number among all floating point numbers, but in the codomain [0, 3.0), which is what mod_euc is defined to return, it's not even a valid number. 3.0 is an invalid output. The closest valid output to 3.0 is 0.0, because they are equivalent under the quotient relation, but only one of them is a valid output (that's literally the entire point of modulo).

as long as we have 0.0 <= a.mod_euc(b) <= b.abs()

We should have 0.0 <= a.mod_euc(b) < b.abs(). We can't just change the definition because floating point is frustrating.

I agree we also want a == a.div_euc(b) * b + a.mod_euc(b), but I don't see that being mutually exclusive.

@fanzier
Contributor

fanzier commented Apr 24, 2018

@varkor If our highest priority is that a.mod_euc(b) < b.abs(), then I agree with you.

I just think it's okay to weaken that requirement for floating points because they are not precise by their very nature.

To give an analogy: Mathematically, we have -PI / 2 < x.atan() < PI / 2 (note the "strictly less than"). However, we have std::f64::MAX.atan() == std::f64::consts::FRAC_PI_2 (proof), although it is technically not in the codomain of atan. For this function, it was apparently okay to extend the codomain from (-PI/2, PI/2) to [-PI/2,PI/2]. I argue that it is also okay to extend the codomain of mod_euc from [0,b.abs()) to [0,b.abs()].
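The atan analogy is directly checkable on stable Rust (mirroring the playground proof linked above):

```rust
fn main() {
    use std::f64::consts::FRAC_PI_2;
    // The exact result lies strictly inside (-PI/2, PI/2), but the
    // nearest representable value for huge inputs is FRAC_PI_2 itself.
    assert_eq!(f64::MAX.atan(), FRAC_PI_2);
}
```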

I agree we also want a == a.div_euc(b) * b + a.mod_euc(b), but I don't see that being mutually exclusive.

So here we definitely have a problem, because in the example above we have to set (-std::f64::EPSILON).div_euc(3.0) == 0.0. The mathematically exact result is -1.0, which is far away from 0.0. In this case, the discrepancy is not defensible by periodicity because the codomain is not periodic.

@clamydo
Contributor Author

clamydo commented Apr 25, 2018

I find @fanzier's argument rather convincing. It is true, we cannot fix all the edge cases floats introduce. But no matter the outcome, in my opinion the discussed property should at least be documented, because I find it surprising.

The reason it surprises me more than the example -PI / 2 < x.atan() < PI / 2 is that PI / 2 is still the asymptotically uniquely correct value, because atan is continuous. mod is not, so at the border/discontinuity it depends on which side you take the limit from. In this sense, (-std::f64::EPSILON).mod_euc(3.0) == 3.0 is as asymptotically correct as == 0.0. I'd argue that it makes more sense to resolve the ambiguity in favor of a valid output. But it definitely should not break a == a.div_euc(b) * b + a.mod_euc(b).

So my conclusion is, leave it as it is, but add some friendly documentation:

Due to limitations of the floating point format a.mod_euc(b) can evaluate to the mathematically invalid result b if a is much smaller in magnitude than b and a < 0.

If you agree, I can create a pull request.

@clamydo
Contributor Author

clamydo commented Apr 25, 2018

I just realized the property a == a.div_euc(b) * b + a.mod_euc(b) is also broken in the current implementation. Try

#![feature(euclidean_division)]

fn main() {
    let m = 3.0f64;
    let f = m - std::f64::EPSILON;

    let q = f.div_euc(m);
    let r = f.mod_euc(m);

    assert_eq!(f, m * r + q);
}

https://play.rust-lang.org/?gist=2a2664c0c1d80746cc3c00518499cfbd&version=nightly

It's like playing Whac-A-Mole.

@fanzier
Contributor

fanzier commented Apr 25, 2018

in my opinion the discussed property should be at least documented, because I find it surprising.

Definitely, I also said that in my first comment :)

Due to limitations of the floating point format a.mod_euc(b) can evaluate to the mathematically invalid result b if a is much smaller in magnitude than b and a < 0.

I think we should specify that we mean a.mod_euc(b) == b.abs() by "mathematically invalid". So I would prefer something along the lines of "While the mathematically exact result always satisfies a.mod_euc(b) < b.abs(), the rounded floating point result can be a.mod_euc(b) == b.abs() due to floating point precision problems if a is much smaller in magnitude than b and a < 0."

I just realized the property a == a.div_euc(b) * b + a.mod_euc(b) is also broken in the current implementation. Try (...)

No, it is not. (At least not in your example.) Note that f is set to exactly 3.0 in your example (due to rounding), so f.mod_euc(m) == 0.0 is completely correct.

@hanna-kruppe
Contributor

Perhaps it would help to examine the use cases this method is meant to address. If it is supposed to be "as close as possible" (as one usually wants for mathematical functions), it seems to me that sometimes the correctly rounded result will be rounded up to the divisor in the natural implementation (although one could make an argument that 0 is an equally correct result, it's not more correct). If, on the other hand, this method is intended to bring an input into a half-open range (e.g. because you're computing a function that is not defined everywhere, or has a discontinuity you'd rather avoid) then the result should clearly be strictly less than the divisor.

@clamydo
Contributor Author

clamydo commented Apr 25, 2018

No, it is not. (At least not in your example.) Note that f is set to exactly 3.0 in your example (due to rounding), so f.mod_euc(m) == 0.0 is completely correct.

I am very sorry, I had a bug in my code (or rather my brain).

#![feature(euclidean_division)]

fn main() {
    let m = 3.0f64;
    let f = m - std::f64::EPSILON;

    let q = f.div_euc(m);
    let r = f.mod_euc(m);

-   assert_eq!(f, m * r + q);
+   assert_eq!(f, m * q + r);
}

Indeed it yields the correct result.
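Ported to the later-stabilized method names, the corrected check passes on stable Rust; note that, as pointed out above, 3.0 - EPSILON rounds back to exactly 3.0, so the identity holds with q == 1.0 and r == 0.0:

```rust
fn main() {
    let m = 3.0f64;
    let f = m - f64::EPSILON; // rounds (to even) back to exactly 3.0
    assert_eq!(f, 3.0);

    let q = f.div_euclid(m); // 1.0
    let r = f.rem_euclid(m); // 0.0
    assert_eq!(f, m * q + r);
}
```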

@clamydo clamydo changed the title Bug in libstd, (unstable) .mod_euc() produces invalid output for small negative floats .mod_euc() produces invalid output for small negative floats due to rounding errors Apr 30, 2018
@clamydo
Contributor Author

clamydo commented Apr 30, 2018

I have created a pull request. Feel free to shout out any improvements.

kennytm added a commit to kennytm/rust that referenced this issue Jun 27, 2018
Document round-off error in `.mod_euc()`-method, see issue rust-lang#50179

Due to a round-off error, the method `.mod_euc()` of both `f32` and `f64` can produce mathematically invalid outputs. If `self` is much smaller in magnitude than the modulus `rhs` and negative, `self + rhs` in the first branch cannot be represented in the given precision and results in `rhs`. In the mathematically strict sense, this breaks the definition. But given the limitations of floating point arithmetic, it can be thought of as the closest representable value to the true result, although it is not strictly in the domain `[0.0, rhs)` of the function. It is rather the left-side asymptotic limit. It would be desirable for it to produce the mathematically more sound approximation of `0.0`, the right-side asymptotic limit, but this breaks the property that `self == self.div_euc(rhs) * rhs + self.mod_euc(rhs)`.

The discussion in issue rust-lang#50179 did not reach a satisfying conclusion as to which property is deemed more important, but at least we can document the behaviour, which this pull request does.
bors added a commit that referenced this issue Jun 27, 2018
Rollup of 7 pull requests

Successful merges:

 - #49987 (Add str::split_ascii_whitespace.)
 - #50342 (Document round-off error in `.mod_euc()`-method, see issue #50179)
 - #51658 (Only do sanity check with debug assertions on)
 - #51799 (Lower case some feature gate error messages)
 - #51800 (Add a compiletest header for edition)
 - #51824 (Fix the error reference for LocalKey::try_with)
 - #51842 (Document that Layout::from_size_align does not allow align=0)

Failed merges:

r? @ghost
@varkor
Member

varkor commented Feb 3, 2019

As far as I'm aware, mod_euc behaves as expected (and is documented as such) now.

@varkor varkor closed this as completed Feb 3, 2019

7 participants