
.mod_euc() produces invalid output for small negative floats due to rounding errors #50179

Closed
clamydo opened this issue Apr 23, 2018 · 19 comments
Labels
C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@clamydo
Contributor

clamydo commented Apr 23, 2018

The newly added method std::f64::mod_euc() can produce values equal to the modulus when applied to a small negative float and a sufficiently big modulus. AFAIK this should never happen.

I tried this code on nightly rustc 1.27.0-nightly (ac3c2288f 2018-04-18) and the version on playground:

#![feature(euclidean_division)]

fn main() {
    let f = -std::f64::EPSILON;
    let m = 3.0f64;

    assert!(f.mod_euc(m) < m);
}

https://play.rust-lang.org/?gist=01a506080183f14067553856de6763ab&version=nightly
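For readers on current Rust: mod_euc was later stabilized under the name rem_euclid, and the misbehaviour reproduces without any nightly feature (a small verification sketch, not part of the original report):

```rust
fn main() {
    let f = -f64::EPSILON;
    let m = 3.0f64;

    // A Euclidean remainder should satisfy 0.0 <= r < m, but rounding
    // pushes the result all the way up to the modulus itself.
    let r = f.rem_euclid(m);
    assert_eq!(r, m); // r == 3.0, violating r < m
}
```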

I expected to see this happen:

The operation should only produce values in the range 0.0 <= x < modulus.

Instead, this happened:

But it produced a value equal to the modulus, in this case 3.0. This happens for any modulus that is sufficiently big relative to the machine epsilon.

Discussion

It is likely a bug in the implementation ignoring some floating point arithmetic oddities. I haven't looked deeper yet.

@clamydo
Contributor Author

clamydo commented Apr 23, 2018

It is currently implemented as

pub fn mod_euc(self, rhs: f64) -> f64 {
    let r = self % rhs;
    if r < 0.0 {
        r + rhs.abs()
    } else {
        r
    }
}

https://github.com/rust-lang/rust/blob/master/src/libstd/f64.rs#L236

What happens is that, because -EPSILON is negative, the modulus is added to it. This still results in the modulus when it is big enough, because the exact sum is not representable in the given precision and rounds back to the modulus.
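The rounding step can be seen in isolation (a sketch using the stabilized f64::EPSILON constant): EPSILON is 2^-52, while the spacing of floats around 3.0 is 2^-51, so the exact sum 3.0 - 2^-52 lies halfway between 3.0 and its lower neighbour and round-to-even snaps it back to 3.0:

```rust
fn main() {
    // The exact value 3.0 - 2^-52 is not representable;
    // it rounds (to even) back to 3.0.
    assert_eq!(-f64::EPSILON + 3.0, 3.0);
}
```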

A more stable implementation would, I believe, be something like

pub fn mod_euc(self, rhs: f64) -> f64 {
    ((self % rhs) + rhs.abs()) % rhs
}
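Pulled out as a free function so it can be checked on stable Rust (mod_euc_stable is an illustrative name, not an API proposal):

```rust
// Sketch of the proposed double-`%` variant; the trailing `% rhs`
// folds a result that rounded up to `rhs.abs()` back to 0.0.
fn mod_euc_stable(a: f64, rhs: f64) -> f64 {
    ((a % rhs) + rhs.abs()) % rhs
}

fn main() {
    // The problematic input from the report now lands on 0.0 ...
    assert_eq!(mod_euc_stable(-f64::EPSILON, 3.0), 0.0);
    // ... and an ordinary input is unaffected.
    assert_eq!(mod_euc_stable(7.0, 3.0), 1.0);
}
```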

@clamydo clamydo changed the title Bug, .mod_euc() produces invalid output for small negative floats Bug in libstd, (unstable) .mod_euc() produces invalid output for small negative floats Apr 23, 2018
@varkor
Member

varkor commented Apr 23, 2018

Thanks @fkjogu for the investigation, and good catch!
I wonder whether:

pub fn mod_euc(self, rhs: f64) -> f64 {
    let mut r = self % rhs;
    if r < 0.0 {
        r += rhs.abs();
        if r == rhs {
            return 0.0;
        }
    }
    r
}

would be a slightly better implementation, as it avoids the extra unnecessary division from the second %.

Correspondingly, I imagine div_euc is probably incorrect. The floating-point implementations were added later in the RFC, so they probably didn't get quite as much examination as the integer implementations...

@ExpHP
Contributor

ExpHP commented Apr 23, 2018

This is also how it behaves in Python:

>>> -1e-18 % 1
1.0

and in Haskell:

λ> import Data.Fixed
λ> (-1e-18) `mod'` 1.0
1.0

For better or worse, it seems that virtually all possible implementations of floored or Euclidean modulus of floating point numbers naturally have this problem unless you explicitly account for this nasty edge case.

@ExpHP
Contributor

ExpHP commented Apr 23, 2018

To play devil's advocate though, a colleague of mine has shared his story of the worst bug he ever had to debug, which was basically caused by having exactly this happen in his own handwritten div_floor function (causing some binning algorithm to try to write some data into the "negative first" bin).

But that was in C++, where UB reigns supreme.

@varkor
Member

varkor commented Apr 24, 2018

For better or worse, it seems that virtually all possible implementations of floored or Euclidean modulus of floating point numbers naturally have this problem unless you explicitly account for this nasty edge case.

That's very interesting; I wasn't aware of that.

Regardless, this bug should definitely be fixed. I'd feel more comfortable addressing it along with a fix to div_euc, so we don't end up replacing one bad behaviour with another.

@kennytm kennytm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Apr 24, 2018
@clamydo
Contributor Author

clamydo commented Apr 24, 2018

This is also how it behaves in Python and in Haskell

True, one could argue that this is just how floating point arithmetic works. Floats certainly break many axioms, for example the uniqueness of the neutral element of the additive group: EPSILON + 10.0 == 10.0. But in a numerical context the non-uniqueness example feels more like an imprecision, whereas here the modulus is simply not an element of the residue class it generates. To me, this feels 'more' wrong, and I suspect it would introduce more subtle and fundamental bugs.

I'll admit that I do not give a strong argument. Maybe I just got used to only a subset of floating point oddities 😉.

Edit: Or to argue from another side: I'm okay with an operator like % behaving like that, but if there is a special method that 'promises' to deliver a Euclidean modulo operation, it should do the right thing, IMHO.

@scottmcm
Member

a special method that 'promises' to deliver a Euclidean modulo operation should do the right thing

My thoughts exactly. The whole point of the method is to be the "good" version of %, so it should meet the euclidean definition strictly, even if that's more expensive.

@ExpHP
Contributor

ExpHP commented Apr 24, 2018

I think that setting this precedent is something I can get behind; just be aware there is another closely-related edge case at -1e-18 % -1.0:

 pub fn mod_euc(self, rhs: f64) -> f64 {
     let mut r = self % rhs;
     if r < 0.0 {
         r += rhs.abs();
-        if r == rhs {
+        if r == rhs.abs() {
             return 0.0;
         }
     }
     r
 }
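With the diff applied, a standalone sketch (mod_euc_fixed is an illustrative name) covers both edge cases:

```rust
// varkor's branchy implementation with the `rhs.abs()` comparison fix.
fn mod_euc_fixed(a: f64, rhs: f64) -> f64 {
    let mut r = a % rhs;
    if r < 0.0 {
        r += rhs.abs();
        // Catch the case where the addition rounded up to the modulus.
        if r == rhs.abs() {
            return 0.0;
        }
    }
    r
}

fn main() {
    // Positive modulus: rounding would otherwise yield 3.0.
    assert_eq!(mod_euc_fixed(-f64::EPSILON, 3.0), 0.0);
    // Negative modulus: comparing against `rhs` instead of `rhs.abs()`
    // would let this case slip through and return 1.0 == rhs.abs().
    assert_eq!(mod_euc_fixed(-1e-18, -1.0), 0.0);
}
```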

P.S. bear in mind the ultimate property of any div/mod function pair:

    a == a.div_euc(b) * b + a.mod_euc(b)

I just tested Python's (flooring) division on floats: it generally satisfies this property for pairs of numbers drawn uniformly at random from (-1, 1), but a counterexample turns up quickly when the numbers are drawn at wildly different magnitudes.

import sys
import random

def uniform(amplitude):
    return amplitude * (2 * random.random() - 1)

amp_1 = float(sys.argv[1])
amp_2 = float(sys.argv[2])

while True:
    a = uniform(amp_1)
    b = uniform(amp_2)
    if a != (a // b) * b + (a % b):
        print(a, b)
        sys.exit(1)
$ python3 find-counterexample.py 1.0 1.0
(CPU spins up)
(time passes.........)
(okay, I'm bored)
^C
$ python3 find-counterexample.py 1.0 1e15
-0.7833003032084154 281330340541134.03
$ python3 find-counterexample.py 1e15 1.0
556555443188291.3 -0.047383938082182775

I wonder how well the corrected definitions fare at this? (my guess: probably still not well, and it might not even be possible to produce outputs satisfying this property for certain inputs. Not sure.)
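For what it's worth, the same spot check can be ported to stable Rust via the later-stabilized div_euclid/rem_euclid methods (holds is an illustrative helper; the failing pair is the first counterexample from the Python run above):

```rust
// Check the div/mod round-trip identity for one pair of floats.
fn holds(a: f64, b: f64) -> bool {
    a == a.div_euclid(b) * b + a.rem_euclid(b)
}

fn main() {
    // Comparable magnitudes: the identity round-trips exactly.
    assert!(holds(7.5, 2.0));
    // Wildly different magnitudes: rounding in `a.rem_euclid(b)` loses
    // the low bits of `a`, and the identity fails.
    assert!(!holds(-0.7833003032084154, 281330340541134.03));
}
```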

@fanzier
Contributor

fanzier commented Apr 24, 2018

I'd like to contribute a different perspective: I consider the current behavior to be reasonable and correct but think we should document that the return value of mod_euc may equal the divisor due to precision problems.

Let me explain why I think so: normally, the way floating point operations work (at least the basic operations: addition, subtraction, multiplication and division) is that they produce the floating point number that is closest to the mathematically correct result. For example, 3.0 - std::f64::EPSILON == 3.0 because 3.0 is the closest to the (unrepresentable) exact result.

Applying this to the example (-std::f64::EPSILON).mod_euc(3.0), we see that the mathematically exact result is 3.0 - std::f64::EPSILON which should be rounded – as before – to 3.0. If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.

In addition, if we used the alternate implementation and wanted to preserve the (desirable) property that a.div_euc(b) * b + a.mod_euc(b) is close to a, we would have to implement div_euc in such a way that (-std::f64::EPSILON).div_euc(3.0) == 0.0. But this is very confusing, since the Euclidean quotient should always be negative if the dividend is negative and the divisor is positive.

But my main point is that deviating from the principle "floating point results should be as close as possible to the mathematically exact result" requires a really good reason. I don't see why the requirement a.mod_euc(b) != b should be such a reason; in my opinion, it is not that important, as long as we have 0.0 <= a.mod_euc(b) <= b.abs(). The way other languages handle this case seems to support my view.

a special method that 'promises' to deliver a Euclidean modulo operation should do the right thing

My thoughts exactly. The whole point of the method is to be the "good" version of %, so it should meet the euclidean definition strictly, even if that's more expensive.

I agree. I just think that the current version is the "best" version of % given that floating point is imprecise.

@ExpHP
Contributor

ExpHP commented Apr 24, 2018

Applying this to the example (-std::f64::EPSILON).mod_euc(3.0), we see that the mathematically exact result is 3.0 - std::f64::EPSILON which should be rounded – as before – to 3.0. If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.

I was about to argue against this, but after thinking about it, you're right.

Basically, I was going to say that the distance to the correct result should be measured in "periodic space;" in this sense, 0.0 is "just as correct" as 3.0, because in the quotient space Reals // 3.0 these two points are one and the same.

But then I realized the flaw in this argument. We can't measure error in periodic space, because otherwise the best implementations would clearly be the following:

fn div_euc(_a: f64, _b: f64) -> f64 { 0.0 }
fn mod_euc(a: f64, _b: f64) -> f64 { a }

so our definition of error needs to be one that can differentiate between 0 and b.

I am just sharing this in case somebody else tries to make the same argument.

@varkor
Member

varkor commented Apr 24, 2018

If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.

But then I realized the flaw in this argument. We can't measure error in periodic space, because otherwise the best implementations would clearly be the following:

I disagree: we should measure the error in the periodic range [0, 3.0), because that's the codomain of the operation. It's precisely for this reason that 3.0 is absolutely incorrect: it's the closest floating point number among all floating point numbers, but in the codomain [0, 3.0), which is what mod_euc is defined to return, it's not even a valid number. 3.0 is an invalid output. The closest valid output to 3.0 is 0.0, because they are equivalent under the quotient relation, but only one of them is a valid output (that's literally the entire point of modulo).

as long as we have 0.0 <= a.mod_euc(b) <= b.abs()

We should have 0.0 <= a.mod_euc(b) < b.abs(). We can't just change the definition because floating point is frustrating.

I agree we also want a == a.div_euc(b) * b + a.mod_euc(b), but I don't see that being mutually exclusive.

@fanzier
Contributor

fanzier commented Apr 24, 2018

@varkor If our highest priority is that a.mod_euc(b) < b.abs(), then I agree with you.

I just think it's okay to weaken that requirement for floating points because they are not precise by their very nature.

To give an analogy: Mathematically, we have -PI / 2 < x.atan() < PI / 2 (note the "strictly less than"). However, we have std::f64::MAX.atan() == std::f64::consts::FRAC_PI_2 (proof), although it is technically not in the codomain of atan. For this function, it was apparently okay to extend the codomain from (-PI/2, PI/2) to [-PI/2,PI/2]. I argue that it is also okay to extend the codomain of mod_euc from [0,b.abs()) to [0,b.abs()].
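The atan analogy is directly checkable on stable Rust (mirroring the playground proof linked above):

```rust
fn main() {
    use std::f64::consts::FRAC_PI_2;
    // The exact result lies strictly inside (-PI/2, PI/2), but the
    // nearest representable value for huge inputs is FRAC_PI_2 itself.
    assert_eq!(f64::MAX.atan(), FRAC_PI_2);
}
```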

I agree we also want a == a.div_euc(b) * b + a.mod_euc(b), but I don't see that being mutually exclusive.

So here we definitely have a problem, because in the example above we have to set (-std::f64::EPSILON).div_euc(3.0) == 0.0. The mathematically exact result is -1.0, which is far away from 0.0. In this case, the discrepancy is not defensible by periodicity because the codomain is not periodic.

@clamydo
Contributor Author

clamydo commented Apr 25, 2018

I find @fanzier's argument rather convincing. It is true, we cannot fix all the edge cases floats introduce. But no matter the outcome, in my opinion the discussed property should at least be documented, because I find it surprising.

The reason it surprises me more than the example -PI / 2 < x.atan() < PI / 2 is that PI / 2 is still the asymptotically uniquely correct value, because atan is continuous. mod is not, so at the border/discontinuity it depends on which side you take the limit from. In this sense, (-std::f64::EPSILON).mod_euc(3.0) == 3.0 is as asymptotically correct as == 0.0. I'd argue that it makes more sense to resolve the ambiguity in favor of a valid output. But it definitely should not break a == a.div_euc(b) * b + a.mod_euc(b).

So my conclusion is, leave it as it is, but add some friendly documentation:

Due to limitations of the floating point format a.mod_euc(b) can evaluate to the mathematically invalid result b if a is much smaller in magnitude than b and a < 0.

If you agree, I can create a pull request.

@clamydo
Contributor Author

clamydo commented Apr 25, 2018

I just realized the property a == a.div_euc(b) * b + a.mod_euc(b) is also broken in the current implementation. Try

#![feature(euclidean_division)]

fn main() {
    let m = 3.0f64;
    let f = m - std::f64::EPSILON;

    let q = f.div_euc(m);
    let r = f.mod_euc(m);

    assert_eq!(f, m * r + q);
}

https://play.rust-lang.org/?gist=2a2664c0c1d80746cc3c00518499cfbd&version=nightly

It's like playing Whac-A-Mole.

@fanzier
Contributor

fanzier commented Apr 25, 2018

in my opinion the discussed property should be at least documented, because I find it surprising.

Definitely, I also said that in my first comment :)

Due to limitations of the floating point format a.mod_euc(b) can evaluate to the mathematically invalid result b if a is much smaller in magnitude than b and a < 0.

I think we should specify that we mean a.mod_euc(b) == b.abs() by "mathematically invalid". So I would prefer something along the lines of "While the mathematically exact result always satisfies a.mod_euc(b) < b.abs(), the rounded floating point result can be a.mod_euc(b) == b.abs() due to floating point precision problems if a is much smaller in magnitude than b and a < 0."

I just realized the property a == a.div_euc(b) * b + a.mod_euc(b) is also broken in the current implementation. Try (...)

No, it is not. (At least not in your example.) Note that f is set to exactly 3.0 in your example (due to rounding), so f.mod_euc(m) == 0.0 is completely correct.

@hanna-kruppe
Contributor

Perhaps it would help to examine the use cases this method is meant to address. If it is supposed to be "as close as possible" (as one usually wants for mathematical functions), it seems to me that sometimes the correctly rounded result will be rounded up to the divisor in the natural implementation (although one could make an argument that 0 is an equally correct result, it's not more correct). If, on the other hand, this method is intended to bring an input into a half-open range (e.g. because you're computing a function that is not defined everywhere, or has a discontinuity you'd rather avoid) then the result should clearly be strictly less than the divisor.

@clamydo
Contributor Author

clamydo commented Apr 25, 2018

No, it is not. (At least not in your example.) Note that f is set to exactly 3.0 in your example (due to rounding), so f.mod_euc(m) == 0.0 is completely correct.

I am very sorry, I had a bug in my code (or rather my brain).

#![feature(euclidean_division)]

fn main() {
    let m = 3.0f64;
    let f = m - std::f64::EPSILON;

    let q = f.div_euc(m);
    let r = f.mod_euc(m);

-   assert_eq!(f, m * r + q);
+   assert_eq!(f, m * q + r);
}

Indeed it yields the correct result.
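Ported to the later-stabilized method names, the corrected check passes on stable Rust; note that, as pointed out above, 3.0 - EPSILON rounds back to exactly 3.0, so the identity holds with q == 1.0 and r == 0.0:

```rust
fn main() {
    let m = 3.0f64;
    let f = m - f64::EPSILON; // rounds (to even) back to exactly 3.0
    assert_eq!(f, 3.0);

    let q = f.div_euclid(m); // 1.0
    let r = f.rem_euclid(m); // 0.0
    assert_eq!(f, m * q + r);
}
```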

@clamydo clamydo changed the title Bug in libstd, (unstable) .mod_euc() produces invalid output for small negative floats .mod_euc() produces invalid output for small negative floats due to rounding errors Apr 30, 2018
@clamydo
Contributor Author

clamydo commented Apr 30, 2018

I have created a pull request. Feel free to shout out any improvements.

kennytm added a commit to kennytm/rust that referenced this issue Jun 27, 2018
Document round-off error in `.mod_euc()`-method, see issue rust-lang#50179

Due to a round-off error, the method `.mod_euc()` of both `f32` and `f64` can produce mathematically invalid outputs. If `self` is much smaller in magnitude than the modulus `rhs` and negative, `self + rhs` in the first branch cannot be represented in the given precision and results in `rhs`. In the mathematically strict sense, this breaks the definition. But given the limitations of floating point arithmetic, it can be thought of as the closest representable value to the true result, although it is not strictly in the domain `[0.0, rhs)` of the function. It is rather the left-side asymptotic limit. It would be desirable for it to produce the mathematically more sound approximation of `0.0`, the right-side asymptotic limit, but this breaks the property that `self == self.div_euc(rhs) * rhs + self.mod_euc(rhs)`.

The discussion in issue rust-lang#50179 did not reach a satisfying conclusion as to which property is deemed more important, but at least we can document the behaviour, which this pull request does.
bors added a commit that referenced this issue Jun 27, 2018
Rollup of 7 pull requests

Successful merges:

 - #49987 (Add str::split_ascii_whitespace.)
 - #50342 (Document round-off error in `.mod_euc()`-method, see issue #50179)
 - #51658 (Only do sanity check with debug assertions on)
 - #51799 (Lower case some feature gate error messages)
 - #51800 (Add a compiletest header for edition)
 - #51824 (Fix the error reference for LocalKey::try_with)
 - #51842 (Document that Layout::from_size_align does not allow align=0)

Failed merges:

r? @ghost
@varkor
Member

varkor commented Feb 3, 2019

As far as I'm aware, mod_euc behaves as expected (and is documented as such) now.

@varkor varkor closed this as completed Feb 3, 2019

7 participants