round ties behaviour #8750

Closed
simonbyrne opened this issue Oct 21, 2014 · 25 comments

@simonbyrne
Contributor

simonbyrne commented Oct 21, 2014

This has come up before (#5983), but probably deserves its own issue. How should round handle ties, that is, values with a fractional part of 1/2?

The two most common options are:
a) round ties away from zero: round(0.5) = 1.0, round(1.5) = 2.0.
b) round ties to even (aka banker's rounding): round(0.5) = 0.0, round(1.5) = 2.0.
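
To make the two options concrete, here is a small Python sketch (not Julia, and not part of any proposal in this thread). The helper names `round_ties_away` and `round_ties_even` are made up for illustration; the `decimal` module is used only because it exposes both tie-breaking rules by name.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Hypothetical helper names, for illustration only.
# Decimal(x) converts the float exactly, so no extra error is introduced.
# This is fine for values of moderate magnitude (the default decimal
# context carries 28 significant digits).

def round_ties_away(x: float) -> float:
    """Option (a): ties go away from zero (decimal calls this ROUND_HALF_UP)."""
    return float(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP))

def round_ties_even(x: float) -> float:
    """Option (b): ties go to the nearest even integer."""
    return float(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_EVEN))

for x in (0.5, 1.5, 2.5, -0.5, -1.5, -2.5):
    print(x, round_ties_away(x), round_ties_even(x))
```

The two functions agree everywhere except on exact ties such as 0.5 and 2.5, where (a) picks the larger magnitude and (b) picks the even neighbour.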

LLVM provides an intrinsic for (a), but this typically requires multiple CPU instructions, and doesn't vectorize (see #8364). On the other hand (b) often corresponds to a native CPU instruction (e.g. table 4-15 of the Intel developer manual), but I don't know if LLVM offers a way to access it. (UPDATE: rint can be used, but depends on rounding mode).

The IEEE spec requires that we offer both, but doesn't have anything to say about names or interface.

Here is what other languages do:

Language a) Ties away from zero b) Ties to even Other
C (1999+) / C++ (2007+) round and variants via rint/nearbyint (depends on global rounding mode), roundeven in future
Python 2.x: round 3+: round
Matlab/Octave round
R round
Mathematica Round
.NET (C#, F#, VB) via optional argument Math.Round
Java round: ties to +Inf
JavaScript round: ties to +Inf
Fortran (77+) ANINT
ALGOL ROUND (though not specified in standard)
Common Lisp fround
Scheme round
Emacs lisp fround
Haskell round
OCaml does not have a round function
Clojure round: ties to +Inf
Erlang round
Racket round
Ruby round
Rust round
Go does not have a Round function
Lua does not have a round function
Pascal ISO: round Free Pascal: round GNU Pascal: Round machine dependent
Modula 2 round implementation defined.
Perl does not have a round function
PHP round via optional argument other modes available (toward zero, to odd)
OpenGL roundEven round is up to implementation (presumably whatever is fastest).
OpenCL round rint (regardless of rounding mode)
Ada Rounding Unbiased_Rounding Machine_Rounding: whatever is most efficient.
APL No round?
AppleScript via "rounding as taught in school" option round
SAS ROUND ROUNDE
PostScript round
COBOL ROUNDED clause via options
D round (changed in dlang/phobos@91c38b4)
Forth ISO: FROUND
Tcl round (clarified in bug report)
Smalltalk rounded
Prolog SICStus: round ISO: round ties to +Inf
Delphi Round
Xpath round-half-to-even round: ties to +Inf
SQL ROUND (mostly, though varies)

My summary:

  • Older languages typically use (a).
  • Those without a focus on floating point either copy C or don't implement it.
  • Lisps and functional languages seem to prefer (b).
  • Java and JavaScript are bizarre (particularly Java pre-v7).
@davidssmith
Contributor

FWIW, I suspect most scientific users will expect (a) to happen.

It would be nice if iround and round were consistent.

@simonbyrne
Contributor Author

@davidssmith I guess if you define "scientific users" as C/Fortran/Matlab. Mathematica/R/Python 3 users might disagree.

@davidssmith
Contributor

Wow, interesting that Python is changing their rounding behavior.

I guess I meant people who learned rounding in math class, not CS class. I think round should keep doing (a), and we should retain intrinsics to do (b), preferably using a native LLVM instruction should it become available. I often have inner loops where rounding is occurring and could use either behavior, especially if one were much faster.

@timholy
Member

timholy commented Dec 3, 2014

@simonbyrne: first, that table is amazing. Really nice work. Second, when it comes to floating point, I'd just do whatever @simonbyrne told me to do...oh, right, that's you. Seriously, I'd put a lot of faith in whatever you think is best, but I tend to generally think Julia should usually follow whatever is implemented by the hardware (unless there are strong reasons to do otherwise).

@ViralBShah
Member

That table really is amazing. It should perhaps be added to the wikipedia entry on Rounding. I would have found (a) to be more intuitive, and wouldn't have guessed that so many systems choose (b). I also would have naively expected that (a) and not (b) would have a native CPU instruction.

FWIW, the R page says the following: "Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘go to the even digit’."

Cc: @alanedelman @ArchRobison

@simonbyrne
Contributor Author

It would be great to know why Python decided to change. I had a look through the python-dev archives, but couldn't find any discussion as to how it was decided. If anyone knows more, I would certainly be interested to hear it.

The classic argument for (b) is that it is unbiased: if you're summing or averaging a bunch of rounded numbers, you won't introduce an unexpected bias. It's typical in stats (I was taught this way in my first-year undergrad stats class), which is probably why R does it that way.
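
The bias argument can be illustrated with a deliberately extreme Python sketch in which every input is an exact tie. The helper `round_with` and the data set are assumptions chosen for illustration; real data would contain far fewer ties, so the effect would be correspondingly smaller.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_with(x: float, mode: str) -> int:
    """Round x to an integer using the given decimal rounding mode."""
    return int(Decimal(x).quantize(Decimal(1), rounding=mode))

# Worst case for bias: 0.5, 1.5, ..., 999.5 (all exactly representable).
ties = [k + 0.5 for k in range(1000)]

true_mean = sum(ties) / len(ties)
mean_away = sum(round_with(x, ROUND_HALF_UP) for x in ties) / len(ties)
mean_even = sum(round_with(x, ROUND_HALF_EVEN) for x in ties) / len(ties)

print("true mean:        ", true_mean)
print("ties away (a):    ", mean_away)
print("ties to even (b): ", mean_even)
```

With ties away from zero every value moves up by 0.5, so the mean shifts by 0.5; with ties to even the upward and downward moves alternate and cancel.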

@ViralBShah
Member

Apparently, the IEC 60559 standard is identical to IEEE 754.

@ViralBShah
Member

@simonbyrne That does sound intuitive. The R manual claims that the IEEE standard requires (b). The Wikipedia page on IEEE seems to be consistent with what you say - provide both modes, but no names or defaults are discussed.

@simonbyrne
Contributor Author

I tried to create a benchmark here:
https://gist.github.com/simonbyrne/1e4f5234eaa70725c30f

This compares the native instruction rint_llvm, which does (b), with two competing implementations of round, which does (a) (see the discussion in #8364). I'm using LLVM 3.3, and it didn't appear to use the vectorised instructions. The TL;DR: (a) can be done basically as fast as (b), at least on my setup.

@simonbyrne
Contributor Author

And to answer my own question about LLVM support: looking at the source, it seems they don't currently expose a ties-to-even instruction in a way that doesn't depend on rounding mode (if they did, the appropriate code would be 0x0).

@simonbyrne
Contributor Author

I emailed Jeffrey Yasskin asking about the reason for the Python change; he responded:

Here's the email that decided it for Python:
https://mail.python.org/pipermail/python-dev/2008-January/075910.html
Here's a paper that discusses it a bit in the context of C++:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html#basic_operations

@StefanKarpinski
Member

Can't we just have an optional rounding mode argument to round that defaults to RoundFromZero? I find the numerical arguments for breaking ties to even a bit questionable and both of the motivations given in that Python email seem to be the same reason to me.

@simonbyrne
Contributor Author

I don't find these numerical arguments terribly convincing either. They do make sense in the context of ordinary floating point operations (which is what the C++ link was referring to), but when you call round you're intentionally losing precision, so I don't think it really makes sense to talk about accuracy. While the statistician in me appreciates unbiasedness, it's clear that there are valid reasons for other behaviour: I even came across a reasonable argument for Prolog/Java/JavaScript-style rounding involving grid interpolation.

Now we can dispatch on rounding mode, it should be fairly easy to implement them all via optional arguments to round, so the main question is what should be the default?

The main reasons for (a):

  • For most people, it's how they learnt it in school (cf AppleScript).
  • It's what C does, along with all languages that use the C libm.

The main reasons for (b):

  • It's faster on any SSE4.1 (i.e. Intel post-2008, AMD post-2011) processor (see [1]).
  • It's conceptually elegant, in that "round to nearest integer" means the same thing as "round to nearest floating point value".

I would lean toward (b) purely for performance reasons, but am not going to be disheartened if we go the other way.

[1] The "prevfloat" trick I used for the fast benchmark for (a) isn't strictly correct according to the IEEE standard, as it will misbehave under different global rounding modes, and incorrectly raise Inexact floating point exceptions, which I suspect is why it isn't used in any library. Also, the copysign seemed to prevent it from vectorizing, though I imagine that this should be fixable by LLVM.
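
For readers unfamiliar with the "prevfloat" trick, the following Python sketch shows one plausible form of it; the benchmark gist itself may differ. The names `PREV_HALF`, `round_naive` and `round_prevfloat` are hypothetical, and the sketch assumes finite inputs and the default round-to-nearest mode, which is exactly the limitation the footnote describes.

```python
import math

# Largest double strictly below 0.5, i.e. what Julia's prevfloat(0.5) returns.
# The subtraction is exact because doubles in [0.25, 0.5) are spaced 2**-54 apart.
PREV_HALF = 0.5 - 2.0**-54

def round_naive(x: float) -> float:
    """floor(x + 0.5): breaks ties toward +Inf, and the addition can itself
    round up, so the value just below 0.5 is wrongly rounded to 1."""
    return float(math.floor(x + 0.5))

def round_prevfloat(x: float) -> float:
    """Ties away from zero: add a sign-matched 'almost one half', then truncate."""
    return float(math.trunc(x + math.copysign(PREV_HALF, x)))

for x in (PREV_HALF, 0.5, 1.5, 2.5, -0.5, -2.5):
    print(repr(x), round_naive(x), round_prevfloat(x))
```

The first row is the interesting one: `x + 0.5` lands exactly halfway between two doubles and is rounded up to 1.0 before `floor` ever sees it, whereas adding `PREV_HALF` gives a sum that is exactly representable and truncates to 0.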

@vchuravy
Member

vchuravy commented Dec 5, 2014

Isn't (b) also defined as the standard rounding behaviour in IEEE754-2008? (At least according to Wikipedia it is.)
I would stick to IEEE754 as closely as possible, with round-to-even as the default and the other rounding behaviours exposed as options.

@simonbyrne
Contributor Author

@vchuravy No, in section 5.9 (or for free, see this late draft) it clearly says to offer:

  • roundToIntegralTiesToEven: (b) above
  • roundToIntegralTowardZero: trunc in Julia
  • roundToIntegralTowardPositive: ceil
  • roundToIntegralTowardNegative: floor
  • roundToIntegralTiesToAway: (a) above
  • roundToIntegralExact: not exported, but available via Base.Math.rint (v0.4 only)

@vchuravy
Member

vchuravy commented Dec 5, 2014

@simonbyrne Do I get it right that Float64 and Float32 are equivalent to the binary64 and binary32 formats in IEEE754-2008? If so then section 4.3.3 defines a default rounding operation attribute:

The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats.

Edit:
I see the difference between rounding formats and rounding operations, but I take the statement in 4.3.3 at least as a recommendation to use roundTiesToEven as a default operation.

@simonbyrne
Contributor Author

@vchuravy In general, when the standard refers to "rounding" (as in Section 4.3), it means "rounding to fit in the destination format", i.e. after some "computational operation" you have a real number (say 1/3), which then has to be "rounded" to a value that is representable by the format (0.3333333333333333 in the case of Float64). You're correct in that by default, this is "round to nearest, ties to even" (RoundNearest in Julia), though you can change that with set_rounding (be warned though that strange things may happen if you do).

This is distinct from the round function, which is referred to in the standard as "rounding a floating-point number to an integral value".
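
The distinction can be seen in a short Python sketch (Python floats are IEEE binary64, like Float64). The variable names are illustrative only. The first part shows rounding to fit the destination format, including its ties-to-even rule; nothing here calls a round function at all.

```python
from fractions import Fraction

# Rounding to fit the format: the real number 1/3 is not representable,
# so the division returns the nearest binary64 value instead.
third = 1 / 3
print(repr(third))
print(Fraction(third) == Fraction(1, 3))  # the stored value is not exactly 1/3

# The same "nearest, ties to even" rule applies to every arithmetic result.
eps = 2.0**-52             # spacing of doubles just above 1.0
a = 1.0 + eps / 2          # exact sum is halfway between 1.0 and 1.0 + eps
b = (1.0 + eps) + eps / 2  # exact sum is halfway between 1.0 + eps and 1.0 + 2*eps
print(a == 1.0)            # tie resolved to the neighbour with an even significand
print(b == 1.0 + 2 * eps)  # likewise, this time the upper neighbour is the even one
```

Rounding a float to an integral value is a separate, explicit operation, and a language is free to pick a different tie rule for it.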

@simonbyrne
Contributor Author

FWIW, C in future will have a roundeven function, according to TS 18661-1:2014 (free draft version).

@vchuravy
Member

vchuravy commented Dec 5, 2014

@simonbyrne thanks for the explanation. If we are voting I would vote for round-to-even. :)

@ArchRobison
Contributor

The entire problem could be avoided by using ternary floating point. (I don't know whether Setun had floating point or not.) Seriously, I have a slight preference for round-to-even since that's what the modern hardware does best.

@StefanKarpinski
Member

My vote is for keeping round C-compatible and having roundeven that does what the hardware does (i.e. round ties to even). That way our names/behaviors are as close to C as possible.

@simonbyrne
Contributor Author

If we were to change the behaviour of round, here is my slightly more concrete proposal:

  • round(x) could be "round to nearest integer under current rounding mode" (i.e. C rint)
  • round(x,RoundNearestTiesAway) would give current round (C) behaviour
  • round(x,RoundNearestTiesUp) would give Java round behaviour (because we can, and someone will want it at some stage).
  • we could also do the same for RoundUp, RoundDown and RoundToZero, as aliases to ceil, floor and trunc (though I wouldn't propose deprecating those, unless we're going for "minimal Julia").
  • once LLVM gets around to exporting a roundeven intrinsic, we can implement a general round(x,RoundNearest) as well.
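
The shape of the proposal above, a single round with an optional mode argument, can be sketched in Python as follows. This is not Julia's API: `RoundingMode`, its member names and `round_to_integer` are hypothetical. Python cannot query the dynamic rounding mode, so the default here is simply ties to even, which is what C rint does under the default mode. Inputs are assumed to be finite.

```python
import math
from enum import Enum
from fractions import Fraction

class RoundingMode(Enum):
    """Hypothetical stand-ins for the rounding modes named in the proposal."""
    NEAREST_TIES_EVEN = "ties to even"
    NEAREST_TIES_AWAY = "ties away from zero"
    NEAREST_TIES_UP = "ties toward +Inf"
    UP = "toward +Inf (ceil)"
    DOWN = "toward -Inf (floor)"
    TO_ZERO = "toward zero (trunc)"

HALF = Fraction(1, 2)

def round_to_integer(x: float,
                     mode: RoundingMode = RoundingMode.NEAREST_TIES_EVEN) -> int:
    """Round a finite float to an integer under the requested mode."""
    q = Fraction(x)  # exact rational value of the float, so no double rounding
    if mode is RoundingMode.NEAREST_TIES_EVEN:
        return round(q)  # Fraction rounds half to even when ndigits is omitted
    if mode is RoundingMode.NEAREST_TIES_AWAY:
        magnitude = math.floor(abs(q) + HALF)
        return magnitude if q >= 0 else -magnitude
    if mode is RoundingMode.NEAREST_TIES_UP:
        return math.floor(q + HALF)
    if mode is RoundingMode.UP:
        return math.ceil(q)
    if mode is RoundingMode.DOWN:
        return math.floor(q)
    return math.trunc(q)  # RoundingMode.TO_ZERO

for m in RoundingMode:
    print(f"{m.name:18} {round_to_integer(2.5, m):3} {round_to_integer(-2.5, m):3}")
```

Working in exact rationals keeps the sketch obviously correct at the cost of speed; a real implementation would map each mode to a hardware instruction or libm call where one exists.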

@johnmyleswhite
Member

I like @simonbyrne's proposal.

@StefanKarpinski
Member

I'm on board with this proposal too.

@ViralBShah
Member

+1
