Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double doesn't Parse correctly #10110

Closed
cyberphone opened this issue Apr 7, 2018 · 27 comments
Closed

Double doesn't Parse correctly #10110

cyberphone opened this issue Apr 7, 2018 · 27 comments

Comments

@cyberphone
Copy link

While doing some mathematical stuff that worked fine in Java I found that C# / .NET does a rather poor job on parsing double values:

Number2Parse=4.3413853813852797e+192 IEEE754=67ee7316b1545878 C#=4.3413853813852797E+192
Number2Parse=4.34138538138528e+192 IEEE754=67ee7316b1545878 C#=4.3413853813852797E+192

You can verify in any browser that 4.3413853813852797e+192 and 4.34138538138528e+192 are distinct values. Version: CLR v4.0.30319

using System;

namespace mathbug
{
    class Program
    {
        static void ShowNumber(string number2Parse)
        {
            double d = Double.Parse(number2Parse);
            ulong ieee754 = (ulong)BitConverter.DoubleToInt64Bits(d);
            Console.WriteLine("Number2Parse=" + number2Parse + " IEEE754=" + Convert.ToString((long)ieee754, 16) + " C#=" + d.ToString("G17"));
        }
 
        static void Main(string[] args)
        {
            ShowNumber("4.3413853813852797e+192");
            ShowNumber("4.34138538138528e+192");
        }
    }
}
@danmoseley
Copy link
Member

@tannergooding is this expected?

@cyberphone
Copy link
Author

This is what a Java equivalent does:

Number2Parse=4.3413853813852797e+192 IEEE754=67ee7316b1545878 Java=4.3413853813852797E192
Number2Parse=4.34138538138528e+192 IEEE754=67ee7316b1545879 Java=4.34138538138528E192

@tannergooding
Copy link
Member

@danmosemsft, no. There was quite a bit of work to make double.ToString accurate (and more performant).

However, it looks like there is a bug in the reverse case (converting string to double). The relevant code is here: https://source.dot.net/#System.Private.CoreLib/shared/System/Number.Parsing.cs,458977730d4a33f6, but I haven't dug in to find out what is wrong.

@Suchiman
Copy link
Contributor

Suchiman commented Apr 7, 2018

On my machines (i5-4670K, i5-6300U), on Debug and Release with netcoreapp2.0, netcoreapp2.1 and net4.7.1 this actually produces

Number2Parse=4.3413853813852797e+192 IEEE754=6b40e724b8d6a5ee C#=4,34138538138528E+208
Number2Parse=4.34138538138528e+192 IEEE754=6ad5a2be5d3bb5ac C#=4,3413853813852803E+206

@AlexGhiondea
Copy link
Contributor

@tarekgh could you help with this one?

@tannergooding
Copy link
Member

tannergooding commented Apr 12, 2018

@Suchiman, what locale are you? I'd like to try and reproduce your issue, since it doesn't appear to reproduce under my default.

@tarekgh
Copy link
Member

tarekgh commented Apr 12, 2018

@cyberphone would be nice if you can share more info regarding your environment. I mean OS version, you user locale, your machine architecture. we just need to repro it first and then we can try to help to tell what is going on.

@tannergooding
Copy link
Member

tannergooding commented Apr 12, 2018

@tarekgh, this repros on a clean Windows 10, English machine. There is definitely a bug somewhere in our the CoreCLR native parsing layer (see https://github.com/dotnet/coreclr/issues/17467#issuecomment-379502246), but I didn't dig further into it.

@cyberphone
Copy link
Author

@tarekgh I have Windows 10 Pro, Version 1709, Build 16299.371, US Locale
running on a Dell XPS laptop with i7-4702HQ processor

@tarekgh
Copy link
Member

tarekgh commented Apr 12, 2018

@cyberphone Thanks, I am able to repo it too now.

@jkotas
Copy link
Member

jkotas commented Apr 12, 2018

@tarekgh
Copy link
Member

tarekgh commented Apr 12, 2018

Thanks, @jkotas for pointing to the other issue. I am seeing the direction is to port the parsing code from Roslyn to coreclr as mentioned in the comment https://github.com/dotnet/coreclr/issues/13615#issuecomment-326369056, I'll move this issue to the future milestone and leave it open to track porting the parsing code.

@tarekgh tarekgh removed their assignment Apr 12, 2018
@tarekgh
Copy link
Member

tarekgh commented Apr 12, 2018

A quick note too, I am seeing Roslyn implementation is using BigInteger class which I am not sure if there will be a perf issue there or we need to get optimized version.

@jkotas
Copy link
Member

jkotas commented Apr 12, 2018

I am seeing the direction is to port the parsing code from Roslyn to coreclr

This is one possible direction. I think we should do our homework similar to what we have done for formatting. See what is the state of the art out there, and then pick the best option.

@tannergooding
Copy link
Member

A quick note too, I am seeing Roslyn implementation is using BigInteger class which I am not sure if there will be a perf issue there or we need to get optimized version.

The CoreCLR native code uses a highly optimized (and limited) BigInteger implementation

@cyberphone
Copy link
Author

cyberphone commented Apr 13, 2018

@tarekgh Delete the above,I erred :-(

BTW, I have a file with 100 million random and specific test values. This is how I found the problem. My application depends on perfect "roundtrips" including compatibility with ES6/V8.

This is my application:
https://cyberphone.github.io/doc/security/draft-rundgren-json-canonicalization-scheme.html#json.ser.number

It would be great if there was a Double.ToString("ES") method.

@Suchiman
Copy link
Contributor

@tannergooding Windows 10 Pro 1709 de-DE.

@cyberphone
Copy link
Author

@Suchiman My simple test program didn't take locale in consideration so '.' doesn't get interpreted as expected in DE.

Double.Parse(number2Parse) must be changed to use invariant locale. Probably like:
Double.Parse(number2Parse, NumberStyles.Float, CultureInfo.InvariantCulture)

@Suchiman
Copy link
Contributor

@cyberphone Wow 🤦‍♂️ i didn't pay attention to that, with , instead of . it gets the result described above 👍

@cyberphone
Copy link
Author

cyberphone commented Apr 27, 2018

In case you are interested in 100M test values you may download the following file:
https://onedrive.live.com/embed?cid=9341770E0D0D5468&resid=9341770E0D0D5468%21222&authkey=ADOClRsuPv3_pTk

Associated test program:

using System;
using System.IO;
using System.Globalization;

namespace mathbug
{
    class Program
    {
        static void Main(string[] args)
        {
            using (StreamReader sr = new StreamReader("c:\\es6\\numbers\\es6testfile100m.txt"))
            {
                string line;
                // Read and display lines from the file until the end of 
                // the file is reached.
                long counter = 0;
                while ((line = sr.ReadLine()) != null)
                {
                    string origIeeeHex = line.Substring(0, line.IndexOf(','));
                    while (origIeeeHex.Length < 16)
                    {
                        origIeeeHex = '0' + origIeeeHex;
                    }
                    ulong origIeeeBin = Convert.ToUInt64(origIeeeHex, 16);
                    double origIeee = BitConverter.Int64BitsToDouble((long)origIeeeBin);
                    string number2Parse = line.Substring(line.IndexOf(',') + 1);
                    if (++counter % 100000 == 0)
                    {
                        Console.WriteLine("Count=" + counter);
                    }
                    double parsedIeee = double.Parse(number2Parse, NumberStyles.Float, CultureInfo.InvariantCulture);
                    String parsedIeeeHex = Convert.ToString(BitConverter.DoubleToInt64Bits(parsedIeee), 16);
                    while (parsedIeeeHex.Length < 16)
                    {
                        parsedIeeeHex = '0' + parsedIeeeHex;
                    }
                    bool roundTripOk = parsedIeee.ToString("G17").Equals(origIeee.ToString("G17"));
                    if (origIeee != parsedIeee || !roundTripOk)
                    {
                        Console.WriteLine("Number2Parse={0,-24:S} C#={1,-24:S} Original=" + origIeeeHex + 
                                          " Parsed=" + parsedIeeeHex +
                                          " Roundtrip OK=" +roundTripOk, number2Parse, origIeee.ToString("G17"));
                    }
                }
            }
        }
    }
}

@tannergooding tannergooding self-assigned this Sep 11, 2018
@cyberphone
Copy link
Author

Hi Guys,
Since this seems to (finally) be headed for IETF standardization
https://tools.ietf.org/html/draft-rundgren-json-canonicalization-scheme-02
it would be interesting knowing when the fix reaches W10 in an .NET update because even if I do a local fix in my software, the standard JSON tools will have this problem, making signature validation fail

@danmoseley
Copy link
Member

@cyberphone the fix is in master which means it will be in.NET Core 3.0. Another preview will be out in not long. There are no plans to port to.NET Framework, I think.

@cyberphone
Copy link
Author

@danmosemsft thanx!

@cyberphone
Copy link
Author

And it is working as well. SUPER!

@danmoseley
Copy link
Member

That is great @cyberphone. Any interest in making a contribution of your own perhaps? Lots of up for grabs labeled issues. :)

@cyberphone
Copy link
Author

@danmosemsft Well, there are contributions and contributions...
https://github.com/cyberphone/json-canonicalization/tree/master/dotnet#json-canonicalizer-for-net
It contains among many things a 2000-line ES6-compatible number serializer. Anyway thanx again, it was a great relief for this project as a whole.

@dotnet dotnet deleted a comment from cyberphone Nov 19, 2019
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@tannergooding tannergooding removed their assignment May 26, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants