Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigInt.prototype.toLocaleString #218

Closed
littledan opened this issue Feb 14, 2018 · 16 comments · Fixed by #236
Closed

BigInt.prototype.toLocaleString #218

littledan opened this issue Feb 14, 2018 · 16 comments · Fixed by #236
Labels
c: numbers Component: numbers, currency, units s: in progress Status: the issue has an active proposal

Comments

@littledan
Copy link
Member

In the Stage 3 BigInt proposal, toLocaleString is defined. We should have a definition for this in the ECMA-402 specification.

I wasn't able to find an ICU API which would be useful here; does one exist?

cc @jakobkummerow

@anba
Copy link
Contributor

anba commented Mar 6, 2018

ICU4C has unum_formatDecimal, ICU4J has NumberFormat#format(java.math.BigInteger).

@littledan
Copy link
Member Author

That looks perfect for this case. Seems like we could pass the output of ToString(bigInt) to unum_formatDecimal as its number/length arguments, if I'm understanding the Decimal Arithmetic Specification correctly.

@anba Would you be interested in writing specification text for BigInt.prototype.toLocaleString in ECMA 402? The design I was imagining would be to base it on an overload of Intl.NumberFormat.prototype.format for BigInts--Number Format Functions could start with ToNumeric rather than ToNumber, and then PartitionNumberPattern would somehow be either generalized or cloned to handle BigInt in addition to Number.

cc @cxielarko

@anba
Copy link
Contributor

anba commented Mar 15, 2018

AFAICT the descriptions for FormatNumberToString, PartitionNumberPattern, FormatNumber, ToRawPrecision, and ToRawFixed only need to be changed to allow BigInt values for the x parameter. BigInt.prototype.toLocaleString can then be defined as:

  1. Let x be ? thisBigIntValue(this value).
  2. Let numberFormat be ? Construct(%NumberFormat%, « locales, options »).
  3. Return FormatNumber(numberFormat, x).

And maybe s/Number/Numeric/ for FormatNumberToString, PartitionNumberPattern, and FormatNumber?

Do we also want to modify (or add a new method) to allow BigInt values for Intl.NumberFormat.prototype.format and Intl.NumberFormat.prototype.formatToParts? The latter may not be implementable in ICU4C because it lacks a unum_formatDecimalForFields function.

@sffc
Copy link
Contributor

sffc commented Mar 16, 2018

FYI, better than unum_formatDecimal and NumberFormat#format(java.math.BigInteger) is to use the new methods offered in icu.number and numberformatter.h. The former two methods are not deprecated (at least not yet), but they are discouraged for new users.

numberformatter.h has an endpoint for the same syntax as unum_formatDecimal, but it also has full support for the field position iterator needed for formatToParts. You use it like this:

FormattedNumber result = NumberFormatter::withLocale(...).formatDecimal(..., status);
result.populateFieldPositionIterator(...);
UnicodeString resultString = result.toString();

@anba
Copy link
Contributor

anba commented Mar 16, 2018

Thanks for the info that the new number formatting API will support this use case! For SpiderMonkey we may need to wait for https://ssl.icu-project.org/trac/ticket/13597, because we generally only use the C-API for its compatibility across different releases.

@sffc
Copy link
Contributor

sffc commented Apr 20, 2018

As a general comment, I think that we should interpret string inputs as the highest precision datatype supported. So if I give a string containing something like "987654321987654321987654321", that should be interpreted as a BigInteger. However, "9876.543" should be interpreted as a double until Ecma adds a datatype for BigDecimal, for example.

@littledan
Copy link
Member Author

I'd prefer not to do this sort of "magical" semantic overloading. If your intuition is that that should exist, then I'm inclined to add another method, and avoid overloading, e.g., Intl.NumberFormat.prototype.formatBigInt/formatBigIntToParts. What would you think of that?

@sffc
Copy link
Contributor

sffc commented Apr 22, 2018

I don't see this as "magical" overloading. I see it as the method taking the input data type and then internally choosing the highest precision data type available to represent it.

My intuition is that it's weird to ever convert strings to Numbers, and we should avoid it when possible with better alternative behavior.

@littledan
Copy link
Member Author

My intuition is that it's weird to ever convert strings to Numbers, and we should avoid it when possible with better alternative behavior.

Following this intuition, I'd like to mentally classify the format method accepting String arguments at all as weird random legacy behavior, and based on that analysis, not upgrade it to get higher precision. We can't remove it, but we also don't have to improve it. Leaving format(String)'s behavior unchanged will simplify the specification and implementation.

littledan added a commit to littledan/ecma402 that referenced this issue May 1, 2018
This patch brings Intl.NumberFormat support to BigInt, and
adds a BigInt.prototype.toLocaleString method based on it.

The design here is to include overloading between BigInt and Number
as arguments for the format and formatToParts methods based on
ToNumeric. This means that, for example, string arguments are
cast to Number, rather than BigInt. This design preserves
compatibility and consistency with operators like unary -

This definition permits options in the NumberFormat to force
decimal places, e.g., 1n formatting as 1.00000 if the minimum
fractional digits is 5. Alternative semantics would be to
throw an exception in this case.

For the algorithm text itself: the specification algorithms
ToRawPrecision and ToRawFixed are now used for both Numbers
and BigInts. Given the ECMAScript specification's use of implicit
coercisions between Numbers and mathematical values, I believe
that this is valid without any special changes; the phrasing
may change in the future [1].

ICU4C-based implementations of ECMAScript can use
LocalizedNumberFormatter::formatDecimal [2] or
unum_formatDecimal [3] to implement the algorithms in this patch.

[1] tc39/ecma262#1135
[2] http://icu-project.org/apiref/icu4c/classicu_1_1number_1_1LocalizedNumberFormatter.html#a29cd3d107b784496e19175ce0115f26f
[3] http://icu-project.org/apiref/icu4c/unum_8h.html#a59870a322f012dc1b9d99cf8a7b708f1

Closes tc39#218
@FrankYFTang
Copy link
Contributor

FrankYFTang commented Jan 19, 2019

Could someone show me a js example code of how to use BigInt with the toLocaleString?

IS it like

12345678901234567890n.toLocaleString("fr")

?
Is it adding a "n" after the digit good enough to denote the BigInt? Sorry of my lack of confidence of knowing BigInt well here.

@jakobkummerow
Copy link

Yes, that looks correct.

@FrankYFTang
Copy link
Contributor

Thanks for the code review from @jakobkummerow the v8 implementation sync with the current spec ( 040f809 ) is landed into v8 tree behind the flag --harmony-intl-bigint . You can also see the test I wrote here
https://chromium-review.googlesource.com/c/v8/v8/+/1424021/8/test/intl/bigint/tolocalestring.js
Notice in this implementation BigInt(-0).toLocaleString() will output "0" instead of "-0". @jakobkummerow said that is the expected behavior. In the other hand (-0).toLocaleString() will return "-0".

@littledan
Copy link
Member Author

Great work, @FrankYFTang !

That's right; for BigInt, -0n is 0n.

I'm going to present on this issue in the upcoming TC39 meeting, arguing we should instead make a separate method, formatBigInt, and consider that a pattern going forward. See slides.

@ray007
Copy link

ray007 commented Jan 25, 2019

I can understand there being a difference between -0 and 0 for double/float, but for integers?

@FrankYFTang
Copy link
Contributor

The reason my current v8 implementation output BigInt(-0) to "0" is because the BitInt.prototype.toString output "0" instead of "-0".

littledan added a commit to littledan/ecma402 that referenced this issue Jan 25, 2019
This patch brings Intl.NumberFormat support to BigInt, and
adds a BigInt.prototype.toLocaleString method based on it.

The design here is to include overloading between BigInt and Number
as arguments for the format and formatToParts methods based on
ToNumeric. This means that, for example, string arguments are
cast to Number, rather than BigInt. This design preserves
compatibility and consistency with operators like unary -

This definition permits options in the NumberFormat to force
decimal places, e.g., 1n formatting as 1.00000 if the minimum
fractional digits is 5. Alternative semantics would be to
throw an exception in this case.

For the algorithm text itself: the specification algorithms
ToRawPrecision and ToRawFixed are now used for both Numbers
and BigInts. Given the ECMAScript specification's use of implicit
coercisions between Numbers and mathematical values, I believe
that this is valid without any special changes; the phrasing
may change in the future [1].

ICU4C-based implementations of ECMAScript can use
LocalizedNumberFormatter::formatDecimal [2] or
unum_formatDecimal [3] to implement the algorithms in this patch.

[1] tc39/ecma262#1135
[2] http://icu-project.org/apiref/icu4c/classicu_1_1number_1_1LocalizedNumberFormatter.html#a29cd3d107b784496e19175ce0115f26f
[3] http://icu-project.org/apiref/icu4c/unum_8h.html#a59870a322f012dc1b9d99cf8a7b708f1

Closes tc39#218
@jakobkummerow
Copy link

And the reason that (-0n).toString() === "0" is because there is no negative-zero BigInt. It's not a question of what toString does. -0.0 is an IEEE floating-point concept; integers of any kind don't have it.

See our implementation of BigInt negation:

Handle<BigInt> BigInt::UnaryMinus(Isolate* isolate, Handle<BigInt> x) {
  // Special case: There is no -0n.
  if (x->is_zero()) {
    return x;
  }
  Handle<MutableBigInt> result = MutableBigInt::Copy(isolate, x);
  result->set_sign(!x->sign());
  return MutableBigInt::MakeImmutable(result);
}

@sffc sffc added c: numbers Component: numbers, currency, units s: in progress Status: the issue has an active proposal labels Mar 19, 2019
leobalter pushed a commit that referenced this issue Jun 13, 2019
This patch brings Intl.NumberFormat support to BigInt, and
adds a BigInt.prototype.toLocaleString method based on it.

The design here is to include overloading between BigInt and Number
as arguments for the format and formatToParts methods based on
ToNumeric. This means that, for example, string arguments are
cast to Number, rather than BigInt. This design preserves
compatibility and consistency with operators like unary -

This definition permits options in the NumberFormat to force
decimal places, e.g., 1n formatting as 1.00000 if the minimum
fractional digits is 5. Alternative semantics would be to
throw an exception in this case.

For the algorithm text itself: the specification algorithms
ToRawPrecision and ToRawFixed are now used for both Numbers
and BigInts. Given the ECMAScript specification's use of implicit
coercisions between Numbers and mathematical values, I believe
that this is valid without any special changes; the phrasing
may change in the future [1].

ICU4C-based implementations of ECMAScript can use
LocalizedNumberFormatter::formatDecimal [2] or
unum_formatDecimal [3] to implement the algorithms in this patch.

[1] tc39/ecma262#1135
[2] http://icu-project.org/apiref/icu4c/classicu_1_1number_1_1LocalizedNumberFormatter.html#a29cd3d107b784496e19175ce0115f26f
[3] http://icu-project.org/apiref/icu4c/unum_8h.html#a59870a322f012dc1b9d99cf8a7b708f1

Closes #218
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: numbers Component: numbers, currency, units s: in progress Status: the issue has an active proposal
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants