Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueType.INT metric instrument integer overflow #3014

Open
legendecas opened this issue Jun 2, 2022 · 22 comments
Open

ValueType.INT metric instrument integer overflow #3014

legendecas opened this issue Jun 2, 2022 · 22 comments
Assignees
Labels
feature-request never-stale sdk:metrics Issues and PRs related to the Metrics SDK

Comments

@legendecas
Copy link
Member

JavaScript integer numbers are coerced to floating-point values implicitly if the value goes over the Number.MAX_SAFE_INTEGER. A metric instrument with ValueType.INT should stick to integer values if possible.

Spec: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/supplementary-guidelines.md#integer

@legendecas legendecas added this to the Metrics GA milestone Jun 2, 2022
@legendecas legendecas changed the title INT metric instrument integer overflow ValueType.INT metric instrument integer overflow Jun 2, 2022
@dyladan dyladan added the up-for-grabs Good for taking. Extra help will be provided by maintainers label Jun 8, 2022
@dyladan
Copy link
Member

dyladan commented Jul 25, 2022

JS can only safely represent numbers between -(2^(53) - 1) and 2^(53) - 1. Outside this range, adding and subtracting integers will have an imprecise effect on the value. Small additions and subtractions may even have no effect at all.

Options:

  1. Overflow at 2^53 to -2^53 - 1 (stay within JS safe range)
  2. Reset at 2^53 to zero
  3. Overflow at 2^31 to -(2^31 - 1) (emulate 32-bit signed int)

The proto defines numeric integer metric values as signed 64-bit ints (sfixed64 in proto3). I think option 1 is most likely to be incorrectly handled by a backend (may be assumed to be a 64-bit int once it goes above 2^53 - 1). Option 2 is likely to be detected by most backends as a reset and I think should be pretty safe. Option 3 may be correctly detected as an overflow by some backends if they have overflow detection support and may be detected as a reset by others. I'm not sure which backends would properly handle a reset to a negative number.

Personally I think option (2) is the safest, but (3) is also reasonable.

@dyladan
Copy link
Member

dyladan commented Jul 25, 2022

We can also use BigInt which is supported in major browsers and node since 10.5

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt
https://node.green/#ES2020-features-BigInt
https://caniuse.com/bigint

When converting to proto we would need to be sure to properly overflow numbers larger than 2^64 - 1 using BigInt.asIntN(64, value).

@vmarchaud
Copy link
Member

We can also use BigInt which is supported in major browsers and node since 10.5

I'm not sure if its related but we had an issue to support int64 here: #104, maybe can we tackle both a the same time ? Just serializing it in proto differently depending on the value ?

@dyladan
Copy link
Member

dyladan commented Jul 28, 2022

We can also use BigInt which is supported in major browsers and node since 10.5

I'm not sure if its related but we had an issue to support int64 here: #104, maybe can we tackle both a the same time ? Just serializing it in proto differently depending on the value ?

Now that we don't have 8,10,12 we can use BigInt there too

@dyladan
Copy link
Member

dyladan commented Jul 28, 2022

How do we want to handle this? I don't want to just use BigInt for everything since numbers larger than 2^53 aren't the most common and there is a significant performance hit vs using number because BigInt can't take advantage of built in hardware arithmetic.

Here is my proposal:

/cc @open-telemetry/javascript-maintainers especially @legendecas

Int Gauge

Accept number or BigInt in the API. Do not do any conversions, and exporters should handle both cases. No arithmetic is needed in gauges and the user should expect to get out the precision they put in.

Int Sum

Accept number or BigInt in the API. If the user provides BigInt store internal representation as BigInt. If the user provides number then continue to use number until reaching 2^53 and convert to BigInt. Once the internal representation has been converted to BigInt never convert back to number. Above 2^63 (maximum signed 64-bit int) wrap to -2^63.

Histograms and float gauge/sum

No need for special handling as the API for histograms specifies float input. Bucket counts should never reach above MAX_SAFE_INTEGER.

@dyladan
Copy link
Member

dyladan commented Jul 28, 2022

Once we agree on a strategy i'm happy to work on this one

@dyladan dyladan self-assigned this Jul 28, 2022
@legendecas
Copy link
Member Author

I would be inclined not to introduce implicit conversion between numbers and bigints. Conversion from numbers to bigints is easy, but the interoperability between numbers and bigints is not trivial as all the operands must be a bigint. Lots of places must be aware of this implication and it can complicate the common path:

I'd suggest introducing a series of APIs explicitly involved with the bigints, like meter.createBigIntCounter.

@legendecas
Copy link
Member Author

The proto defines numeric integer metric values as signed 64-bit ints (sfixed64 in proto3). I think option 1 is most likely to be incorrectly handled by a backend (may be assumed to be a 64-bit int once it goes above 2^53 - 1).

@dyladan Would you mind clarifying why backends that support signed 64-bit ints can not handle resets from 2^53 - 1 to -(2^53 - 1)?

Either way, as the overflow has to be manually reset, resetting 2^53 - 1 to -(2^53 - 1) or 0 both sound good to me. Overflowing to the lower bound of the safe integer range might be a common choice for the other language implementations.

@vmarchaud
Copy link
Member

I'd suggest introducing a series of APIs explicitly involved with the bigints, like meter.createBigIntCounter.

Generally other languages have dedicated method for each type so i would be in favor to have different methods too

@dyladan
Copy link
Member

dyladan commented Jul 29, 2022

The proto defines numeric integer metric values as signed 64-bit ints (sfixed64 in proto3). I think option 1 is most likely to be incorrectly handled by a backend (may be assumed to be a 64-bit int once it goes above 2^53 - 1).

@dyladan Would you mind clarifying why backends that support signed 64-bit ints can not handle resets from 2^53 - 1 to -(2^53 - 1)?

It may be interpreted as a 64 bit overflow not a 53 bit overflow as 53 bits is a weird time for integers to overflow.

I'd suggest introducing a series of APIs explicitly involved with the bigints, like meter.createBigIntCounter.

I like this idea. We actually don't need to overflow the regular counter at all if we do this because we can simply document that any int instrument that goes larger than Number.MAX_SAFE_INTEGER should use the bigint version

@dyladan
Copy link
Member

dyladan commented Aug 2, 2022

Would it be better to have a separate method or to have a ValueType.BIGINT?

Option 1

meter.createBigIntCounter("my_big_int_counter", { description, unit });

Option 2

meter.createCounter("my_big_int_counter", { description, unit, valueType: ValueType.BIG_INT });

@vmarchaud
Copy link
Member

I wonder which one would be preferable for for platform that doesn't support bigint ? specially old browser version or edge platform that use old engines ?

I personally prefer option 1 though

@dyladan
Copy link
Member

dyladan commented Aug 3, 2022

I wonder which one would be preferable for for platform that doesn't support bigint

I guess it depends what you want your fallback to look like. Would option 1 fail or return a regular int counter on platforms with no BigInt? If it returns a regular int counter then we are back to the complexity of an API that has to support multiple input types in order to support the fallback behavior.

For both options, we need to decide if we're going to allow BigInt as an input, or just use it as the internal representation. For me, option 1 means we have to allow it as an input. Option 2 keeps the same number-based API, but just changes the internal representation which makes the fallback easy (just use number). Using the number-based API, you would not be able to safely do a single add of more than 2^53, but that seems like a minor issue.

@vmarchaud
Copy link
Member

I guess it depends what you want your fallback to look like. Would option 1 fail or return a regular int counter on platforms with no BigInt? If it returns a regular int counter then we are back to the complexity of an API that has to support multiple input types in order to support the fallback behavior.

I would have suggested to make new functions for bigint that are optional in the API but its not great as UX

@legendecas
Copy link
Member Author

Would option 1 fail or return a regular int counter on platforms with no BigInt?

I would prefer a more explicit and fail-early approach for feature absence.

For both options, we need to decide if we're going to allow BigInt as an input, or just use it as the internal representation. For me, option 1 means we have to allow it as an input.

For both options, my idea is that their internal representation is what their name/flag suggests. I don't have a strong opinion on allowing numbers on bigint counters yet, but I'd find a naive start with no implicit conversion would be great.

@dyladan
Copy link
Member

dyladan commented Aug 3, 2022

Would option 1 fail or return a regular int counter on platforms with no BigInt?

I would prefer a more explicit and fail-early approach for feature absence.

So just use BigInt and it might throw. User is responsible for doing any feature detection?

@legendecas
Copy link
Member Author

legendecas commented Aug 3, 2022

Would option 1 fail or return a regular int counter on platforms with no BigInt?

I would prefer a more explicit and fail-early approach for feature absence.

So just use BigInt and it might throw. User is responsible for doing any feature detection?

Yeah, as we don't provide any polyfills, I don't think we can do many things without BigInt support.

@aabmass
Copy link
Member

aabmass commented Aug 3, 2022

I did a little digging because I was curious how you would pass BigInt to the protocol buffers implementation. It looks like we are using protobuf.js implementation and the typescript stubs it generates look like this:

/** Properties of a NumberDataPoint. */
interface INumberDataPoint {
    // ....
    /** NumberDataPoint asDouble */
    asDouble?: (number|null);

    /** NumberDataPoint asInt */
    asInt?: (number|Long|null);
    // ....
}

the asInt variant accepts Long, which after doing a little digging seems to be from long.js, which is being pulled in as one of protobuf.js dependencies anyway.

  1. Am I missing some way that protobuf.js already accepts BigInts? I couldn't verify if you can just pass a string. For OTLP JSON, I know you should use a decimal string.
  2. I'm wondering if we've considered using Long instead of BigInt? Since we are limited to 64 bits in OTLP, we don't really need BigInt beyond that. Not sure how these two compare performance wise, but having BigInt as a builtin is certainly a pro. OTOH, long.js will be pulled in for anyone using the OTLP protobuf exporter anyway.

@dyladan
Copy link
Member

dyladan commented Aug 4, 2022

which after doing a little digging seems to be from long.js

Yes that is what that is from.

, which is being pulled in as one of protobuf.js dependencies anyway.

They actually aren't bundling long.js. You can look at the bundler and see that it is external which means it isn't included. I was avoiding using long.js because I didn't want to include another dependency (and not a particularly small one).

  1. Am I missing some way that protobuf.js already accepts BigInts? I couldn't verify if you can just pass a string. For OTLP JSON, I know you should use a decimal string.

You can pass a string but it is converted to a number (or Long if you have included that dependency) for internal representation. There is also a PR open to add BigInt support to protobuf.js protobufjs/protobuf.js#1557

  1. I'm wondering if we've considered using Long instead of BigInt? Since we are limited to 64 bits in OTLP, we don't really need BigInt beyond that. Not sure how these two compare performance wise, but having BigInt as a builtin is certainly a pro.

I did consider it, but decided that I would rather use a real builtin and wait for proper support. Using an external module for 64-bit support means we have to support it for a very long time when the JS ecosystem is clearly moving in another direction.

OTOH, long.js will be pulled in for anyone using the OTLP protobuf exporter anyway.

See above comments about bundler but this actually isn't true.

I was going to convert BigInts to string before providing them to the protobufjs and let protobufjs interpret them in whatever way it knows how. This would allow the end-user to decide if they wanted to use the Long.js package. For any user that didn't use Long.js, they would still get accurate internal representation within OTel with proper overflow, and just have a lossy output until protobufjs implements BigInt support. When protobufjs does update to support BigInts we would remove our string conversion.

@dyladan
Copy link
Member

dyladan commented Aug 4, 2022

Would option 1 fail or return a regular int counter on platforms with no BigInt?

I would prefer a more explicit and fail-early approach for feature absence.

So just use BigInt and it might throw. User is responsible for doing any feature detection?

Yeah, as we don't provide any polyfills, I don't think we can do many things without BigInt support.

We wouldn't really need to provide polyfills. We could just accept number | BigInt for the counter APIs and do a simple typeof inp === 'number' check. If that check fails we assume we have a BigInt. For our own internal representation we would do a feature detection and fall back on number if BigInt was not available.

@dyladan dyladan removed the up-for-grabs Good for taking. Extra help will be provided by maintainers label Aug 15, 2022
@dyladan dyladan modified the milestones: Metrics GA, Metrics After GA Aug 31, 2022
@legendecas legendecas added the sdk:metrics Issues and PRs related to the Metrics SDK label Sep 1, 2022
@github-actions
Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@ryanleecode
Copy link

Whats the solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request never-stale sdk:metrics Issues and PRs related to the Metrics SDK
Projects
None yet
Development

No branches or pull requests

6 participants