[release/9.0] Ensure that integer parsing correctly handles non-zero fractional data #106700
Backport of #106506 to release/9.0
/cc @tannergooding
Customer Impact
Reported via #94971
Consumers of these parsing APIs for any built-in integer type who use `NumberStyles.Number` may see invalid inputs parsed without exception when the call should instead have thrown an `OverflowException` or returned `false` from the `TryParse` methods. `byte`, `sbyte`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `nint`, `nuint`, `Int128`, and `UInt128` all use this one shared generic code path.

Because the buffer size and the positioning of non-zero trailing digits both affect the broken logic, customers currently experience inconsistent behavior from these APIs for different values. Some invalid strings are erroneously parsed with success while others are rejected as expected. With this fix, we return to consistent behavior regardless of the positioning of the non-zero trailing digits relative to the string length and buffer size for the target type.
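As a rough illustration of the inconsistency described above, the following sketch calls `int.TryParse` with `NumberStyles.Number` on a few inputs. The class name `FractionalIntegerParsing` and the helper `TryParseInt32` are hypothetical names chosen for this example, and the invariant culture is used only so that `.` is always the decimal separator.

```csharp
using System;
using System.Globalization;

public static class FractionalIntegerParsing
{
    // Hypothetical helper (not a .NET API): parses with NumberStyles.Number
    // and the invariant culture so '.' is the decimal separator.
    public static bool TryParseInt32(string text, out int value) =>
        int.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out value);

    public static void Main()
    {
        // "123.0" has an all-zero fraction and is valid.
        // "123.4" has a non-zero fraction and is rejected.
        // "123456789.01" should also be rejected; per the description above,
        // builds with the regression accept it because the non-zero digit
        // falls past the tracked digits.
        string[] inputs = { "123", "123.0", "123.4", "123456789.01" };

        foreach (string input in inputs)
        {
            bool ok = TryParseInt32(input, out int value);
            string result = ok ? value.ToString(CultureInfo.InvariantCulture) : "rejected";
            Console.WriteLine($"{input} -> {result}");
        }
    }
}
```

The last input is the interesting one: whether it is accepted depends on whether the runtime in use contains this fix.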
Regression
When using the `NumberStyles.Number` parsing option, .NET allows integer types to parse strings that contain fractional data, provided that the fractional data is all zero. For example, `1.0` is valid, as is `1.00000`, but `1.01` is invalid.

In .NET 8, part of the parsing logic was refactored to allow better sharing of the logic across multiple types and so that UTF-8 and UTF-16 could be supported via the same algorithm without needing to duplicate several thousand lines of code. As part of this, an edge case was missed where an input string that had non-zero fractional data would not be reported as an error if the invalid fractional data appeared after the end of the `NumberBuffer`.

To elaborate, `Int32` can represent at most 10 significant digits, so we allocate an 11-digit buffer to hold them and the null terminator. We parse the input string and track any significant digits in this buffer. Any significant digits beyond the end of this buffer continue scaling the exponent, allowing us to catch numbers that are "too large". We also track a property, `HasNonZeroTail`, that allows us to determine whether any fractional data existed that was not `0`. Doing this keeps the memory footprint low while improving performance.

The issue was that the refactoring failed to check `number.HasNonZeroTail` for the integer case. This meant that we would continue failing if there were fewer than 10 significant digits and non-zero fractional data. For example, `123.4` would fail because it has a total of 4 significant digits. However, `123456789.01` would incorrectly succeed because the first 10 significant digits (`123456789.0`) were valid, being a whole integer followed by a fractional portion that is zero. The invalid data appeared after the end of the explicitly tracked buffer.

Testing
Explicit tests covering various edge cases for all the built-in integer types were added. They cover inputs both within and outside the buffer range, particularly around the respective `MinValue` and `MaxValue` of each type.
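To make the "within and outside the buffer range" distinction concrete, here is a deliberately simplified model of the digit buffer for `Int32`. It is not the runtime's code: `NumberBufferModel`, `Accepts`, and the `checkTail` flag are hypothetical names, the model stores 10 digits with no null terminator, and it assumes unsigned ASCII input with no whitespace, group separators, or leading zeros. Passing `checkTail: false` mimics the regression and `checkTail: true` mimics the fix.

```csharp
using System;

public static class NumberBufferModel
{
    // Int32 can represent at most 10 significant digits.
    private const int MaxDigits = 10;

    // Returns true when this simplified model accepts `text` as an Int32.
    public static bool Accepts(string text, bool checkTail)
    {
        char[] digits = new char[MaxDigits];
        int count = 0;               // digits stored in the buffer
        int scale = 0;               // number of digits before the decimal point
        bool seenPoint = false;
        bool hasNonZeroTail = false; // non-zero digit seen past the buffer

        foreach (char c in text)
        {
            if (c == '.')
            {
                if (seenPoint) return false;
                seenPoint = true;
                continue;
            }
            if (c < '0' || c > '9') return false;

            if (!seenPoint) scale++;
            if (count < MaxDigits) digits[count++] = c;
            else if (c != '0') hasNonZeroTail = true;
        }

        if (count == 0) return false;
        if (scale > MaxDigits) return false; // more integer digits than Int32 can hold

        // Fractional digits that fit in the buffer must all be zero.
        for (int i = scale; i < count; i++)
        {
            if (digits[i] != '0') return false;
        }

        // The check that the description says was missing for integers.
        if (checkTail && hasNonZeroTail) return false;

        long value = 0;
        for (int i = 0; i < scale; i++) value = value * 10 + (digits[i] - '0');
        return value <= int.MaxValue;
    }

    public static void Main()
    {
        string[] inputs = { "123.0", "123.4", "123456789.01", "1234567890.5", "12345678901" };
        foreach (string input in inputs)
        {
            Console.WriteLine($"{input}: without tail check = {Accepts(input, false)}, with tail check = {Accepts(input, true)}");
        }
    }
}
```

In this model, `123.4` is rejected either way because its non-zero fractional digit sits inside the buffer, while `123456789.01` and `1234567890.5` are accepted only when the tail check is skipped.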
Risk
Medium.
The actual fix here is relatively simple, well understood, and directly impacts a less common code path (parsing using an explicit number style to allow integer inputs to specify fractional data that is `0`).

However, this does touch core shared code that most applications in existence use (parsing for the primitive integer types such as `int` or `long`) and which has been known to be somewhat prone to user error in the face of globalization, because some cultures use `.` as the decimal separator and `,` as the number group separator while others invert this and use `,` as the decimal separator. The fix remains correct and does the right thing in the face of such culture-specific differences. However, due to the semi-regular user error in this space, it has a higher chance of causing friction when the patch does go out, and this is worth taking into consideration as part of the risk-to-benefit analysis.
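The separator confusion mentioned above can be sketched as follows. To avoid depending on installed culture data, the example builds a `NumberFormatInfo` by hand. `SeparatorConventions`, `CommaDecimal`, and `Parse` are hypothetical names for this illustration, not part of the fix.

```csharp
using System;
using System.Globalization;

public static class SeparatorConventions
{
    // Hypothetical format for illustration: ',' is the decimal separator
    // and '.' is the group separator, as in a number of cultures.
    public static readonly NumberFormatInfo CommaDecimal = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Returns the parsed value, or null when the input is rejected.
    public static int? Parse(string text, IFormatProvider provider) =>
        int.TryParse(text, NumberStyles.Number, provider, out int value) ? value : (int?)null;

    public static void Main()
    {
        foreach (string text in new[] { "1,000", "1,234" })
        {
            int? invariant = Parse(text, NumberFormatInfo.InvariantInfo);
            int? commaDecimal = Parse(text, CommaDecimal);
            Console.WriteLine(
                $"{text}: invariant = {invariant?.ToString(CultureInfo.InvariantCulture) ?? "rejected"}, " +
                $"comma-decimal = {commaDecimal?.ToString(CultureInfo.InvariantCulture) ?? "rejected"}");
        }
    }
}
```

The same text can therefore mean one thousand, one, or an error depending on the format provider: `1,000` is a grouped integer under the invariant format but `1` with a zero fraction under the comma-decimal format, where `1,234` is rejected for its non-zero fraction.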