Fix StreamReader EOF handling and improve perf #69888

Merged (3 commits) on Nov 7, 2022

GrabYourPitchforks
Member

This fixes a bug in StreamReader where it doesn't properly pass flush: true to the backing Decoder instance when EOF is reached. This could result in cases where partial data in the decoder buffer is never processed.

For a concrete example, consider the below sample.

using System;
using System.IO;
using System.Text;

var reader = new StreamReader(
    new MemoryStream(new byte[] { 0xF0 }),
    new UTF8Encoding(
        encoderShouldEmitUTF8Identifier: false,
        throwOnInvalidBytes: true));
Console.WriteLine(reader.ReadToEnd());

This prints an empty string to the console, even though it should throw DecoderFallbackException, because the caller explicitly asked to reject invalid data.
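The effect of the flush flag can be seen on a bare Decoder, without StreamReader in the picture. The following is only a minimal sketch: the helper name DecodeLoneLeadByte is made up for illustration and is not part of this PR. It feeds the same lone 0xF0 byte to a throwing UTF-8 decoder, once with flush: false and once with flush: true.

```csharp
using System;
using System.Text;

// Hypothetical helper (not from the PR): decode a single 0xF0 byte, which is
// the lead byte of a 4-byte UTF-8 sequence and therefore incomplete on its own.
static string DecodeLoneLeadByte(bool flush)
{
    Decoder decoder = new UTF8Encoding(
        encoderShouldEmitUTF8Identifier: false,
        throwOnInvalidBytes: true).GetDecoder();

    byte[] input = { 0xF0 };
    char[] output = new char[4];

    try
    {
        // With flush: false the decoder keeps the partial sequence in its
        // internal state and waits for more bytes. With flush: true it is
        // told that no more bytes are coming, so it must report the error.
        int written = decoder.GetChars(input, 0, input.Length, output, 0, flush);
        return $"decoded {written} chars";
    }
    catch (DecoderFallbackException)
    {
        return "threw DecoderFallbackException";
    }
}

Console.WriteLine($"flush: false -> {DecodeLoneLeadByte(flush: false)}");
Console.WriteLine($"flush: true  -> {DecodeLoneLeadByte(flush: true)}");
```

If the final call at EOF never passes flush: true, the decoder behaves like the first case and the dangling byte is silently dropped, which matches the empty output described above.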

I also took this opportunity to tighten the ReadLine and ReadLineAsync methods, including using StringBuilderCache. This shouldn't blow the cache because we expect lines to be 80 - 100 chars at the high end, which is well within what StringBuilderCache can handle.

This results in an approx. 50% throughput increase in ReadLine and ReadLineAsync, as shown in the below benchmarks. The benchmarks also show a decrease in overall StringBuilder allocations.

| Method | Job | Toolchain | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| GetLineCount | Job-QTEUMB | main | 404.5 μs | 4.27 μs | 4.19 μs | 1.00 | 0.00 | 105.4688 | 431 KB |
| GetLineCount | Job-OSITYF | sr | 263.5 μs | 5.15 μs | 7.70 μs | 0.66 | 0.02 | 93.7500 | 384 KB |
| GetLineCountAsync | Job-QTEUMB | main | 645.5 μs | 12.58 μs | 11.15 μs | 1.00 | 0.00 | 169.9219 | 694 KB |
| GetLineCountAsync | Job-OSITYF | sr | 421.6 μs | 8.35 μs | 12.24 μs | 0.66 | 0.02 | 158.2031 | 647 KB |

// In the below benchmarks, _ms is a MemoryStream whose contents have been initialized from:
// https://www.gutenberg.org/files/11/11-0.txt

[Benchmark]
public int GetLineCount()
{
    _ms.Position = 0;
    StreamReader reader = new StreamReader(_ms);

    int lineCount = 0;
    while (reader.ReadLine() != null) { lineCount++; }
    return lineCount;
}

[Benchmark]
public int GetLineCountAsync()
{
    _ms.Position = 0;
    StreamReader reader = new StreamReader(_ms);

    int lineCount = 0;
    while (reader.ReadLineAsync().GetAwaiter().GetResult() != null) { lineCount++; }
    return lineCount;
}

/cc @dotnet/area-system-io

@ghost
ghost commented May 27, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Author: GrabYourPitchforks
Assignees: -
Labels: area-System.IO

Milestone: 7.0.0

@danmoseley
Member

danmoseley commented Jul 25, 2022

@GrabYourPitchforks do you expect to have time to continue this PR? I see #62552 by @Trapov. Is much work remaining?

@stephentoub how important is it that we get these PRs into .NET 7?

@stephentoub
Member

It'd be valuable to get this PR for .NET 7, but we wouldn't block the release on it 😄

Member

@jozkee jozkee left a comment

LGTM, I will close-reopen it to retrigger CI.

@jozkee jozkee closed this Aug 29, 2022
@jozkee jozkee reopened this Aug 29, 2022
@jozkee
Member

jozkee commented Aug 30, 2022

CI issues are related:

...
  Discovering: System.IO.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.IO.Tests (found 719 of 727 test cases)
  Starting:    System.IO.Tests (parallel test collections = on, max threads = 4)
Process terminated. Assertion failed.
   at System.IO.StreamReader.ReadAsyncInternal(Memory`1 buffer, CancellationToken cancellationToken) in /_/src/libraries/System.Private.CoreLib/src/System/IO/StreamReader.cs:line 1215
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine) in /_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs:line 38
   at System.IO.StreamReader.ReadAsyncInternal(Memory`1 buffer, CancellationToken cancellationToken)
   at System.IO.StreamReader.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken) in /_/src/libraries/System.Private.CoreLib/src/System/IO/StreamReader.cs:line 1080
   at System.IO.Tests.StreamReaderTests.ReadAsync_LongStreamIntoShortBuffer_PerformsFinalFlushCorrectly() in /_/src/libraries/System.IO/tests/StreamReader/StreamReaderTests.cs:line 819
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine) in /_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs:line 38
   at System.IO.Tests.StreamReaderTests.ReadAsync_LongStreamIntoShortBuffer_PerformsFinalFlushCorrectly()
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
...

@stephentoub
Member

@GrabYourPitchforks, I'm planning to take over this PR. Let me know if you'd prefer I not. Thanks.

@adamsitnik adamsitnik modified the milestones: 7.0.0, 8.0.0 Nov 7, 2022
Member

@adamsitnik adamsitnik left a comment

LGTM, thank you both, @GrabYourPitchforks and @stephentoub!

@stephentoub is there any chance you could run these benchmarks and share the results before we hit the merge button?

- Buffer.BlockCopy(_byteBuffer, n, _byteBuffer, 0, _byteLen - n);
+ byte[] byteBuffer = _byteBuffer;
+ _ = byteBuffer.Length; // allow JIT to prove object is not null
+ new ReadOnlySpan<byte>(byteBuffer, n, _byteLen - n).CopyTo(byteBuffer);
Member

Not related to the changes you have made: I can see that once the encoding is detected, this method is called:

// Big Endian Unicode
_encoding = Encoding.BigEndianUnicode;
CompressBuffer(2);

So for the default buffer size and Unicode:

private const int DefaultBufferSize = 1024; // Byte buffer size

we copy 1022 bytes in place.

Why don't we just update _byteLen and _bytePos? I know it would require changing some other parts of the code that rely on the assumption that _bytePos == 0 after the read:

_charLen = _decoder.GetChars(_byteBuffer, 0, _byteLen, _charBuffer, 0, flush: false);

I am asking because this PR improves the performance of DetectEncoding by moving rarely called code out of the hot path, and this looks like another perf improvement we could make while we are at it.
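To make the two options concrete, here is a rough sketch, not taken from the StreamReader source. The helper names DecodeAfterCompacting and DecodeFromOffset, and the local variables, are hypothetical. It decodes a small UTF-16BE buffer that starts with a 2-byte BOM, first by compacting the buffer as CompressBuffer does today, then by decoding from an offset as suggested above.

```csharp
using System;
using System.Text;

// Illustrative data only: UTF-16BE BOM (FE FF) followed by "AB".
byte[] data = { 0xFE, 0xFF, 0x00, 0x41, 0x00, 0x42 };
const int bomLength = 2;

// Option 1 (current shape): shift the remaining bytes to index 0, then decode
// from the start of the buffer. Works on a copy so the input stays untouched.
static string DecodeAfterCompacting(byte[] source, int skip)
{
    byte[] buffer = (byte[])source.Clone();
    int byteLen = buffer.Length;
    new ReadOnlySpan<byte>(buffer, skip, byteLen - skip).CopyTo(buffer);
    byteLen -= skip;

    char[] chars = new char[byteLen];
    int charLen = Encoding.BigEndianUnicode.GetDecoder()
        .GetChars(buffer, 0, byteLen, chars, 0, flush: false);
    return new string(chars, 0, charLen);
}

// Option 2 (the suggestion): leave the bytes where they are and track a start
// position, passing it to the decoder instead of assuming index 0.
static string DecodeFromOffset(byte[] source, int skip)
{
    int bytePos = skip;
    int byteLen = source.Length;

    char[] chars = new char[byteLen - bytePos];
    int charLen = Encoding.BigEndianUnicode.GetDecoder()
        .GetChars(source, bytePos, byteLen - bytePos, chars, 0, flush: false);
    return new string(chars, 0, charLen);
}

Console.WriteLine(DecodeAfterCompacting(data, bomLength));
Console.WriteLine(DecodeFromOffset(data, bomLength));
```

Both paths produce the same text. The second avoids the copy but, as noted above, every caller that assumes the payload starts at index 0 would need to honor the offset.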

Member

@stephentoub just checking you saw this comment

Member

I did. It predates this PR, and my goal was just to land this PR. Anyone is welcome to follow up on the comment.

@stephentoub
Member

> is there any chance you could run these benchmarks and share the results before we hit the merge button?

| Method | Job | Toolchain | LineLengthRange | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadLine | Job-ICUQWW | \main\corerun.exe | [0, 0] | 54.007 us | 0.5088 us | 0.4510 us | 53.872 us | 53.521 us | 55.055 us | 1.00 | 0.00 | 0.4281 | - | 3.27 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [0, 0] | 116.675 us | 2.2660 us | 2.3270 us | 115.754 us | 113.742 us | 121.385 us | 2.17 | 0.04 | 0.4596 | - | 3.27 KB | 1.00 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [0, 1024] | 25.894 us | 0.1901 us | 0.1484 us | 25.897 us | 25.663 us | 26.143 us | 1.00 | 0.00 | 9.9338 | 0.1035 | 61.3 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [0, 1024] | 5.566 us | 0.2926 us | 0.3252 us | 5.510 us | 5.135 us | 6.442 us | 0.22 | 0.01 | 5.8709 | 0.0650 | 35.98 KB | 0.59 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [1, 1] | 75.675 us | 1.5807 us | 1.7570 us | 75.390 us | 73.353 us | 79.385 us | 1.00 | 0.00 | 21.6121 | - | 132.69 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [1, 1] | 112.312 us | 1.4151 us | 1.2545 us | 112.014 us | 110.407 us | 114.133 us | 1.48 | 0.04 | 21.2766 | - | 131.28 KB | 0.99 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [1, 8] | 70.865 us | 1.3965 us | 1.4341 us | 71.037 us | 68.668 us | 73.844 us | 1.00 | 0.00 | 15.3460 | - | 94.88 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [1, 8] | 72.269 us | 1.1682 us | 1.0928 us | 72.799 us | 70.873 us | 73.868 us | 1.02 | 0.03 | 14.9212 | - | 92.05 KB | 0.97 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [9, 32] | 36.804 us | 0.8704 us | 0.8938 us | 36.765 us | 35.670 us | 39.108 us | 1.00 | 0.00 | 8.7243 | - | 54.08 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [9, 32] | 22.755 us | 0.3234 us | 0.3025 us | 22.701 us | 22.397 us | 23.244 us | 0.62 | 0.02 | 8.2014 | 0.0943 | 50.4 KB | 0.93 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [33, 128] | 28.526 us | 0.5235 us | 0.4641 us | 28.618 us | 27.673 us | 29.073 us | 1.00 | 0.00 | 7.2415 | - | 44.88 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [33, 128] | 8.806 us | 0.2594 us | 0.2987 us | 8.739 us | 8.367 us | 9.461 us | 0.31 | 0.01 | 6.4282 | 0.0699 | 39.43 KB | 0.88 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [129, 1024] | 27.006 us | 0.6048 us | 0.6723 us | 26.859 us | 26.141 us | 28.397 us | 1.00 | 0.00 | 9.9338 | 0.1035 | 61.08 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [129, 1024] | 5.799 us | 0.3215 us | 0.3702 us | 5.786 us | 5.207 us | 6.538 us | 0.22 | 0.01 | 5.9695 | 0.0673 | 36.68 KB | 0.60 |
| ReadLine | Job-ICUQWW | \main\corerun.exe | [1025, 2048] | 29.657 us | 0.7368 us | 0.8486 us | 29.414 us | 28.271 us | 31.111 us | 1.00 | 0.00 | 14.7665 | 0.4579 | 90.71 KB | 1.00 |
| ReadLine | Job-TBQMKP | \pr\corerun.exe | [1025, 2048] | 5.721 us | 0.1653 us | 0.1838 us | 5.682 us | 5.476 us | 6.057 us | 0.19 | 0.01 | 6.0626 | 0.0694 | 37.24 KB | 0.41 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [0, 0] | 280.831 us | 5.0815 us | 4.5046 us | 283.446 us | 274.084 us | 285.435 us | 1.00 | 0.00 | 94.3182 | 1.1364 | 579.27 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [0, 0] | 323.730 us | 3.8763 us | 3.6259 us | 325.144 us | 316.879 us | 328.661 us | 1.15 | 0.03 | 93.7500 | - | 579.27 KB | 1.00 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [0, 1024] | 16.556 us | 0.3721 us | 0.3982 us | 16.486 us | 16.131 us | 17.416 us | 1.00 | 0.00 | 10.3261 | 0.2038 | 63.63 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [0, 1024] | 7.596 us | 0.2903 us | 0.3343 us | 7.596 us | 7.129 us | 8.213 us | 0.46 | 0.02 | 6.2294 | 0.1175 | 38.3 KB | 0.60 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [1, 1] | 221.409 us | 3.1384 us | 2.6207 us | 220.859 us | 217.106 us | 226.193 us | 1.00 | 0.00 | 83.9041 | 0.8562 | 516.73 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [1, 1] | 248.871 us | 2.3755 us | 2.2220 us | 248.263 us | 245.998 us | 253.694 us | 1.12 | 0.02 | 83.3333 | 0.9921 | 515.33 KB | 1.00 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [1, 8] | 139.495 us | 3.7694 us | 4.1897 us | 137.139 us | 136.879 us | 148.902 us | 1.00 | 0.00 | 47.0133 | 0.5531 | 288.66 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [1, 8] | 150.359 us | 0.5854 us | 0.5190 us | 150.338 us | 149.494 us | 151.258 us | 1.08 | 0.03 | 46.4286 | 0.5952 | 285.83 KB | 0.99 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [9, 32] | 47.786 us | 0.7042 us | 0.6243 us | 47.467 us | 47.298 us | 49.212 us | 1.00 | 0.00 | 17.0973 | 0.1900 | 105.55 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [9, 32] | 43.736 us | 0.6887 us | 0.7370 us | 43.547 us | 42.842 us | 44.999 us | 0.92 | 0.02 | 16.6185 | 0.1806 | 101.87 KB | 0.97 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [33, 128] | 24.341 us | 0.4823 us | 0.4953 us | 24.217 us | 23.778 us | 25.317 us | 1.00 | 0.00 | 9.6451 | 0.0965 | 59.15 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [33, 128] | 16.727 us | 0.1437 us | 0.1274 us | 16.687 us | 16.596 us | 17.023 us | 0.69 | 0.02 | 8.7162 | 0.0676 | 53.7 KB | 0.91 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [129, 1024] | 16.183 us | 0.2827 us | 0.2903 us | 16.156 us | 15.764 us | 16.829 us | 1.00 | 0.00 | 10.2170 | 0.1892 | 62.98 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [129, 1024] | 7.070 us | 0.0953 us | 0.0891 us | 7.040 us | 6.949 us | 7.232 us | 0.44 | 0.01 | 6.2838 | 0.0845 | 38.58 KB | 0.61 |
| ReadLineAsync | Job-ICUQWW | \main\corerun.exe | [1025, 2048] | 17.619 us | 0.3329 us | 0.2951 us | 17.617 us | 17.201 us | 18.306 us | 1.00 | 0.00 | 14.8942 | 0.5568 | 91.48 KB | 1.00 |
| ReadLineAsync | Job-TBQMKP | \pr\corerun.exe | [1025, 2048] | 7.043 us | 0.1396 us | 0.1371 us | 7.045 us | 6.737 us | 7.275 us | 0.40 | 0.01 | 6.2045 | 0.1138 | 38.02 KB | 0.42 |

@EgorBo
Member

EgorBo commented Nov 10, 2022

@kunalspathak
Member

kunalspathak commented Nov 12, 2022

Windows arm64 improvements: dotnet/perf-autofiling-issues#9689

8 participants