Provide a version of Stream.Read that reads as much as possible #16598
Comments
+1 for |
While .NET Stream.Read does have the behavior you mention, the WinRT IInputStream.ReadAsync method actually takes an enum parameter that can vary the behavior from "return as quick as possible" to "read as much as possible filling the buffer": https://msdn.microsoft.com/en-us/library/hh438388(v=vs.85).aspx InputStreamOptions:
So, this is an example of an API pattern that might be useful in some way for .NET. |
@davidsh Interesting. Though I think that having a guarantee that everything that could be read was read is more useful than what |
@svick though we can provide that guarantee to ourselves right now (and already have the greater guarantee of a given number of transcoded characters when going through a StreamReader), but not give the options @davidsh mentions. I'd rather gain that new ability than an extra wrapper method, if I could pick only one. |
We see two options:
First one:
public int ReadRequestedCount(byte[] buffer, int offset, int count);
public Task<int> ReadRequestedCountAsync(byte[] buffer, int offset, int count);
public Task<int> ReadRequestedCountAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken);
Second one:
enum StreamOptions
{
    Partial = 0,
    RequestedCount = 1
}
public virtual int Read(byte[] buffer, int offset, int count, StreamOptions options);
public Task<int> ReadAsync(byte[] buffer, int offset, int count, StreamOptions options);
public virtual Task<int> ReadAsync(byte[] buffer, int offset, int count, StreamOptions options, CancellationToken cancellationToken);
The nice thing about the second one is that we can add more policy if we need to. Secondly, do we need to do anything for |
Stylistically I prefer the second of the two options, but I'm a bit of a sucker for enums.
I'd argue that our existing Write is already adequate because our default behavior is to write all bytes; if you want a partial write you could just decrease the count you pass to the function. The only way I can see a WritePartial function providing value is if you wanted to break up a Write of X bytes into X/Y smaller writes. That alone doesn't seem like a common enough scenario to warrant additional API (unlike the ReadAll function which I'm all +1 for). |
Triage: seems valuable, let's finish addressing the API review feedback. |
As per my issue where this issue was just mentioned - please also include |
Also, there is prior art with the name |
|
I think it's more complicated than just "partial" or "all". What I usually want is a read operation that will guarantee me at least N bytes (or indicate that EOF happened), but also give me additional available data (up to my buffer size of course). N here is often small -- e.g. for HTTP2 the frame header is 9 bytes; for TLS it's 5. Sometimes I may even have a part of the frame header buffered already, meaning N could be even smaller. But I still want to get as much data in a single Read as I can. So, something like this: public int ReadAtLeast(Span<byte> buffer, int bytesToRead); And similar for async. If you really want to just fill the buffer completely, pass buffer.Length for bytesToRead. Some questions to consider: (1) Virtual or not? I don't see any reason this needs to be virtual.
|
I think that should be something like the code below:
public static void ReadAtLeast(this Stream stream, Span<byte> buffer)
{
    ReadAtLeast(stream, buffer, buffer.Length);
}
public static void ReadAtLeast(this Stream stream, Span<byte> buffer, int bytesToRead)
{
    int totalRead = 0;
    while (totalRead < bytesToRead)
    {
        // Offer the rest of the buffer so extra available data is accepted too.
        int read = stream.Read(buffer.Slice(totalRead));
        if (read == 0)
            throw new EndOfStreamException();
        totalRead += read;
    }
}
public static ValueTask ReadAtLeastAsync(this Stream stream, Memory<byte> buffer, CancellationToken cancellationToken = default)
{
    // Not marked async: forwards the ValueTask produced by the overload below.
    return ReadAtLeastAsync(stream, buffer, buffer.Length, cancellationToken);
}
public static async ValueTask ReadAtLeastAsync(this Stream stream, Memory<byte> buffer, int bytesToRead, CancellationToken cancellationToken = default)
{
    int totalRead = 0;
    while (totalRead < bytesToRead)
    {
        int read = await stream.ReadAsync(buffer.Slice(totalRead), cancellationToken).ConfigureAwait(false);
        if (read == 0)
            throw new EndOfStreamException();
        totalRead += read;
    }
} |
Can I require a revisit on this? The |
ReadAtLeast would align with the method we added to |
I've updated the top post to the current proposal and marked this |
After lots of discussion we arrived at this:
namespace System.IO;
public partial class Stream
{
    public void ReadExactly(byte[] buffer, int offset, int count);
    public void ReadExactly(Span<byte> buffer);
    public int ReadAtLeast(Span<byte> buffer, int minimumBytes, bool throwOnEndOfStream = true);
    public ValueTask<int> ReadAtLeastAsync(Memory<byte> buffer, int minimumBytes, bool throwOnEndOfStream = true, CancellationToken cancellationToken = default);
    public ValueTask ReadExactlyAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken = default);
    public ValueTask ReadExactlyAsync(Memory<byte> buffer, CancellationToken cancellationToken = default);
} |
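As a rough illustration of how the approved methods might be used (assuming .NET 7 or later, where they shipped), here is a minimal sketch. The FrameReader class and its 4-byte big-endian length prefix are hypothetical and exist only for this example. The 9-byte minimum echoes the HTTP/2 frame header size mentioned earlier in the thread.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper for illustration only; not part of the proposal.
public static class FrameReader
{
    // Reads one frame: a 4-byte big-endian length followed by that many payload bytes.
    public static byte[] ReadFrame(Stream stream)
    {
        Span<byte> header = stackalloc byte[4];
        // ReadExactly fills the whole span or throws EndOfStreamException.
        stream.ReadExactly(header);

        int length = BinaryPrimitives.ReadInt32BigEndian(header);
        if (length < 0)
            throw new InvalidDataException("Negative frame length.");

        byte[] payload = new byte[length];
        stream.ReadExactly(payload);
        return payload;
    }

    // Reads at least 9 bytes (if the stream has them) and accepts any extra
    // data that fits in the buffer. Returns the number of bytes read.
    public static int ReadHeaderAndMore(Stream stream, byte[] buffer)
    {
        return stream.ReadAtLeast(buffer, minimumBytes: 9, throwOnEndOfStream: false);
    }
}
```

With throwOnEndOfStream set to false, a return value below 9 means the stream ended before a full header arrived.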
@eerhardt, I wonder if we should consider an analyzer/fixer here as well to go along with the new methods. Flag calls like:
stream.Read(buffer, 0, buffer.Length);
where the return value isn't checked, and recommend it be changed to a call to ReadExactly. Or cases even where the return value is checked but required to be the same as what was requested:
if (stream.Read(buffer, 0, buffer.Length) != buffer.Length) throw ...;
and similarly suggest it use ReadExactly. And flag calls like:
stream.Read(buffer, 0, count);
and recommend it be changed to:
stream.ReadAtLeast(buffer, count); |
Is there a reason to add new byte[] overloads and not just prefer Memory<byte>? |
There's about an hour of discussion on that in the video. |
I'll try to get through it, but IMO I'm not seeing the obvious reason for this in a .NET 7+ world. Stream is a monster already... Why are these non-virtual? |
There is discussion on that in the video as well. But see also the top post of this issue. |
@stephentoub - I opened #69159 to suggest this analyzer. We can discuss it there. However, I wonder if addressing #34098 would be more beneficial than that analyzer. |
Will there be a way to know how many bytes have been read if If not, maybe the way to go is to create a new
public class EndOfStreamException2 : EndOfStreamException
{
    public int BytesRead { get; init; }
}
And make the user catch this specific type. The other non unpleasant option is the position of the stream being reset (and this is so inconvenient and impossible in some streams). |
If you care about the number of bytes read in that case, you wouldn't use ReadExactly. Instead you'd use ReadAtLeast, e.g.
int numRead = stream.ReadAtLeast(buffer, buffer.Length, false);
if (numRead < buffer.Length)
{
    HandleEof(numRead);
} |
Gotcha! Thanks @stephentoub . |
Now I regret not being a part of this discussion ;) |
It's not too late to provide your feedback. We have until |
I don't think these methods can be considered pure? They change the state of the stream (at the very least the current position, potentially more in custom stream classes) so you couldn't call them repeatedly with the same outcome. |
That issue isn't about purity any more. It's about annotating methods for which the return value shouldn't be ignored. |
The title of #34098 should probably be updated to not reference "Pure" and the top post should be updated for the latest proposal. That issue has transitioned from "pure" to "do not ignore return value". See #34098 (comment). |
Now I wonder if we should add ReadExactlyAsync to |
* Add Stream ReadAtLeast and ReadExactly Adds methods to Stream to read at least a minimum amount of bytes, or a full buffer, of data from the stream. ReadAtLeast allows for the caller to specify whether an exception should be thrown or not on the end of the stream. Make use of the new methods where appropriate in net7.0. Fix #16598 * Add ReadAtLeast and ReadExactly unit tests * Add XML docs to the new APIs * Preserve behavior in StreamReader.FillBuffer when passed 0. * Handle ReadExactly with an empty buffer, and ReadAtLeast with minimumBytes == 0. Both of these cases are a no-op. No exception is thrown. A read won't be issued to the underlying stream. They won't block until data is available.
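The zero-length cases in the last bullet can be illustrated with a small sketch, assuming .NET 7 or later. The ZeroLengthReads class is a hypothetical name used only here. The stream is empty, so any real read would hit end of stream, yet neither call throws.

```csharp
using System;
using System.IO;

// Hypothetical demo class; only ReadExactly and ReadAtLeast are framework APIs.
public static class ZeroLengthReads
{
    public static int Demo()
    {
        using var empty = new MemoryStream(); // already at end of stream

        // Empty buffer: a no-op, so no EndOfStreamException is thrown.
        empty.ReadExactly(Span<byte>.Empty);

        // minimumBytes == 0: also a no-op, even with throwOnEndOfStream left at true.
        return empty.ReadAtLeast(new byte[16], minimumBytes: 0);
    }
}
```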
Background and motivation
One of the most common mistakes when using Stream.Read() is that the programmer doesn't realize that Read() may return less data than what is available in the Stream and less data than the buffer being passed in. And even for programmers who are aware of this, having to write the same loop every single time they want to read from a Stream is annoying.
With the .NET 6 breaking change: Partial and zero-byte reads in DeflateStream, GZipStream, and CryptoStream, it has become apparent that a Stream.Read API that ensures you get at least n bytes read is valuable.
API Proposal
ReadAtLeast will return the number of bytes read into buffer.
When throwOnEndOfStream == true, the return value will always be minimumBytes <= bytesRead <= buffer.Length. If the end of the stream is detected, an EndOfStreamException will be thrown.
When throwOnEndOfStream == false, the return value will always be bytesRead <= buffer.Length. Callers can check for end of the stream by checking bytesRead < minimumBytes.
API Usage
Example 1
Example 2
Alternative Designs
We could add a convenience wrapper (ReadAll or Fill) that doesn't take int minimumBytes, and uses buffer.Length as the minimumBytes.
… buffer.Length
… ReadAtLeast and ReadAll) wouldn't match. I can't think of a decent common name that would work for both operations.
There is a question of whether these methods should be virtual or not. From scanning the Stream implementations in dotnet/runtime, I don't see anything special a Stream could do for ReadAtLeast. They already get passed the buffer length, if they want a hint of how much data is being requested.
… ReadAtLeast is in PipeReader.AsStream()'s implementation. Since PipeReader has a ReadAtLeastAsync API, the PipeReaderStream could override ReadAtLeastAsync and forward the call directly to PipeReader.ReadAtLeastAsync. However, there is no synchronous API on PipeReader, so it would just be implemented for the async API.
Original Proposal
I believe one of the most common mistakes when using Stream.Read() is that the programmer doesn't realize that Read() may return less data than what is available in the Stream. And even for programmers who are aware of this, having to write the same loop every single time they want to read from a Stream is annoying.
So, my proposal is to add the following methods to Stream (mirroring the existing Read methods):
Each ReadAll method would call the corresponding Read method in a loop, until the buffer was filled or until Read returned 0.
Questions:
… ReadAll?
… BinaryReader.ReadBytes(). Why doesn't it have async version?
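The loop described in the original proposal (call Read repeatedly until the buffer is filled or Read returns 0) could look roughly like the sketch below. The ReadAll extension method and its containing class are hypothetical. The name comes from the proposal, but no such method was added to Stream, and the approved design uses ReadExactly and ReadAtLeast instead.

```csharp
using System.IO;

// Hypothetical sketch of the originally proposed ReadAll; not a framework API.
public static class StreamReadAllSketch
{
    // Reads until 'count' bytes are stored in 'buffer' or the stream ends.
    // Returns the number of bytes actually read, which may be less than 'count'.
    public static int ReadAll(this Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            // Read may return fewer bytes than requested, so keep asking.
            int read = stream.Read(buffer, offset + total, count - total);
            if (read == 0)
                break; // end of stream
            total += read;
        }
        return total;
    }
}
```

Unlike ReadExactly, this version reports a short read through its return value rather than by throwing.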