[API Proposal]: Support parsing server-sent events (SSE) #98105
Tagging subscribers to this area: @dotnet/ncl
Issue Details
Background and motivation
SSE is becoming more and more popular, especially with prominent services like OpenAI relying on it for streaming responses. The format is very simple, but it still takes some amount of code to properly handle parsing the SSE format. We should have a built-in helper in either System.Net or System.Formats that takes care of it for the developer (optionally then other higher-level helpers could be layered on top).
API Proposal
namespace System.Formats.Sse;
public struct SseItem<T>
{
public string? Event { get; set; }
public T Data { get; set; }
}
public static class SseParser
{
public static IAsyncEnumerable<SseItem<T>> ParseAsync<T>(Stream sseStream, Func<ReadOnlySpan<byte>, T> itemParser, CancellationToken cancellationToken = default);
}
API Usage
HttpClient client = ...;
using Stream responseStream = await client.GetStreamAsync(...);
await foreach (SseItem<string> item in SseParser.ParseAsync(responseStream, Encoding.UTF8.GetString))
{
Console.WriteLine(item.Data);
}
Alternative Designs
Risks
|
Is the proposal here to have a separate assembly for this or place it in an existing assembly? The way this is designed it seems like you have a preference. Might be good to add some thoughts around why not for the alternatives. What do you think are next steps here? cc @dotnet/ncl @bartonjs |
Ideally I think it would be a separate assembly that just worked in terms of Stream and that had a netstandard2.0 asset, so that it could be used downlevel by AI-related components that need it.
|
First: it's very cool to see a proposal to standardize server-sent events in this manner. It'll be a huge help to have this for consumers like OpenAI, which leans heavily on the pattern for its streamed REST response payloads. With that, I'd suggest the API proposal be expanded to proactively cover the entirety of the spec (as outlined at https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation):
public struct SseItem<T>
{
public string EventName { get; set; } = "message"; // per spec, "message" is the implicit default and null is impossible
public T Data { get; set; }
public string LastEventId { get; set; } // "id" in spec
public TimeSpan ReconnectionInterval { get; set; } // "retry" in spec
}
Notably, the parsing logic then needs to record and propagate
I haven't yet encountered use of the reconnection-related fields ( |
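To make the field handling discussed above concrete, here is a minimal sketch of the event-stream interpretation rules from the WHATWG spec, written against an in-memory string rather than a Stream. The names `SketchEvent` and `SseSketch.Parse` are hypothetical and exist only for this illustration; they are not part of the proposal. The sketch assumes the text has already been decoded from UTF-8, and it stamps the retained last event ID and retry value onto every event, which is one of the designs being debated in this thread rather than a settled choice.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical event shape for illustration only; not the proposed SseItem<T>.
public sealed class SketchEvent
{
    public SketchEvent(string eventType, string data, string lastEventId, TimeSpan? retry)
    {
        EventType = eventType;
        Data = data;
        LastEventId = lastEventId;
        Retry = retry;
    }

    public string EventType { get; }
    public string Data { get; }
    public string LastEventId { get; }
    public TimeSpan? Retry { get; }
}

public static class SseSketch
{
    public static List<SketchEvent> Parse(string text)
    {
        var events = new List<SketchEvent>();
        var data = new StringBuilder();
        string eventType = "";
        string lastEventId = "";   // retained across events, per the spec
        TimeSpan? retry = null;    // retained across events as well

        // The spec strips a single leading byte order mark.
        if (text.StartsWith("\uFEFF", StringComparison.Ordinal))
            text = text.Substring(1);

        // CRLF, LF, and CR are all valid line terminators.
        string[] lines = text.Replace("\r\n", "\n").Replace('\r', '\n').Split('\n');

        // The last element is whatever followed the final terminator; it is
        // unterminated, so it can never complete an event and is skipped.
        for (int i = 0; i < lines.Length - 1; i++)
        {
            string line = lines[i];
            if (line.Length == 0)
            {
                // Blank line: dispatch the pending event unless it has no data.
                if (data.Length > 0)
                {
                    data.Length--; // drop the '\n' appended after the last data line
                    events.Add(new SketchEvent(
                        eventType.Length == 0 ? "message" : eventType,
                        data.ToString(), lastEventId, retry));
                }
                data.Clear();
                eventType = "";
                continue;
            }

            if (line[0] == ':')
                continue; // comment line

            int colon = line.IndexOf(':');
            string field = colon < 0 ? line : line.Substring(0, colon);
            string value = colon < 0 ? "" : line.Substring(colon + 1);
            if (value.StartsWith(" ", StringComparison.Ordinal))
                value = value.Substring(1); // only one leading space is removed

            switch (field)
            {
                case "event":
                    eventType = value;
                    break;
                case "data":
                    data.Append(value).Append('\n');
                    break;
                case "id":
                    if (value.IndexOf('\0') < 0) // ids containing NULL are ignored
                        lastEventId = value;
                    break;
                case "retry":
                    // Digits only; anything else is ignored.
                    if (int.TryParse(value, NumberStyles.None, CultureInfo.InvariantCulture, out int ms))
                        retry = TimeSpan.FromMilliseconds(ms);
                    break;
                // Unknown fields are ignored.
            }
        }

        return events;
    }
}
```

Calling `SseSketch.Parse` on a payload with two complete events and a trailing unterminated line yields two events; the second inherits the `id` set by the first because it does not supply its own.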
Any reason why the envelope type is not a readonly struct? |
It could be, then adding a constructor for the data. It's a simple POCO. |
I've implemented something close to some of the editions of this proposal for my own purposes and am now sharing it here in case someone finds it useful. Note that the code hasn't been heavily tested, so it probably contains a few bugs. Anyway, I tried to implement it as close to the spec as I could (including event dispatch instructions for browsers) and to make it as efficient as I could (to a reasonable extent). Notes about incompatibilities with the proposal:
As a bonus, in the gist you can find a demo of a streaming request to the OpenAI API, a few tests (examples from the spec), and an implementation of |
Thanks! I have an implementation ready to go once this is approved, but appreciate your sharing nonetheless. It's great to know that the general shape has worked well for you.
I think it's ok. The properties are possibly modified during a call to MoveNextAsync, and so you shouldn't be reading them during such a call, but that maps to normal concurrency rules around such objects. If someone just writes a normal foreach loop, they'll naturally fall into a pit of success in this regard.
I think they're effectively one and the same, as the spec details that the last event ID is retained and used for subsequent events that don't include their own ID: "Initialize event's type attribute to "message", its data attribute to data, its origin attribute to the serialization of the origin of the event stream's final URL (i.e., the URL after redirects), and its lastEventId attribute to the last event ID string of the event source" (in this regard the enumerable is effectively the equivalent of the EventSource). And by putting it on the enumerable rather than on each event, we streamline the event a bit and pass less data around. It's also expected that a consumer doesn't actually need the last event ID on each message; it's only necessary when a reconnect is desired. |
Yeah, probably you are right, there is no need to overthink this for some weird non-practical use-cases.
I assumed id may be used not only for reconnects, but also have some application level meaning, but I am not an expert in SSE, maybe id shouldn't be used this way. The second concern was about consistency, we should guarantee that the user has processed the event with |
@davidfowl, has anyone looked into what ASP.NET would want here? I'm imagining it would need some kind of SseFormatter for writing out events, which would be trivial, much more so than the parser, and that could be added later if desirable (might be easier for that part to just be in ASP.NET). I'm more interested in the SseItem struct here, as I'd hope that could be shared. It occurs to me that could influence the open question about whether there's a Retry and Id on the struct and whether they're optional, though as optional they could also be added later (with a new constructor). |
No, nobody has looked, but what you said is right, though I don't know if we need it; the format is simple enough. I think we'd want to support returning a |
In the sample, always is returned a generic |
Then either a) you can make T a discriminated union, b) you can make T be object and have the consumer type test, c) you can just have T be byte[] containing the UTF8 bytes and transform it at the consumer (the delegate just calls ToArray), or d) you can have T be string and transform it at the consumer (the delegate just calls Encoding.UTF8.GetString). The proposed Parse overload that doesn't take a delegate does (d). |
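Options (b), (c), and (d) above can be sketched as item-parser delegates. This is an illustration under stated assumptions: `ItemParser<T>` is a local stand-in delegate declared here so the snippet compiles on its own, and the `"count"` event type in the object-returning variant is invented purely for the example.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class ItemParserSketch
{
    // Hypothetical stand-in for the item-parser delegate discussed in this thread.
    public delegate T ItemParser<out T>(string eventType, ReadOnlySpan<byte> data);

    // Option (c): copy the raw UTF-8 bytes and let the consumer decode them later.
    public static readonly ItemParser<byte[]> AsBytes =
        (eventType, data) => data.ToArray();

    // Option (d): decode to a string up front.
    public static readonly ItemParser<string> AsString =
        (eventType, data) => Encoding.UTF8.GetString(data);

    // Option (b): return object and let the consumer type-test.
    // The "count" event type is an assumption made for this example.
    public static readonly ItemParser<object> AsObject = (eventType, data) =>
    {
        string text = Encoding.UTF8.GetString(data);
        if (eventType == "count")
            return int.Parse(text, CultureInfo.InvariantCulture);
        return text;
    };
}
```

With the object-returning variant, a consumer would branch on the runtime type, for example `if (item is int n) { ... } else if (item is string s) { ... }`, which is the type test that option (b) refers to.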
Yeah
Is that the same as
I was thinking we'd want to support
I forgot signalr had an implementation. That'd be great. The plan here, and the draft PR, creates a nupkg which includes downlevel assets. |
I mean something that implemented the ASP.NET Core IResult, similar to what @dlyz implemented: https://gist.github.com/dlyz/1c2f892e482f599093bdb9021e20c26f#file-04_sseresult-cs-L12.
Yes, we would want to support this natively as well. One point of ambiguity is if the T should be JSON serialized in the ASP.NET Core case. We need to do something with it. If you don't want this behavior, is there a way to opt out? |
Looks like this should be fairly easy to adopt in client-side SignalR, our parser delegate would just be
I do find it very odd that AI folks are investing in SSE though. WebSockets and HTTP response streaming are great for streaming data. They also both allow binary data, whereas SSE is text only. |
namespace System.Net.ServerSentEvents;
public readonly struct SseItem<T>
{
public SseItem(T data, string eventType);
public T Data { get; }
public string EventType { get; }
}
public delegate T SseItemParser<out T>(string eventType, ReadOnlySpan<byte> data);
public static class SseParser
{
public const string EventTypeDefault = "message";
public static SseParser<string> Create(Stream sseStream);
public static SseParser<T> Create<T>(Stream sseStream, SseItemParser<T> itemParser);
}
public sealed class SseParser<T>
{
public IEnumerable<SseItem<T>> Enumerate();
public IAsyncEnumerable<SseItem<T>> EnumerateAsync();
public string LastEventId { get; }
public TimeSpan ReconnectionInterval { get; }
} |
@captainsafia @BrennanConroy Can we make sure to add follow up issues on ASP.NET Core for SignalR and minimal APIs? |
I agree! |
Background and motivation
SSE is becoming more and more popular, especially with prominent services like OpenAI relying on it for streaming responses. The format is very simple, but it still takes some amount of code to properly handle parsing the SSE format. We should have a built-in helper in either System.Net or System.Formats that takes care of it for the developer (optionally then other higher-level helpers could be layered on top).
API Proposal
Open Issues
API Usage
Alternative Designs
SseItem<T> instances onto a Stream, which ASP.NET could then use. This can come later when needed by ASP.NET.
Risks
SseItem<T> type is in the shared libraries, so we need to ensure it's designed accordingly.