Streaming: delay in GenerateAsync, or incorrect use? #475
-
Hi, I use the code below to perform streamed generation. The model is OpenAI, with its streaming setting enabled. The code works, but I notice a 3-10 second delay before the first partial response is processed, and the delay is proportional to the size of the LLM output. My project is a Blazor WASM app, so I was able to confirm this in the browser debug tools: the response from OpenAI starts arriving right away and grows progressively, so OpenAI is streaming the response correctly. But it's only once the response is completely finished that I can start iterating over the partial responses. Could anyone confirm whether I made a mistake in my code, or whether this is an issue in GenerateAsync? The readme shows how to run simple generation queries, but I haven't seen an example with streaming.
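For context, the consumption pattern is roughly the following (a minimal sketch; `model` and `prompt` are illustrative names and may not match the library's actual API):

```csharp
// Minimal sketch, assuming GenerateAsync streams partial responses as an
// IAsyncEnumerable<string>. `model` and `prompt` are illustrative names,
// not necessarily the library's actual API.
await foreach (var partial in model.GenerateAsync(prompt))
{
    // With working streaming, this body runs as each chunk arrives.
    // The symptom described above is that it only starts running once
    // the full response has already been received.
    Console.Write(partial);
}
```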
Replies: 2 comments 1 reply
-
The code looks correct.
-
Everything seems correct on the latest version in main.
Also, please use the latest prerelease version so that we're testing the same thing.
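If the library code in main is behaving, one thing worth ruling out in a Blazor WASM app specifically is the browser's own response buffering: the fetch layer buffers response bodies unless streaming is explicitly enabled on the request, which matches the symptom of bytes arriving progressively in the network tab while the app only sees them at the end. A sketch, assuming the underlying HttpRequestMessage is reachable (`endpoint` and `httpClient` are illustrative):

```csharp
using System.Net.Http;
using Microsoft.AspNetCore.Components.WebAssembly.Http;

// In Blazor WASM the browser buffers the response body by default, which
// produces exactly this symptom: the network tab shows bytes arriving
// progressively, but the app only sees them once the response completes.
var request = new HttpRequestMessage(HttpMethod.Post, endpoint); // `endpoint` is illustrative
request.SetBrowserResponseStreamingEnabled(true);

// ResponseHeadersRead returns as soon as headers arrive, so the body can
// be consumed as a stream instead of being buffered to completion.
var response = await httpClient.SendAsync(
    request, HttpCompletionOption.ResponseHeadersRead);
```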