Streaming: delay in GenerateAsync, or incorrect use? #475
-
Hi, I use the code below to perform streamed generation. The model is OpenAI, with its streaming setting enabled. The code works, but I notice a 3-10 second delay before the first partial response is processed, and the delay is proportional to the size of the LLM output. My project is a Blazor WASM app, so I was able to confirm this in the browser debug tools: the response from OpenAI starts arriving right away and grows progressively, so OpenAI is streaming the response correctly. But it's only once the response is completely finished that I can start iterating over the partial responses. Could anyone confirm whether I made a mistake in my code, or whether this is an issue in GenerateAsync? The readme shows how to run simple generation queries, but I haven't seen an example with streaming.
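For context, the consumption pattern is roughly the following (a minimal sketch; `model` and `prompt` are illustrative names and may not match the library's actual API):

```csharp
// Minimal sketch, assuming GenerateAsync streams partial responses as an
// IAsyncEnumerable<string>. `model` and `prompt` are illustrative names,
// not necessarily the library's actual API.
await foreach (var partial in model.GenerateAsync(prompt))
{
    // With working streaming, this body runs as each chunk arrives.
    // The symptom described above is that it only starts running once
    // the full response has already been received.
    Console.Write(partial);
}
```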
Replies: 2 comments 1 reply
-
The code looks correct.
-
Everything seems correct on the latest version in main.
Also, please use the latest prerelease version so that we're testing the same thing.
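If the library code in main is behaving, one thing worth ruling out in a Blazor WASM app specifically is the browser's own response buffering: the fetch layer buffers response bodies unless streaming is explicitly enabled on the request, which matches the symptom of bytes arriving progressively in the network tab while the app only sees them at the end. A sketch, assuming the underlying HttpRequestMessage is reachable (`endpoint` and `httpClient` are illustrative):

```csharp
using System.Net.Http;
using Microsoft.AspNetCore.Components.WebAssembly.Http;

// In Blazor WASM the browser buffers the response body by default, which
// produces exactly this symptom: the network tab shows bytes arriving
// progressively, but the app only sees them once the response completes.
var request = new HttpRequestMessage(HttpMethod.Post, endpoint); // `endpoint` is illustrative
request.SetBrowserResponseStreamingEnabled(true);

// ResponseHeadersRead returns as soon as headers arrive, so the body can
// be consumed as a stream instead of being buffered to completion.
var response = await httpClient.SendAsync(
    request, HttpCompletionOption.ResponseHeadersRead);
```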