Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for streaming text IAsyncEnumerable<string> results #50501

Open
1 task done
alexminza opened this issue Sep 4, 2023 · 6 comments
Open
1 task done

Add support for streaming text IAsyncEnumerable<string> results #50501

alexminza opened this issue Sep 4, 2023 · 6 comments
Labels
area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc

Comments

@alexminza
Copy link

alexminza commented Sep 4, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe the problem.

I am trying to return a streaming IAsyncEnumerable<string> SematicKernel chat completion from GetStreamingChatCompletionsAsync, GetStreamingChatMessageAsync methods.

Currently simply returning IAsyncEnumerable<string> produces a streaming JSON array of strings result. The desired behavior is simple streaming text strings result.

This would effectively produce a streaming ChatGTP-like completion response generated by the method as the results become available from the OpenAI endpoints.

Describe the solution you'd like

public class AsyncEnumerableStringsResult : IResult, IContentTypeHttpResult, IStatusCodeHttpResult
{
    protected readonly IAsyncEnumerable<string> chunks;

    public string? ContentType => "text/plain; charset=utf-8";

    public int StatusCode => StatusCodes.Status200OK;

    int? IStatusCodeHttpResult.StatusCode => StatusCode;

    public AsyncEnumerableStringsResult(IAsyncEnumerable<string> chunks) => this.chunks = chunks ?? throw new ArgumentNullException(nameof(chunks));

    public async Task ExecuteAsync(HttpContext httpContext)
    {
        if (httpContext == null)
            throw new ArgumentNullException(nameof(httpContext));

        httpContext.Response.ContentType = this.ContentType;
        httpContext.Response.StatusCode = this.StatusCode;

        await foreach (var chunk in this.chunks)
            if (!string.IsNullOrEmpty(chunk))
                await httpContext.Response.WriteAsync(chunk, cancellationToken: httpContext.RequestAborted);
    }
}

Usage example:

app.MapPost("/ChatAsyncStream", ([FromBody] ChatRequest chatRequest, ChatPlugin plugin, ILogger logger, CancellationToken cancellationToken) =>
    {
        if (string.IsNullOrWhiteSpace(chatRequest.Question))
            throw new ArgumentNullException(nameof(chatRequest.Question));

        var result = plugin.ChatAsyncStream(
            question: chatRequest.Question,
            chatHistory: chatRequest.ChatHistory,
            logger: logger,
            cancellationToken: cancellationToken
        );

        return new AsyncEnumerableStringsResult(result);
    })
    .WithName("ChatAsyncStream")
    .WithOpenApi()
    .Produces<IAsyncEnumerable<string>>();

Additional context

No response

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Sep 4, 2023
@davidfowl
Copy link
Member

How is the client consuming this response? Do you have an example?

@alexminza
Copy link
Author

alexminza commented Sep 5, 2023

@davidfowl here's a simple example built by me, based on the great tutorial from Streamlit https://docs.streamlit.io/knowledge-base/tutorials/build-conversational-apps#write-the-app

Streamlit ChatBot app

#!/usr/bin/env python3

import os, logging
import streamlit as st
from azureapi import AzureAPI

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

azure_api_endpoint = os.getenv('AZUREAPI_ENDPOINT')
azure_api = AzureAPI(endpoint=azure_api_endpoint)

#https://docs.streamlit.io/knowledge-base/tutorials/build-conversational-apps
st.set_page_config(
    page_title="ChatBot",
    page_icon=":robot:"
)

st.title("ChatBot")

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []
if "chat_history" not in st.session_state:
    st.session_state.chat_history = ""

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# React to user input
if prompt := st.chat_input("Enter message here"):
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})

    # Display user message in chat message container
    with st.chat_message("user"):
        st.markdown(prompt)

    with st.spinner(text="In progress..."):
        with st.chat_message("assistant"):
            message_placeholder = st.empty()
            response_text = ''

            response_stream = azure_api.ChatStream(question=prompt, chatHistory=st.session_state.chat_history)
            for response_chunk in response_stream:
                if response_chunk:
                    response_text += response_chunk
                    message_placeholder.markdown(response_text + "▌")

            message_placeholder.markdown(response_text)

    # Display assistant response in chat message container
    if response_text:
        # Add assistant response to chat history
        st.session_state.messages.append({"role": "assistant", "content": response_text})
    else:
        st.error("ERROR")

API Client

class AzureAPI:
    SESSION = requests.Session()
    DEFAULT_TIMEOUT = 180
    API_ENDPOINT = None

    def __init__(self, endpoint: str) -> None:
        self.API_ENDPOINT = endpoint

    def ChatStream(self, question: str, chatHistory: str = None):
        url = f'{self.API_ENDPOINT}/ChatAsyncStream'
        json = {
            "question": question,
            "chatHistory": chatHistory
        }

        with AzureAPI.SESSION.request(method='POST', url=url, json=json, stream=True, timeout=AzureAPI.DEFAULT_TIMEOUT) as response:
            yield from response.iter_content(chunk_size=None, decode_unicode=True)

@flq
Copy link

flq commented Dec 10, 2023

A more generic solution might be to have a Server-Sent-Events result object that accepts an IAsyncEnumerable - a client can then consume it via javascript's EventSource class.

Copy link
Contributor

Looks like this PR hasn't been active for some time and the codebase could have been changed in the meantime.
To make sure no conflicting changes have occurred, please rerun validation before merging. You can do this by leaving an /azp run comment here (requires commit rights), or by simply closing and reopening.

@dotnet-policy-service dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Feb 6, 2024
@wtgodbe wtgodbe removed the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Feb 6, 2024
Copy link
Contributor

Looks like this PR hasn't been active for some time and the codebase could have been changed in the meantime.
To make sure no conflicting changes have occurred, please rerun validation before merging. You can do this by leaving an /azp run comment here (requires commit rights), or by simply closing and reopening.

@dotnet-policy-service dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Feb 6, 2024
@davidfowl
Copy link
Member

Related to dotnet/runtime#98105

@davidfowl davidfowl added area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc and removed pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions labels Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc
Projects
None yet
Development

No branches or pull requests

4 participants