# io-ray-serve-chat-demo

A Ray Serve chat demo serving Hugging Face models.

## How to get started

1. Open an io.net account.

2. Follow the standard procedure for launching a Ray Cluster. Select a small cluster, for example 4× T4 GPUs.

3. When the cluster is ready, open the cluster in Visual Studio Code (VS Code).

4. Launch the VS Code terminal and clone this repo:

   ```shell
   git clone https://github.com/ionet-official/io-ray-serve-chat-demo.git
   ```

5. Go to the folder:

   ```shell
   cd io-ray-serve-chat-demo
   ```

6. Start the chat server:

   ```shell
   serve run chat.yaml
   ```
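For context, the file passed to `serve run` is a Ray Serve config that declares which application to deploy and how to scale it. The sketch below is hypothetical — the import path, deployment name, and resource numbers are illustrative assumptions, not the repo's actual `chat.yaml`:

```yaml
# Hypothetical sketch of a Ray Serve config file.
# See the repo's chat.yaml for the real values.
applications:
  - name: chat
    route_prefix: /
    import_path: chat:app        # module:variable holding the Serve app (assumed)
    deployments:
      - name: Chat
        num_replicas: 4          # e.g. one replica per T4 worker (assumed)
        ray_actor_options:
          num_gpus: 1            # pin each replica to one GPU (assumed)
```

`serve run` reads this config, starts (or connects to) the Ray cluster, and deploys the application across the workers.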


7. Wait until Ray Serve deploys the chat app across the workers. You will see a "Model loaded" message in the terminal.

8. Test your chatbot from inside the cluster. Open a new terminal and run the sample chat client:

   ```shell
   python chat_client.py
   ```


9. Test your chatbot server endpoint from outside the cluster.

   1. Server endpoint: `https://exposed-service-[YOUR-CLUSTER-SUFFIX].tunnels.io.systems/`
   2. If your cluster suffix is `1d47a`, the endpoint is `https://exposed-service-1d47a.tunnels.io.systems/`.
   3. One way to identify your suffix is from the VS Code URL, which looks like `https://vscode-1d47a.tunnels.io.systems/`.
   4. You can use the code snippet below to interact with the Ray Serve application (update the endpoint to your server):

   ```python
   import requests

   SERVER_ENDPOINT = "https://exposed-service-1d47a.tunnels.io.systems/"

   message = "What is the capital of France?"
   history = []

   response = requests.post(SERVER_ENDPOINT, json={"user_input": message, "history": history})
   print(response.json())
   ```

Or, from a terminal:

```shell
curl -X POST https://exposed-service-1d47a.tunnels.io.systems/ \
  -H "Content-Type: application/json" \
  -d '{"user_input": "What is the capital of France?", "history": []}'
```
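Building on the single-request example above, a multi-turn session reuses the `history` list so the model sees earlier turns. This is a sketch, not the repo's `chat_client.py`: the helper names are made up here, and how the reply should be stored in `history` depends on the server's actual response format (storing the raw JSON is an assumption).

```python
import requests  # third-party; pip install requests

# Replace with your own cluster suffix.
SERVER_ENDPOINT = "https://exposed-service-1d47a.tunnels.io.systems/"


def build_payload(message, history):
    """Build the JSON body the demo server expects."""
    return {"user_input": message, "history": history}


def chat(message, history, endpoint=SERVER_ENDPOINT):
    """Send one turn, record the exchange in `history`, and return the reply.

    Appending [message, raw_json_reply] is an assumption; adjust to match
    the history format your server expects.
    """
    response = requests.post(endpoint, json=build_payload(message, history))
    response.raise_for_status()
    reply = response.json()
    history.append([message, reply])
    return reply


# Example multi-turn session (requires a reachable endpoint):
# history = []
# chat("What is the capital of France?", history)
# chat("And what is its population?", history)  # earlier turns give context
```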