Add lora_path to chat completion #2438
Conversation
Even though this was added, it seems it is still not used.
We are working on a project that provides multi-LoRA serving via an OpenAI-compatible API, and we have validated this fix by adding lora_path to the OpenAI protocol and serving a batch with different LoRA adapters.
Thanks for merging this change!
Hello, how do I use url = "http://localhost:8000/v1/chat/completions" to request a configured LoRA adapter? With data = {"model": "Qwen2.5-7B-Instruct", "messages": [{"role": "user", "content": "What is the capital of France?"}]} and the LoRA adapter name 'aa', how should the request be formed?
curl -X POST http://127.0.0.1:30000/v1/chat/completions -d '{"model": "meta-llama/Llama-3.2-1B", "messages": [{"role": "system", "content": "You are a happy assistant that puts a positive spin on everything."}, {"role": "user", "content": "I fell off my bike today."}], "lora_path": "lora1", "max_tokens": 64}' |
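For Python clients, the same field can be passed through the official openai client via extra_body. Below is a minimal sketch, assuming an SGLang server on localhost:30000 launched with a LoRA adapter registered under the name "lora1" (the adapter name and server address are illustrative):

```python
# Hedged sketch: assumes a running OpenAI-compatible server with a LoRA
# adapter registered as "lora1" at launch time.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B",
    messages=[
        {"role": "system", "content": "You are a happy assistant that puts a positive spin on everything."},
        {"role": "user", "content": "I fell off my bike today."},
    ],
    max_tokens=64,
    # lora_path is not part of the standard OpenAI schema,
    # so the client must send it via extra_body.
    extra_body={"lora_path": "lora1"},
)
print(response.choices[0].message.content)
```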
v1/completions can successfully call LoRA, but v1/chat/completions cannot. Why? Comparing v1_generate_request and v1_chat_generate_request, we found that v1_chat_generate_request does not handle the lora_paths variable.
You are right. I missed that when cherry-picking changes. Will have another PR.
|
Motivation
Add lora_path to ChatCompletionRequest for the OpenAI chat completion API. It was previously added to the OpenAI completion API in #2243.
Modifications
Added lora_path to ChatCompletionRequest
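A minimal sketch of what the protocol change looks like, assuming the request models are Pydantic classes as in the earlier completion API change (#2243); the surrounding fields are illustrative, and only the lora_path addition reflects this PR:

```python
# Hedged sketch of the protocol change; only the lora_path field is
# taken from this PR, the other fields are illustrative placeholders.
from typing import List, Optional, Union

from pydantic import BaseModel


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[dict]
    max_tokens: Optional[int] = None
    # New in this PR: name(s) of LoRA adapter(s) registered at server launch;
    # a list allows per-request adapters in a batch, None uses the base model.
    lora_path: Optional[Union[List[Optional[str]], str]] = None
```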
Checklist