unexpected keyword argument 'prompt' #4022

Closed · 1 task done
fishingcatgo opened this issue Jun 1, 2024 · 4 comments
Labels: solved This problem has been already solved

fishingcatgo commented Jun 1, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

#run
CUDA_VISIBLE_DEVICES=0 llamafactory-cli webchat myconfig/inference/llama3_vllm.yaml

#config
model_name_or_path: /root/autodl-tmp/chat-main/app/serve/model_weight/Qwen-7B
template: qwen
infer_backend: vllm
vllm_enforce_eager: true

#error flow

Running on local URL: http://0.0.0.0:7860

To create a public link, set share=True in launch().
Traceback (most recent call last):
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/queueing.py", line 521, in process_events
response = await route_utils.call_process_api(
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/blocks.py", line 1945, in process_api
result = await self.call_function(
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/blocks.py", line 1525, in call_function
prediction = await utils.async_iteration(iterator)
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/utils.py", line 655, in async_iteration
return await iterator.__anext__()
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/utils.py", line 648, in __anext__
return await anyio.to_thread.run_sync(
File "/root/condaenv/Qwen/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/root/condaenv/Qwen/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
return await future
File "/root/condaenv/Qwen/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
result = context.run(func, *args)
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/utils.py", line 631, in run_sync_iterator_async
return next(iterator)
File "/root/condaenv/Qwen/lib/python3.10/site-packages/gradio/utils.py", line 814, in gen_wrapper
response = next(iterator)
File "/root/LLaMA-Factory/src/llamafactory/webui/chatter.py", line 124, in stream
for new_text in self.stream_chat(
File "/root/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 70, in stream_chat
yield task.result()
File "/root/condaenv/Qwen/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/root/condaenv/Qwen/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/root/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 82, in astream_chat
async for new_token in self.engine.stream_chat(messages, system, tools, image, **input_kwargs):
File "/root/LLaMA-Factory/src/llamafactory/chat/vllm_engine.py", line 208, in stream_chat
generator = await self._generate(messages, system, tools, image, **input_kwargs)
File "/root/LLaMA-Factory/src/llamafactory/chat/vllm_engine.py", line 160, in _generate
result_generator = self.model.generate(
TypeError: AsyncLLMEngine.generate() got an unexpected keyword argument 'prompt'

Expected behavior

unexpected keyword argument 'prompt'

System Info

unexpected keyword argument 'prompt'

Others

unexpected keyword argument 'prompt'

@xiaochaich

[screenshot of the same error]
I ran into the same error when using vllm.

evaZQR commented Jun 1, 2024

Same here. Is there a fix for this?

@Appletree24

I'm new to this, but I managed to get it working for now. My guess is that vLLM 0.4.3 changed the parameters of the generate function in async_llm_engine.py under the engine directory. Tested with llama3-8B-Instruct.

# The generate signature in the new vLLM version
    async def generate(
        self,
        inputs: PromptInputs,
        sampling_params: SamplingParams,
        request_id: str,
        lora_request: Optional[LoRARequest] = None,
    ) -> AsyncIterator[RequestOutput]:

You can manually change the generate call in vllm_engine.py under the chat directory to the following form:

        result_generator = self.model.generate(
            # prompt=None,
            sampling_params=sampling_params,
            request_id=request_id,
            inputs=messages[-1]["content"],
            # prompt_token_ids=prompt_ids,
            lora_request=self.lora_request,
            # multi_modal_data=multi_modal_data,
        )

That said, I'm a complete beginner and I'm not sure whether this change is actually correct; it just lets me load the model and chat successfully for now, so it's probably best to wait for the author to fix it properly.
Alternatively, you can try downgrading vllm, but I didn't try that myself for the sake of environment stability.
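
For reference, a variant of this workaround that keeps the chat-template formatting would be to hand the already-built prompt_ids to the new inputs parameter as a token dict. This is a sketch only, assuming vLLM 0.4.3 accepts a {"prompt_token_ids": ...} dict for inputs; it is not necessarily the upstream fix.

        # Sketch: reuse the prompt_ids already produced by the chat template,
        # instead of bypassing the template by sending only the raw last message.
        result_generator = self.model.generate(
            inputs={"prompt_token_ids": prompt_ids},  # assumed TokensPrompt-style dict in vLLM >= 0.4.3
            sampling_params=sampling_params,
            request_id=request_id,
            lora_request=self.lora_request,
        )

Passing token ids keeps the multi-turn history and the model's chat template intact, unlike sending only the last user message.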

@Zhangzeyu97

(quoting @Appletree24's workaround above)

Downgrading vllm to version 0.4.2 fixed the problem in my environment. Thank you.
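
For anyone unsure which API their installed vLLM exposes, a minimal check (a sketch; the 0.4.3 boundary is inferred from the discussion above, and it assumes the packaging library is available):

    from packaging import version
    import vllm

    # vLLM 0.4.3 merged the separate prompt / prompt_token_ids keyword arguments
    # of AsyncLLMEngine.generate() into a single `inputs` parameter.
    if version.parse(vllm.__version__) >= version.parse("0.4.3"):
        print("new-style API: generate(inputs=..., sampling_params=..., request_id=...)")
    else:
        print("old-style API: generate(prompt=..., sampling_params=..., request_id=..., prompt_token_ids=...)")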

hiyouga added the solved label on Jun 3, 2024
hiyouga closed this as completed in 24e1c0e on Jun 3, 2024