Data type error reported during vllm inference #3347

Closed
zhangapeng opened this issue Apr 19, 2024 · 1 comment
Labels
solved This problem has been already solved

Comments

@zhangapeng

Reminder

  • I have read the README and searched the existing issues.

Reproduction

CUDA_VISIBLE_DEVICES=0 API_PORT=8000 python src/api_demo.py \
    --model_name_or_path /home/Baichuan2/Baichuan2-7B-Chat \
    --template baichuan2 \
    --finetuning_type lora \
    --infer_backend vllm

When running on a V100 machine, an error is raised saying that bfloat16 is not supported (V100 GPUs do not support the bfloat16 data type). Workaround: add the parameter dtype='float16' at line 36 of src/llmtuner/chat/vllm_engine.py.
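For reference, a minimal sketch of the same workaround applied directly to vLLM, assuming the engine is built from vllm.AsyncEngineArgs (which vllm_engine.py appears to use); the exact argument wiring inside that file may differ:

from vllm import AsyncEngineArgs, AsyncLLMEngine

# Force fp16: V100 (compute capability 7.0) lacks bfloat16 support, while the
# default dtype="auto" follows the model config, which requests bf16 here.
engine_args = AsyncEngineArgs(
    model="/home/Baichuan2/Baichuan2-7B-Chat",
    trust_remote_code=True,  # Baichuan2 ships custom modeling code
    dtype="float16",
)
engine = AsyncLLMEngine.from_engine_args(engine_args)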

Expected behavior

No response

System Info

No response

Others

No response

@hiyouga hiyouga added the pending This problem is yet to be addressed label Apr 19, 2024
@hiyouga
Owner

hiyouga commented Apr 23, 2024

fixed

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Apr 23, 2024