
How to run multi-GPU inference with a fine-tuned model #3769

Closed
1 task done
cl12191718 opened this issue May 16, 2024 · 2 comments
Labels
solved This problem has been already solved

Comments

@cl12191718

cl12191718 commented May 16, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

After training a model with the command `CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml`, how can I run inference on multiple GPUs?
I tried specifying the devices via `CUDA_VISIBLE_DEVICES=0,1,2 llamafactory-cli webchat examples/merge_lora/llama3_lora_sft.yaml`, but setting CUDA_VISIBLE_DEVICES this way appears to have no effect.
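As a quick sanity check when CUDA_VISIBLE_DEVICES seems to be ignored, it can help to confirm what the launched process actually sees in its environment. This is a minimal sketch (the helper `visible_gpu_ids` is hypothetical, not part of LLaMA-Factory); note that CUDA reads the variable once at initialization, so a launcher that resets it before that point will override your setting.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset
    (unset means all GPUs are visible by default)."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [int(x) for x in value.split(",") if x.strip()]


# Example: simulate the environment used in the report above.
print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0,1,2"}))  # [0, 1, 2]
```

Running this inside the inference process (rather than the shell) shows whether the restriction actually propagated to the worker.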

Expected behavior

No response

System Info

No response

Others

No response

hiyouga added a commit that referenced this issue May 16, 2024
@hiyouga
Owner

hiyouga commented May 16, 2024

@hiyouga hiyouga added the solved This problem has been already solved label May 16, 2024
@hiyouga hiyouga closed this as completed May 16, 2024
@cnpioneer

The previous example script was incorrect and has been updated: https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/README_zh.md#%E6%8E%A8%E7%90%86-lora-%E6%A8%A1%E5%9E%8B

It still doesn't work: with hf_engine, after pulling the latest version and following the instructions, inference still only launches on a single GPU.
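For readers hitting the same limitation: LLaMA-Factory's example inference configs also support switching the backend from the default Hugging Face engine to vLLM, which can shard the model across the visible GPUs via tensor parallelism. The sketch below is an assumption based on the repo's `examples/inference` configs (the model path, adapter path, and template are placeholders); check the field names against the current README before use.

```yaml
# Hypothetical inference config; paths and template are placeholders.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
template: llama3
# Assumption: the vLLM backend distributes the model over all GPUs
# exposed by CUDA_VISIBLE_DEVICES, unlike single-device HF inference.
infer_backend: vllm
```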
