
How to run multi-GPU inference with a fine-tuned model #3769

Closed
1 task done
cl12191718 opened this issue May 16, 2024 · 2 comments
Labels
solved This problem has been already solved

Comments

@cl12191718

cl12191718 commented May 16, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

After training a model with the command `CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml`, how can I run inference on multiple GPUs?
I tried specifying the devices via `CUDA_VISIBLE_DEVICES=0,1,2 llamafactory-cli webchat examples/merge_lora/llama3_lora_sft.yaml`, but setting CUDA_VISIBLE_DEVICES this way appears to have no effect.
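As a quick sanity check when CUDA_VISIBLE_DEVICES seems to be ignored, it can help to confirm what the launched process actually sees in its environment. This is a minimal sketch (the helper `visible_gpu_ids` is hypothetical, not part of LLaMA-Factory); note that CUDA reads the variable once at initialization, so a launcher that resets it before that point will override your setting.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset
    (unset means all GPUs are visible by default)."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [int(x) for x in value.split(",") if x.strip()]


# Example: simulate the environment used in the report above.
print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "0,1,2"}))  # [0, 1, 2]
```

Running this inside the inference process (rather than the shell) shows whether the restriction actually propagated to the worker.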

Expected behavior

No response

System Info

No response

Others

No response

hiyouga added a commit that referenced this issue May 16, 2024
@hiyouga
Owner

hiyouga commented May 16, 2024

@hiyouga hiyouga added the solved This problem has been already solved label May 16, 2024
@hiyouga hiyouga closed this as completed May 16, 2024
@cnpioneer

The previous example script was incorrect and has been updated: https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/README_zh.md#%E6%8E%A8%E7%90%86-lora-%E6%A8%A1%E5%9E%8B

It still doesn't work: with hf_engine, after pulling the latest version and following the instructions, inference still only launches on a single GPU.
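For readers hitting the same limitation: LLaMA-Factory's example inference configs also support switching the backend from the default Hugging Face engine to vLLM, which can shard the model across the visible GPUs via tensor parallelism. The sketch below is an assumption based on the repo's `examples/inference` configs (the model path, adapter path, and template are placeholders); check the field names against the current README before use.

```yaml
# Hypothetical inference config; paths and template are placeholders.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft
template: llama3
# Assumption: the vLLM backend distributes the model over all GPUs
# exposed by CUDA_VISIBLE_DEVICES, unlike single-device HF inference.
infer_backend: vllm
```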
