Version 0.8.1: DeepSpeed ZeRO stage 3 raises an error #4209
Labels
solved
This problem has been already solved
Comments
Remove the parameter:
hiyouga added the solved label (This problem has been already solved) and removed the pending label (This problem is yet to be addressed) on Jun 11, 2024
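The problem still occurs after removing the parameter. I tried deepspeed==0.13.0 and 0.14.0. @hiyouga
Indeed, the same problem is still there. Could you please take another look?
I have this problem too.
fixed
Hello, after updating the version, can DPO and KTO train normally?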
Reminder
System Info
Training Qwen1.5-1.8B with DeepSpeed: ZeRO-2 trains normally, but ZeRO-3 raises an error.
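For readers unfamiliar with the two modes: in a DeepSpeed JSON config, the ZeRO stage is selected by the `stage` key under `zero_optimization`. The sketch below only illustrates that one difference. The helper `zero_config` is hypothetical, and the fragment is not the configuration used in this issue, which is not shown here.

```python
import json


def zero_config(stage: int) -> dict:
    """Build a minimal DeepSpeed config fragment that selects a ZeRO stage.

    Hypothetical helper for illustration only; a real training config
    needs further settings (batch size, precision, and so on).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError(f"unsupported ZeRO stage: {stage}")
    return {"zero_optimization": {"stage": stage}}


# Stage 2 partitions optimizer states and gradients across GPUs.
zero2 = zero_config(2)
# Stage 3 additionally partitions the model parameters themselves.
zero3 = zero_config(3)

print(json.dumps(zero2, indent=2))
print(json.dumps(zero3, indent=2))
```

Because stage 3 also shards the parameters, code paths that touch full model weights can behave differently from stage 2, which is one plausible reason a run works under ZeRO-2 but not ZeRO-3.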
Reproduction
Command executed:
CUDA_VISIBLE_DEVICES=5,6 llamafactory-cli train examples/lora_multi_gpu/qwen_lora_dpo_ds.yaml
Training configuration file:
Expected behavior
The error is as follows:
Others
No response