PPO Not Working with DeepSpeed stage ZeRO-3 #3108
Labels: solved (This problem has already been solved)

Comments
This problem still exists in the new version of the framework. When will it be fixed?

Any update on this issue @hiyouga QwQ

Same problem

Same problem with supervised fine-tuning under DeepSpeed ZeRO-3 (weight sharding): `CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train examples/lora_multi_gpu/llama3_lora_sft_ds.yaml`

fixed
hiyouga added the solved label (This problem has already been solved) and removed the pending label (This problem is yet to be addressed) on Jun 6, 2024
@hiyouga Hi, can you tell me how you solved the problem? Thanks!
Reminder
Reproduction
The generate() step is failing during PPO with LLaMA 70B + LoRA. I'm using DeepSpeed ZeRO-3 and I've tried with and without offloading, and with and without gradient accumulation. Is the model not being unwrapped correctly? When I print the state dictionary of the unwrapped model (and also `unwrapped_model.pretrained_model.state_dict()`), I get the following tensor, which is not 2D:

('base_model.model.model.embed_tokens.weight', tensor([], device='cuda:7', dtype=torch.bfloat16))

This probably indicates that the embed_tokens weight is being split across multiple GPUs with ZeRO stage 3. Is there a way to fix this? Here's the model config
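For what it's worth, under ZeRO-3 an empty tensor in a per-rank state_dict usually just means the full parameter lives in shards across ranks, not that the weight is lost. A minimal sketch of how one could verify that, assuming a DeepSpeed-initialized `transformers` model (the helper name `inspect_embed_tokens` is made up for illustration):

```python
import deepspeed

def inspect_embed_tokens(model):
    """Gather the (possibly ZeRO-3 partitioned) embedding weight and print its real shape.

    Under ZeRO stage 3 each rank holds only a shard, so a per-rank state_dict()
    can show an empty tensor even though the weight is intact; GatheredParameters
    temporarily reconstructs the full parameter inside the context.
    """
    weight = model.get_input_embeddings().weight
    print("local view:", weight.shape)  # may print torch.Size([0]) under ZeRO-3
    with deepspeed.zero.GatheredParameters([weight], modifier_rank=None):
        # inside the context the full [vocab_size, hidden_size] matrix is available
        print("gathered view:", weight.shape)
```

If the gathered view shows the full 2-D matrix, the weight itself is fine, and the generate() failure is more likely about how the ZeRO-3-wrapped model is being called (generation under stage 3 generally also needs `synced_gpus=True` so all ranks step through decoding together).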
Expected behavior
No response
System Info
version v0.6.1
Others