PPO merge fails #4609
Labels: solved (This problem has been already solved)

Comments
After testing, it turns out the weights saved in the multi-GPU environment are corrupted, but I don't know why.

I ran into the same problem: the model weights saved from the PPO run are corrupted. I ran the same code on five machines and found that the environments that saved usable weights had CUDA 11, while the ones with corrupted weights had CUDA 12.

My CUDA version is 11.8, and my single-GPU and multi-GPU environments are identical.

PPO still has some issues; it runs for some people and fails for others.

Fixed.
hiyouga added the solved label and removed the pending label on Jul 3, 2024
It seems that saving adapter_model.safetensors in a multi-GPU environment is still broken.
xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue on Jul 17, 2024: unwrap_model_for_generation(reward_model) is necessary for zero3 training
Reminder
System Info
After PPO training, I want to merge the LoRA adapter into the base model.
```
06/28/2024 10:37:27 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
Traceback (most recent call last):
  File "/root/anaconda3/envs/llm/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/cli.py", line 87, in main
    export_model()
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/train/tuner.py", line 73, in export_model
    model = load_model(tokenizer, model_args, finetuning_args)  # must after fixing tokenizer to resize vocab
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/model/loader.py", line 160, in load_model
    model = init_adapter(config, model, model_args, finetuning_args, is_trainable)
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/model/adapter.py", line 311, in init_adapter
    model = _setup_lora_tuning(
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/model/adapter.py", line 191, in _setup_lora_tuning
    model: "LoraModel" = PeftModel.from_pretrained(model, adapter, **init_kwargs)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/peft/peft_model.py", line 430, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/peft/peft_model.py", line 984, in load_adapter
    adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/peft/utils/save_and_load.py", line 444, in load_peft_weights
    adapters_weights = safe_load_file(filename, device=device)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/safetensors/torch.py", line 311, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
```
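The `InvalidHeaderDeserialization` error above means the `adapter_model.safetensors` file itself is malformed (for example, a truncated or zero-filled write from the multi-GPU save), not that PEFT is misconfigured. One way to check a suspect checkpoint without loading it into PEFT is to parse the safetensors header by hand: the format starts with an 8-byte little-endian header length followed by a JSON header. This is a diagnostic sketch; `check_safetensors_header` is a hypothetical helper, not part of LLaMA-Factory or the safetensors library:

```python
import json
import struct
from pathlib import Path

def check_safetensors_header(path):
    """Validate the header of a .safetensors file.

    Layout: 8-byte little-endian unsigned header length, then a JSON
    header of that length, then the raw tensor data. A partial or
    zero-filled write fails one of these checks, which is the same
    condition that surfaces as InvalidHeaderDeserialization at load time.
    """
    data = Path(path).read_bytes()
    if len(data) < 8:
        return False, "file shorter than the 8-byte length prefix"
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len == 0 or 8 + header_len > len(data):
        return False, f"header length {header_len} inconsistent with file size {len(data)}"
    try:
        header = json.loads(data[8 : 8 + header_len])
    except json.JSONDecodeError as exc:
        return False, f"header is not valid JSON: {exc}"
    tensors = [k for k in header if k != "__metadata__"]
    return True, f"{len(tensors)} tensors declared in header"
```

A healthy file reports its tensor count; the corrupted saves described in this thread typically fail at the header-length or JSON step, which localizes the problem to the save path rather than the merge step.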
Reproduction
```yaml
### model
model_name_or_path: model_zoos/shenzhi-wang/Llama3-8B-Chinese-Chat
adapter_name_or_path: saves/llama3-8b/lora/ppo_fdc/checkpoint-160
template: llama3
finetuning_type: lora

### export
export_dir: saves/llama3-8b/lora/ppo_fdc_model
export_size: 2
export_device: cpu
export_legacy_format: false
```
Expected behavior
No response
Others
No response