You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert
0%| | 0/1690 [00:22<?, ?it/s]
Traceback (most recent call last):
File "/data/app/LLaMA-Efficient-Tuning/src/train_bash.py", line 14, in <module>
main()
File "/data/app/LLaMA-Efficient-Tuning/src/train_bash.py", line 5, in main
run_exp()
File "/data/app/LLaMA-Efficient-Tuning/src/llmtuner/tuner/tune.py", line 30, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/app/LLaMA-Efficient-Tuning/src/llmtuner/tuner/ppo/workflow.py", line 75, in run_ppo
ppo_trainer.ppo_train(max_target_length=data_args.max_target_length)
File "/data/app/LLaMA-Efficient-Tuning/src/llmtuner/tuner/ppo/trainer.py", line 106, in ppo_train
stats = self.step(queries, responses, rewards)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/worker/miniconda3/envs/llama/lib/python3.11/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/data/worker/miniconda3/envs/llama/lib/python3.11/site-packages/trl/trainer/ppo_trainer.py", line 680, in step
values, advantages, returns = self.compute_advantages(values, rewards, masks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/worker/miniconda3/envs/llama/lib/python3.11/site-packages/trl/trainer/ppo_trainer.py", line 1061, in compute_advantages
values = values * mask
~~~~~~~^~~~~~
RuntimeError: The size of tensor a (3072) must match the size of tensor b (16) at non-singleton dimension 0
请问我上面的哪个参数出问题呢?如何解决呢
The text was updated successfully, but these errors were encountered:
sft的参数:
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path /data/app/chatglm2-6b \
    --template chatglm2 \
    --do_train \
    --dataset sre_train \
    --dataset_dir data \
    --finetuning_type lora \
    --lora_target query_key_value \
    --output_dir sre/sft \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 2 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --learning_rate 5e-5 \
    --num_train_epochs 10.0 \
    --plot_loss \
    --fp16
rm的参数:
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage rm \
    --model_name_or_path /data/app/chatglm2-6b \
    --do_train \
    --dataset comparison_gpt4_zh \
    --template chatglm2 \
    --lora_target query_key_value \
    --finetuning_type lora \
    --resume_lora_training False \
    --checkpoint_dir sre/sft \
    --output_dir sre/rm \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 1e-5 \
    --num_train_epochs 1.0 \
    --plot_loss \
    --fp16
ppo的参数:
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage ppo \
    --model_name_or_path /data/app/chatglm2-6b \
    --do_train \
    --dataset sre_train \
    --template chatglm2 \
    --lora_target query_key_value \
    --finetuning_type lora \
    --resume_lora_training False \
    --checkpoint_dir sre/sft \
    --reward_model sre/rm \
    --output_dir sre/ppo \
    --overwrite_cache \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 500 \
    --learning_rate 1e-5 \
    --num_train_epochs 5.0 \
    --plot_loss
报错信息:
[INFO|configuration_utils.py:599] 2023-08-16 00:40:02,205 >> Generate config GenerationConfig {
"_from_model_config": true,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}
0%| | 0/1690 [00:22<?, ?it/s]
Traceback (most recent call last):
File "/data/app/LLaMA-Efficient-Tuning/src/train_bash.py", line 14, in <module>
main()
File "/data/app/LLaMA-Efficient-Tuning/src/train_bash.py", line 5, in main
run_exp()
File "/data/app/LLaMA-Efficient-Tuning/src/llmtuner/tuner/tune.py", line 30, in run_exp
run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/data/app/LLaMA-Efficient-Tuning/src/llmtuner/tuner/ppo/workflow.py", line 75, in run_ppo
ppo_trainer.ppo_train(max_target_length=data_args.max_target_length)
File "/data/app/LLaMA-Efficient-Tuning/src/llmtuner/tuner/ppo/trainer.py", line 106, in ppo_train
stats = self.step(queries, responses, rewards)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/worker/miniconda3/envs/llama/lib/python3.11/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/data/worker/miniconda3/envs/llama/lib/python3.11/site-packages/trl/trainer/ppo_trainer.py", line 680, in step
values, advantages, returns = self.compute_advantages(values, rewards, masks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/worker/miniconda3/envs/llama/lib/python3.11/site-packages/trl/trainer/ppo_trainer.py", line 1061, in compute_advantages
values = values * mask
~~~~~~~^~~~~~
RuntimeError: The size of tensor a (3072) must match the size of tensor b (16) at non-singleton dimension 0
请问我上面的哪个参数出问题呢?如何解决呢
The text was updated successfully, but these errors were encountered: