DPO training error: AttributeError: 'CustomDPOTrainer' object has no attribute 'f_divergence_type' #4742
Comments
Hi, have you solved the problem? I've run into it as well.
@lzzzx666 update llamafactory
Still getting the error after updating.
Has this been solved?
Solved.
Hi, how did you solve it?
self.f_divergence_type = "reverse_kl"
Where do I add this line?
@LiJianmin6706 In the dpo trainer file: add it inside the __init__ function of the class CustomDPOTrainer(DPOTrainer).
Thanks, but the repository code I downloaded already has this line, and I still get the same error when running.
@LiJianmin6706 self.f_divergence_type is inherited from the dpo trainer in trl. Check whether it's a version problem with libraries like transformers.
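For anyone landing here, a minimal sketch of the workaround described in the comments above. The placement (inside CustomDPOTrainer.__init__ in src/llamafactory/train/dpo/trainer.py) comes from those comments; the value "reverse_kl" is an assumption mirroring trl's default, FDivergenceType.REVERSE_KL.value:

```python
# Sketch only: goes inside CustomDPOTrainer.__init__, next to the other
# self.* hyperparameter assignments. trl's dpo_loss() branches on this
# attribute, but it is normally set in DPOTrainer.__init__, which this
# subclass does not run, so it has to be assigned explicitly.
self.f_divergence_type = "reverse_kl"  # assumed value; equals FDivergenceType.REVERSE_KL.value in trl
```

If the line is already in your checkout and the error persists, the mismatch is more likely between the installed trl/transformers versions and the code, as the last comment suggests.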
Reminder
System Info
llamafactory version: 0.8.3.dev0
Reproduction
```
Traceback (most recent call last):
  File "/home/tiger/.local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
    run_exp()
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/tuner.py", line 56, in run_exp
    run_dpo(model_args, data_args, training_args, finetuning_args, callbacks)
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/workflow.py", line 79, in run_dpo
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1932, in train
    return inner_training_loop(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2268, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 3307, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/tiger/.local/lib/python3.9/site-packages/trl/trainer/dpo_trainer.py", line 1408, in compute_loss
    loss, metrics = self.get_batch_loss_metrics(model, inputs, train_eval="train")
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/trainer.py", line 229, in get_batch_loss_metrics
    losses, chosen_rewards, rejected_rewards = self.compute_preference_loss(
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/trainer.py", line 160, in compute_preference_loss
    losses, chosen_rewards, rejected_rewards = self.dpo_loss(
  File "/home/tiger/.local/lib/python3.9/site-packages/trl/trainer/dpo_trainer.py", line 1073, in dpo_loss
    if self.f_divergence_type == FDivergenceType.ALPHA_DIVERGENCE.value:
AttributeError: 'CustomDPOTrainer' object has no attribute 'f_divergence_type'
```
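As an illustrative guess at the mechanism (hypothetical classes, not the project's actual code): trl assigns f_divergence_type in DPOTrainer.__init__, so a subclass whose __init__ bypasses that initializer is missing the attribute by the time dpo_loss reads it:

```python
# Minimal self-contained reproduction of the failure mode: a subclass that
# skips its parent's __init__ never receives the attributes the parent
# would have set, and the parent's methods then fail at attribute lookup.
class BaseTrainer:
    def __init__(self) -> None:
        self.f_divergence_type = "reverse_kl"  # set only here

    def dpo_loss(self) -> str:
        return self.f_divergence_type  # assumes __init__ ran


class CustomTrainer(BaseTrainer):
    def __init__(self) -> None:
        pass  # bypasses BaseTrainer.__init__, so the attribute is never set


CustomTrainer().dpo_loss()
# AttributeError: 'CustomTrainer' object has no attribute 'f_divergence_type'
```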
Expected behavior
DPO training completes normally.
Others
No response