
DPO training error: AttributeError: 'CustomDPOTrainer' object has no attribute 'f_divergence_type' #4742

Closed
1 task done
brillianti opened this issue Jul 9, 2024 · 11 comments
Labels
solved This problem has been already solved

Comments

@brillianti

brillianti commented Jul 9, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.8.3.dev0
  • Platform: Linux-5.4.143.bsk.7-amd64-x86_64-with-glibc2.31
  • Python version: 3.9.2
  • PyTorch version: 2.1.0+cu121 (GPU)
  • Transformers version: 4.42.3
  • Datasets version: 2.20.0
  • Accelerate version: 0.32.1
  • PEFT version: 0.11.1
  • TRL version: 0.9.6
  • GPU type: NVIDIA A100-SXM4-80GB
  • DeepSpeed version: 0.8.3

Reproduction

```
Traceback (most recent call last):
  File "/home/tiger/.local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
    run_exp()
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/tuner.py", line 56, in run_exp
    run_dpo(model_args, data_args, training_args, finetuning_args, callbacks)
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/workflow.py", line 79, in run_dpo
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1932, in train
    return inner_training_loop(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2268, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 3307, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/tiger/.local/lib/python3.9/site-packages/trl/trainer/dpo_trainer.py", line 1408, in compute_loss
    loss, metrics = self.get_batch_loss_metrics(model, inputs, train_eval="train")
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/trainer.py", line 229, in get_batch_loss_metrics
    losses, chosen_rewards, rejected_rewards = self.compute_preference_loss(
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/trainer.py", line 160, in compute_preference_loss
    losses, chosen_rewards, rejected_rewards = self.dpo_loss(
  File "/home/tiger/.local/lib/python3.9/site-packages/trl/trainer/dpo_trainer.py", line 1073, in dpo_loss
    if self.f_divergence_type == FDivergenceType.ALPHA_DIVERGENCE.value:
AttributeError: 'CustomDPOTrainer' object has no attribute 'f_divergence_type'
```

Expected behavior

DPO training runs without error.

Others

No response

@github-actions github-actions bot added the "pending (This problem is yet to be addressed)" label on Jul 9, 2024
@hiyouga hiyouga closed this as completed in 2f09520 on Jul 9, 2024
@hiyouga hiyouga added the "solved (This problem has been already solved)" label and removed the "pending" label on Jul 9, 2024
@lzzzx666

Hi, have you solved this problem? I'm running into it as well.

@hiyouga
Owner

hiyouga commented Jul 10, 2024

@lzzzx666 update the llamafactory

@2481241414

> @lzzzx666 update the llamafactory

I still get the same error after updating.

xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue Jul 17, 2024
@aijianiula0601

Has this been solved?

@lzzzx666

> Has this been solved?

Yes, it's solved.

@jianhai0527

> Has this been solved?
>
> Yes, it's solved.

Hi, how did you solve it?

@lzzzx666

> Has this been solved?
>
> Yes, it's solved.
>
> Hi, how did you solve it?

Just update the repo. My original copy was missing the line self.f_divergence_type = "reverse_kl".

@LiJianmin6706

> Has this been solved?
>
> Yes, it's solved.
>
> Hi, how did you solve it?
>
> Just update the repo. My original copy was missing the line self.f_divergence_type = "reverse_kl".

Where should this line be added?

@lzzzx666

@LiJianmin6706 In the DPO trainer file, add it inside the __init__ method of the class CustomDPOTrainer(DPOTrainer); see the sketch below.
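
A minimal sketch of where that line goes, assuming (as in the version from this issue) that CustomDPOTrainer in src/llamafactory/train/dpo/trainer.py calls Trainer.__init__ directly instead of DPOTrainer.__init__, so defaults that trl 0.9.x sets in DPOTrainer.__init__, such as f_divergence_type, never get assigned. The signature here is simplified; the real __init__ takes more arguments:

```python
from transformers import Trainer
from trl import DPOTrainer


class CustomDPOTrainer(DPOTrainer):
    def __init__(self, model, **kwargs):
        # ... other setup elided ...
        # This subclass initializes the plain Trainer rather than DPOTrainer,
        # so any attribute trl's DPOTrainer.__init__ would set must be set here.
        Trainer.__init__(self, model=model, **kwargs)
        # The attribute this issue reports as missing: trl 0.9.x reads it in
        # dpo_loss(). "reverse_kl" matches trl's default
        # (FDivergenceType.REVERSE_KL.value).
        self.f_divergence_type = "reverse_kl"
```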

@LiJianmin6706

> @LiJianmin6706 In the DPO trainer file, add it inside the __init__ method of the class CustomDPOTrainer(DPOTrainer).

Thanks, but the repo code I downloaded already contains this line, and I still get the same error when I run it.

@lzzzx666

@LiJianmin6706 self.f_divergence_type is inherited from the DPOTrainer in trl, so check whether the problem is the version of a library such as transformers or trl; a quick check is sketched below.
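
If the line is already present but the error persists, a debugging sketch (not code from the repo) to confirm which library versions the failing environment actually imports, and whether the installed trl.DPOTrainer sets the attribute in its __init__:

```python
# Print the versions of the libraries involved in this traceback.
import inspect

import transformers
import trl
from trl import DPOTrainer

print("transformers:", transformers.__version__)
print("trl:", trl.__version__)
# trl 0.9.x assigns f_divergence_type in DPOTrainer.__init__ and reads it in
# dpo_loss(); if this prints False, the installed trl predates that attribute.
print("sets f_divergence_type:",
      "f_divergence_type" in inspect.getsource(DPOTrainer.__init__))
```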

7 participants