
DPO training error: AttributeError: 'CustomDPOTrainer' object has no attribute 'f_divergence_type' #4742

Closed
1 task done
brillianti opened this issue Jul 9, 2024 · 11 comments
Labels
solved This problem has been already solved

Comments

@brillianti

brillianti commented Jul 9, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.8.3.dev0
  • Platform: Linux-5.4.143.bsk.7-amd64-x86_64-with-glibc2.31
  • Python version: 3.9.2
  • PyTorch version: 2.1.0+cu121 (GPU)
  • Transformers version: 4.42.3
  • Datasets version: 2.20.0
  • Accelerate version: 0.32.1
  • PEFT version: 0.11.1
  • TRL version: 0.9.6
  • GPU type: NVIDIA A100-SXM4-80GB
  • DeepSpeed version: 0.8.3

Reproduction

```
Traceback (most recent call last):
  File "/home/tiger/.local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/cli.py", line 111, in main
    run_exp()
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/tuner.py", line 56, in run_exp
    run_dpo(model_args, data_args, training_args, finetuning_args, callbacks)
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/workflow.py", line 79, in run_dpo
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 1932, in train
    return inner_training_loop(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 2268, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/trainer.py", line 3307, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/tiger/.local/lib/python3.9/site-packages/trl/trainer/dpo_trainer.py", line 1408, in compute_loss
    loss, metrics = self.get_batch_loss_metrics(model, inputs, train_eval="train")
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/trainer.py", line 229, in get_batch_loss_metrics
    losses, chosen_rewards, rejected_rewards = self.compute_preference_loss(
  File "/mnt/bn/aigc-t2i/lifanshi/code/LLaMA-Factory/src/llamafactory/train/dpo/trainer.py", line 160, in compute_preference_loss
    losses, chosen_rewards, rejected_rewards = self.dpo_loss(
  File "/home/tiger/.local/lib/python3.9/site-packages/trl/trainer/dpo_trainer.py", line 1073, in dpo_loss
    if self.f_divergence_type == FDivergenceType.ALPHA_DIVERGENCE.value:
AttributeError: 'CustomDPOTrainer' object has no attribute 'f_divergence_type'
```

Expected behavior

DPO training runs without error.

Others

No response

@github-actions github-actions bot added the "pending (This problem is yet to be addressed)" label on Jul 9, 2024
@hiyouga hiyouga closed this as completed in 2f09520 on Jul 9, 2024
@hiyouga hiyouga added the "solved (This problem has been already solved)" label and removed the "pending" label on Jul 9, 2024
@lzzzx666

Hi, have you solved this problem? I'm running into it as well.

@hiyouga
Owner

hiyouga commented Jul 10, 2024

@lzzzx666 update the llamafactory

@2481241414

> @lzzzx666 update the llamafactory

I still get the same error after updating.

xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue Jul 17, 2024
@aijianiula0601

Has this been solved?

@lzzzx666

> Has this been solved?

Yes, it's solved.

@jianhai0527

> Has this been solved?
>
> Yes, it's solved.

Hi, how did you solve it?

@lzzzx666

> Has this been solved?
>
> Yes, it's solved.
>
> Hi, how did you solve it?

Just update the repo. My original copy was missing the line self.f_divergence_type = "reverse_kl".

@LiJianmin6706

> Has this been solved?
>
> Yes, it's solved.
>
> Hi, how did you solve it?
>
> Just update the repo. My original copy was missing the line self.f_divergence_type = "reverse_kl".

Where should this line be added?

@lzzzx666

@LiJianmin6706 In the DPO trainer file, add it inside the __init__ method of the class CustomDPOTrainer(DPOTrainer); see the sketch below.
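
A minimal sketch of where that line goes, assuming (as in the version from this issue) that CustomDPOTrainer in src/llamafactory/train/dpo/trainer.py calls Trainer.__init__ directly instead of DPOTrainer.__init__, so defaults that trl 0.9.x sets in DPOTrainer.__init__, such as f_divergence_type, never get assigned. The signature here is simplified; the real __init__ takes more arguments:

```python
from transformers import Trainer
from trl import DPOTrainer


class CustomDPOTrainer(DPOTrainer):
    def __init__(self, model, **kwargs):
        # ... other setup elided ...
        # This subclass initializes the plain Trainer rather than DPOTrainer,
        # so any attribute trl's DPOTrainer.__init__ would set must be set here.
        Trainer.__init__(self, model=model, **kwargs)
        # The attribute this issue reports as missing: trl 0.9.x reads it in
        # dpo_loss(). "reverse_kl" matches trl's default
        # (FDivergenceType.REVERSE_KL.value).
        self.f_divergence_type = "reverse_kl"
```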

@LiJianmin6706

> @LiJianmin6706 In the DPO trainer file, add it inside the __init__ method of the class CustomDPOTrainer(DPOTrainer).

Thanks, but the repo code I downloaded already contains this line, and I still get the same error when I run it.

@lzzzx666

@LiJianmin6706 self.f_divergence_type is inherited from the DPOTrainer in trl, so check whether the problem is the version of a library such as transformers or trl; a quick check is sketched below.
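
If the line is already present but the error persists, a debugging sketch (not code from the repo) to confirm which library versions the failing environment actually imports, and whether the installed trl.DPOTrainer sets the attribute in its __init__:

```python
# Print the versions of the libraries involved in this traceback.
import inspect

import transformers
import trl
from trl import DPOTrainer

print("transformers:", transformers.__version__)
print("trl:", trl.__version__)
# trl 0.9.x assigns f_divergence_type in DPOTrainer.__init__ and reads it in
# dpo_loss(); if this prints False, the installed trl predates that attribute.
print("sets f_divergence_type:",
      "f_divergence_type" in inspect.getsource(DPOTrainer.__init__))
```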

7 participants