In DPO training, the way the prompt and answer are concatenated prevents the cutoff_length hyperparameter from truncating the data effectively. #4617

Closed
THZdyjy opened this issue Jun 29, 2024 · 2 comments
Labels
solved This problem has been already solved

Comments


THZdyjy commented Jun 29, 2024

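[image: source code before the change]
As shown in the screenshot above, when the source code concatenates the prompt with the rejected response, the prompt it uses is chosen_prompt rather than rejected_prompt. As a result, when cutoff_length=2048 is set, the rejected data is not truncated effectively.
After modifying the code as shown in the screenshot below, the data is truncated correctly according to cutoff_length.
[image: source code after the change]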
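
The mismatch described above can be illustrated with a minimal sketch. The real code is only visible in the screenshots, so `truncate_pair` below is a hypothetical stand-in, not the project's function. It assumes the prompt and response are shortened proportionally until the pair fits the cutoff. All token lists are made-up placeholders.

```python
from typing import List, Tuple


def truncate_pair(
    prompt: List[int], response: List[int], cutoff: int
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: shrink prompt and response proportionally
    so that together they fit within `cutoff` tokens."""
    total = len(prompt) + len(response)
    if total <= cutoff:
        return prompt, response
    max_prompt = max(1, cutoff * len(prompt) // total)
    max_response = cutoff - max_prompt
    return prompt[:max_prompt], response[:max_response]


cutoff = 2048
prompt = list(range(1500))    # placeholder token ids
chosen = list(range(100))     # short chosen answer
rejected = list(range(3000))  # long rejected answer

# Each pair is truncated on its own, so the two prompts can differ in length.
chosen_prompt, chosen_ids = truncate_pair(prompt, chosen, cutoff)
rejected_prompt, rejected_ids = truncate_pair(prompt, rejected, cutoff)

# Behaviour reported in the issue: rejected_ids is joined to chosen_prompt.
mixed = chosen_prompt + rejected_ids
# Change proposed in the issue: rejected_ids is joined to rejected_prompt.
matched = rejected_prompt + rejected_ids

print(len(mixed))    # 2866, longer than the cutoff
print(len(matched))  # 2048, within the cutoff
```

With a short chosen answer the chosen pair needs no truncation, so its prompt stays at full length. Reusing that full-length prompt with a rejected answer that was truncated against a shorter prompt is what pushes the sequence past the cutoff.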

github-actions bot added the pending label (This problem is yet to be addressed) Jun 29, 2024

niravlg commented Jun 30, 2024

I believe this issue has already been raised in #4402.

As far as I understand, the solution suggested above changes the prompt used for the chosen and rejected responses in DPO, which would likely affect training and results. Instead, I believe the implementation should follow the DPO Trainer's implementation in TRL:
https://github.com/huggingface/trl/blob/main/trl/trainer/dpo_trainer.py
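
One way to keep a single prompt for both responses while still respecting a length limit is sketched below. This is only an illustration under stated assumptions: the function `truncate_shared_prompt` and the parameter names `max_length`, `max_prompt_length`, and `keep_end` are hypothetical, and the code is not copied from or checked against the linked trainer.

```python
from typing import List, Tuple


def truncate_shared_prompt(
    prompt: List[int],
    chosen: List[int],
    rejected: List[int],
    max_length: int,
    max_prompt_length: int,
    keep_end: bool = True,
) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical helper: truncate the prompt once, based on the longer
    response, then fit both responses into the remaining token budget."""
    if not 0 < max_prompt_length < max_length:
        raise ValueError("need 0 < max_prompt_length < max_length")
    longer = max(len(chosen), len(rejected))
    # Shorten the prompt only if the longer sequence would overflow.
    if len(prompt) + longer > max_length:
        if keep_end:
            prompt = prompt[-max_prompt_length:]
        else:
            prompt = prompt[:max_prompt_length]
    # Whatever the prompt leaves over is shared by both responses.
    budget = max_length - len(prompt)
    return prompt, chosen[:budget], rejected[:budget]


prompt = list(range(1500))    # placeholder token ids
chosen = list(range(100))
rejected = list(range(3000))

shared, chosen_ids, rejected_ids = truncate_shared_prompt(
    prompt, chosen, rejected, max_length=2048, max_prompt_length=1024
)
chosen_input_ids = shared + chosen_ids
rejected_input_ids = shared + rejected_ids
print(len(shared), len(chosen_input_ids), len(rejected_input_ids))
```

Because the prompt is cut once and then reused, both sequences stay within the limit and the chosen and rejected responses are still conditioned on the same prompt tokens.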

PS: The issue above is not written in my native language. I used ChatGPT to translate it into English. I apologize in advance for any confusion.

hiyouga added a commit that referenced this issue Jun 30, 2024
Deprecate reserved_label_len arg
hiyouga (Owner) commented Jun 30, 2024

fixed

hiyouga added the solved label (This problem has been already solved) and removed the pending label (This problem is yet to be addressed) Jun 30, 2024
hiyouga closed this as completed Jun 30, 2024
PrimaLuz pushed a commit to PrimaLuz/LLaMA-Factory that referenced this issue Jul 1, 2024
Deprecate reserved_label_len arg
xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue Jul 17, 2024
Deprecate reserved_label_len arg