
fix: Train on the last turn only truncate bug #5115

Merged: 2 commits merged on Aug 9, 2024
Conversation

@YeQiuO (Contributor) commented on Aug 8, 2024

What does this PR do?

The "train on the last turn only" behavior proposed in #4878 meets the needs of scenarios such as Distillation and Reflective Chain, but its implementation has a small problem:

When the total conversation length exceeds cutoff_len, the loop over encoded_pairs exits early (truncation), so the most important part, the last turn, never enters the training data; the labels of such an example are then all IGNORE_INDEX, which makes it meaningless.
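The failure mode can be sketched roughly as follows. This is a minimal sketch, not the exact source: names such as `encoded_pairs`, `cutoff_len`, `train_on_last_turn`, and `IGNORE_INDEX` follow the PR discussion and are assumptions about the surrounding code.

```python
# Minimal sketch of the buggy forward-iteration pattern (assumed names).
input_ids, labels = [], []
for turn_idx, (source_ids, target_ids) in enumerate(encoded_pairs):
    if len(input_ids) + len(source_ids) + len(target_ids) > cutoff_len:
        break  # truncation: later turns, including the last one, are dropped
    input_ids += source_ids + target_ids
    if train_on_last_turn and turn_idx != len(encoded_pairs) - 1:
        # earlier turns are masked out when only the last turn is trained
        labels += [IGNORE_INDEX] * (len(source_ids) + len(target_ids))
    else:
        labels += [IGNORE_INDEX] * len(source_ids) + target_ids
# If the loop breaks before reaching the last turn, every label is
# IGNORE_INDEX and the example contributes nothing to the loss.
```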

To fix the truncation problem while still guaranteeing that the last turn is kept, two solutions are feasible:

  1. Iterate over encoded_pairs in reverse and prepend new elements to the head of input_ids and labels, leaving the rest of the logic unchanged;
  2. Maximize the number of turns retained after truncation, i.e., drop the longest single turns.

This PR adopts the first approach, to keep the change minimally invasive to the original code (see the sketch below).
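A rough sketch of how the first approach might look, under the same assumed names as above:

```python
# Minimal sketch of approach 1: iterate over encoded_pairs in reverse and
# prepend, so the last turn is encoded first and always survives truncation.
input_ids, labels = [], []
for turn_idx, (source_ids, target_ids) in reversed(list(enumerate(encoded_pairs))):
    if len(input_ids) + len(source_ids) + len(target_ids) > cutoff_len:
        break  # truncation now drops the earliest turns instead of the last
    if train_on_last_turn and turn_idx != len(encoded_pairs) - 1:
        turn_labels = [IGNORE_INDEX] * (len(source_ids) + len(target_ids))
    else:
        turn_labels = [IGNORE_INDEX] * len(source_ids) + target_ids
    # prepending keeps input_ids and labels in the original chronological
    # order; only the iteration order is reversed
    input_ids = source_ids + target_ids + input_ids
    labels = turn_labels + labels
```

Note that the resulting token sequence is identical to the forward-iteration version; reversing only changes which turns survive truncation.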

Fixes #4684


@hiyouga added the pending (This problem is yet to be addressed) label on Aug 9, 2024
@hiyouga (Owner) left a comment:

LGTM!

@hiyouga merged commit 51542cb into hiyouga:main on Aug 9, 2024 (1 check passed)
@hiyouga added the solved (This problem has been already solved) label and removed the pending label on Aug 9, 2024
hiyouga added a commit that referenced this pull request on Aug 9, 2024
@Quarkstar commented:

Hello! Why was reverse iteration over the conversation chosen? I am a bit worried this could affect reasoning tasks, especially chain-of-thought, where each step usually depends on the result of the previous one. Would training in reverse like this hurt training performance?

Successfully merging this pull request may close these issues.

Question: Can history messages serve only as context during training, without participating in the model's predictions?