
fix: Train on the last turn only truncate bug #5115

Merged: 2 commits merged on Aug 9, 2024
Conversation

@YeQiuO (Contributor) commented on Aug 8, 2024

What does this PR do?

The "train on the last turn only" behavior proposed in #4878 meets the needs of scenarios such as Distillation and Reflective Chain, but its implementation has a small problem:

When the total conversation length exceeds cutoff_len, the loop over encoded_pairs exits early (truncation), so the most important part, the last turn, never enters the training data; the labels of such an example are then all IGNORE_INDEX, which makes it meaningless.
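The failure mode can be sketched roughly as follows. This is a minimal sketch, not the exact source: names such as `encoded_pairs`, `cutoff_len`, `train_on_last_turn`, and `IGNORE_INDEX` follow the PR discussion and are assumptions about the surrounding code.

```python
# Minimal sketch of the buggy forward-iteration pattern (assumed names).
input_ids, labels = [], []
for turn_idx, (source_ids, target_ids) in enumerate(encoded_pairs):
    if len(input_ids) + len(source_ids) + len(target_ids) > cutoff_len:
        break  # truncation: later turns, including the last one, are dropped
    input_ids += source_ids + target_ids
    if train_on_last_turn and turn_idx != len(encoded_pairs) - 1:
        # earlier turns are masked out when only the last turn is trained
        labels += [IGNORE_INDEX] * (len(source_ids) + len(target_ids))
    else:
        labels += [IGNORE_INDEX] * len(source_ids) + target_ids
# If the loop breaks before reaching the last turn, every label is
# IGNORE_INDEX and the example contributes nothing to the loss.
```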

To fix the truncation problem while still guaranteeing that the last turn is kept, two solutions are feasible:

  1. Iterate over encoded_pairs in reverse and prepend new elements to the head of input_ids and labels, leaving the rest of the logic unchanged;
  2. Maximize the number of turns retained after truncation, i.e., drop the longest single turns.

This PR adopts the first approach, to keep the change minimally invasive to the original code (see the sketch below).
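A rough sketch of how the first approach might look, under the same assumed names as above:

```python
# Minimal sketch of approach 1: iterate over encoded_pairs in reverse and
# prepend, so the last turn is encoded first and always survives truncation.
input_ids, labels = [], []
for turn_idx, (source_ids, target_ids) in reversed(list(enumerate(encoded_pairs))):
    if len(input_ids) + len(source_ids) + len(target_ids) > cutoff_len:
        break  # truncation now drops the earliest turns instead of the last
    if train_on_last_turn and turn_idx != len(encoded_pairs) - 1:
        turn_labels = [IGNORE_INDEX] * (len(source_ids) + len(target_ids))
    else:
        turn_labels = [IGNORE_INDEX] * len(source_ids) + target_ids
    # prepending keeps input_ids and labels in the original chronological
    # order; only the iteration order is reversed
    input_ids = source_ids + target_ids + input_ids
    labels = turn_labels + labels
```

Note that the resulting token sequence is identical to the forward-iteration version; reversing only changes which turns survive truncation.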

Fixes #4684


@hiyouga added the pending (This problem is yet to be addressed) label on Aug 9, 2024
@hiyouga (Owner) left a comment:

LGTM!

@hiyouga merged commit 51542cb into hiyouga:main on Aug 9, 2024 (1 check passed)
@hiyouga added the solved (This problem has been already solved) label and removed the pending label on Aug 9, 2024
hiyouga added a commit that referenced this pull request on Aug 9, 2024
@Quarkstar commented:

Hello! Why was reverse iteration over the conversation chosen? I am a bit worried this could affect reasoning tasks, especially chain-of-thought, where each step usually depends on the result of the previous one. Would training in reverse like this hurt training performance?

Successfully merging this pull request may close these issues.

Question: Can history messages serve only as context during training, without participating in the model's predictions?