
Full fine-tuning of ChatGLM3 underperforms the official repo; please help analyze #2991

Closed
1 task done
charliedream1 opened this issue Mar 26, 2024 · 9 comments
Labels
solved This problem has been already solved

Comments

@charliedream1

charliedream1 commented Mar 26, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

When I do full fine-tuning, the training results always fall noticeably short of a model trained with the official ChatGLM code. I compared the parameters and the underlying code in detail but could not find the cause. With the same settings and 2 epochs, the official code trains roughly twice as slowly as this repo's code, with about twice the GPU power draw and utilization, yet its loss convergence and final quality are clearly better than ours. I also tried several versions of transformers with no improvement.

With this repo's code, even when I increase the number of epochs, the model never seems to fully absorb the data compared with the official ChatGLM3 code; the learning feels shallow.

Please help analyze what is causing this. Many thanks.

deepspeed --num_gpus 4 ../../src/train_bash.py \
    --deepspeed ../deepspeed/ds_z3_config.json \
    --stage sft \
    --do_train \
    --model_name_or_path chatglm3 \
    --dataset train \
    --dataset_dir ../../data \
    --template default \
    --finetuning_type full \
    --output_dir ../../saves/full/sft \
    --overwrite_cache \
    --overwrite_output_dir \
    --cutoff_len 4096 \
    --preprocessing_num_workers 16 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --eval_steps 100 \
    --evaluation_strategy steps \
    --learning_rate 5e-5 \
    --num_train_epochs 2.0 \
    --val_size 0.1 \
    --ddp_timeout 1800000 \
    --plot_loss \
    --fp16
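For context, the `--deepspeed` flag above points at a ZeRO stage-3 config. The exact file from the repo is not shown in this issue; a minimal sketch of what such a `ds_z3_config.json` typically contains (using DeepSpeed's documented schema, with `"auto"` values resolved by the HuggingFace Trainer integration) would be:

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "fp16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

The `stage3_gather_16bit_weights_on_model_save` option matters for full fine-tuning: without it, ZeRO-3 saves sharded parameters and the checkpoint cannot be loaded as a plain model.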

Expected behavior

No response

System Info

No response

Others

No response

@charliedream1 charliedream1 changed the title from "Full fine-tuning of ChatGLM3 underperforms the official repo" to "Full fine-tuning of ChatGLM3 underperforms the official repo; please help analyze" Mar 26, 2024
@hiyouga
Owner

hiyouga commented Mar 26, 2024

Your template isn't set correctly; you should use chatglm3.
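Concretely, that suggestion amounts to replacing `--template default` with `--template chatglm3` in the launch command from the issue body. An abbreviated sketch (all other flags unchanged from the original command):

```shell
# The chat template must match the model family: ChatGLM3 uses its own
# special tokens and turn separators, so training with "default" formats
# the data differently from what the model saw during pretraining.
deepspeed --num_gpus 4 ../../src/train_bash.py \
    --deepspeed ../deepspeed/ds_z3_config.json \
    --stage sft \
    --model_name_or_path chatglm3 \
    --template chatglm3 \
    --finetuning_type full \
    --output_dir ../../saves/full/sft
```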

@hiyouga hiyouga added the solved This problem has been already solved label Mar 26, 2024
@hiyouga hiyouga closed this as completed Mar 26, 2024
@charliedream1
Author

I did select chatglm3; the command pasted in the issue is wrong. My actual launch script contains many paths and variable names, so I copied the command from this repo's examples instead.

@hiyouga
Owner

hiyouga commented Mar 26, 2024

I suggest checking whether the environments and configurations of the two setups are identical.

@charliedream1
Author

charliedream1 commented Mar 26, 2024 via email

@hiyouga
Owner

hiyouga commented Mar 26, 2024

Could you try this version of the code?
https://github.com/hiyouga/LLaMA-Factory/tree/v0.5.3

@charliedream1
Author

charliedream1 commented Mar 26, 2024 via email

@hiyouga hiyouga reopened this Mar 26, 2024
@hiyouga
Owner

hiyouga commented Mar 26, 2024

We seem to have located the cause: a bug was introduced in a recent update.

@hiyouga
Owner

hiyouga commented Mar 26, 2024

Please update to the latest version (3bcd41b) and retry.

@hiyouga hiyouga closed this as completed Mar 26, 2024
@charliedream1
Author

charliedream1 commented Mar 26, 2024 via email
