-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
在进行ppo训练时发生下面错误 #3116
Labels
solved
This problem has been already solved
Comments
fixed |
--report_to none 这个参数去掉就可以了,train_web.py生成的命令也需要做对应的调整 |
新版代码应该已经彻底修复了。 |
tybalex
added a commit
to sanjay920/LLaMA-Factory
that referenced
this issue
Apr 10, 2024
* fix packages * Update wechat.jpg * Updated README with new information * Updated README with new information * Updated README with new information * Follow HF_ENDPOINT environment variable * fix hiyouga#2346 * fix hiyouga#2777 hiyouga#2895 * add orca_dpo_pairs dataset * support fsdp + qlora * update readme * update tool extractor * paper release * add citation * move file * Update README.md, fix the release date of the paper * Update README_zh.md, fix the release date of the paper * Update wechat.jpg * fix hiyouga#2941 * fix hiyouga#2928 * fix hiyouga#2936 * fix Llama lora merge crash * fix Llama lora merge crash * fix Llama lora merge crash * pass ruff check * tiny fix * Update requirements.txt * Update README_zh.md * release v0.6.0 * add arg check * Update README_zh.md * Update README.md * update readme * tiny fix * release v0.6.0 (real) * Update wechat.jpg * fix hiyouga#2961 * fix bug * fix hiyouga#2981 * fix ds optimizer * update trainers * fix hiyouga#3010 * update readme * fix hiyouga#2982 * add project * update readme * release v0.6.1 * Update wechat.jpg * fix pile datset hf hub url * upgrade gradio to 4.21.0 * support save args in webui hiyouga#2807 hiyouga#3046 some ideas are borrowed from @marko1616 * Fix Llama model save for full param train * fix blank line contains whitespace * tiny fix * support ORPO * support orpo in webui * update readme * use log1p in orpo loss huggingface/trl#1491 * fix plots * fix IPO and ORPO loss * fix ORPO loss * update webui * support infer 4bit model on GPUs hiyouga#3023 * fix hiyouga#3077 * add qwen1.5 moe * fix hiyouga#3083 * set dev version * Update SECURITY.md * fix hiyouga#3022 * add moe aux loss control hiyouga#3085 * simplify readme * update readme * update readme * update examples * update examples * add zh readme * update examples * update readme * update vllm example * Update wechat.jpg * fix hiyouga#3116 * fix resize vocab at inference hiyouga#3022 * fix requires for windows * fix bug in latest gradio * back to gradio 4.21 and fix chat * tiny fix * update examples * update readme * support Qwen1.5-32B * support Qwen1.5-32B * fix spell error * support hiyouga#3152 * rename template to breeze * rename template to breeze * add empty line * Update wechat.jpg * tiny fix * fix quant infer and qwen2moe * Pass additional_target to unsloth Fixes hiyouga#3200 * Update adapter.py * Update adapter.py * fix hiyouga#3225 --------- Co-authored-by: hiyouga <[email protected]> Co-authored-by: 刘一博 <[email protected]> Co-authored-by: khazic <[email protected]> Co-authored-by: SirlyDreamer <[email protected]> Co-authored-by: Sanjay Nadhavajhala <[email protected]> Co-authored-by: sanjay920 <[email protected]> Co-authored-by: 0xez <[email protected]> Co-authored-by: marko1616 <[email protected]> Co-authored-by: Remek Kinas <[email protected]> Co-authored-by: Tsumugii24 <[email protected]> Co-authored-by: li.yunhao <[email protected]> Co-authored-by: sliderSun <[email protected]> Co-authored-by: codingma <[email protected]> Co-authored-by: Erich Schubert <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Reminder
Reproduction
具体的训练参数如下:
python src/train_bash.py --stage ppo --do_train True --model_name_or_path Qwen/Qwen1.5-0.5B-Chat --adapter_name_or_path saves\Qwen1.5-0.5B-Chat\lora\20240403_qianwen_sft --finetuning_type lora --template qwen --dataset_dir data --dataset alpaca_gpt4_zh --cutoff_len 1024 --learning_rate 5e-05 --num_train_epochs 3.0 --max_samples 100000 --per_device_train_batch_size 2 --gradient_accumulation_steps 8 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 100 --warmup_steps 0 --optim adamw_torch --report_to none --output_dir saves\Qwen1.5-0.5B-Chat\lora\train_2024-04-03-13-59-00 --fp16 True --lora_rank 8 --lora_alpha 16 --lora_dropout 0.1 --lora_target q_proj,v_proj --reward_model saves\Qwen1.5-0.5B-Chat\lora\20240403_qianwen_rm --reward_model_type lora --plot_loss True
Expected behavior
No response
System Info
No response
Others
No response
The text was updated successfully, but these errors were encountered: