
Releases: hiyouga/LLaMA-Factory

v0.6.0: Paper Release, GaLore and FSDP+QLoRA

25 Mar 15:50

We released our paper on arXiv! Thanks to all co-authors and to AK for the recommendation.

New features

  • Support the GaLore algorithm, allowing full-parameter learning of a 7B model with less than 24GB of VRAM
  • Support FSDP+QLoRA, which enables QLoRA fine-tuning of a 70B model on 2x 24GB GPUs
  • Support the LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
  • LLaMA Factory 🤝 vLLM, enjoy 270% inference speed with --infer_backend vllm
  • Add a Colab notebook for getting started easily
  • Support pushing fine-tuned models to the Hugging Face Hub from the web UI
  • Support apply_chat_template by adding a chat template to the tokenizer after fine-tuning
  • Add Docker support by @S3Studio in #2743 #2849
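The LoRA+ idea above is simply to train the LoRA B matrix with a larger learning rate than the A matrix. A minimal sketch of building optimizer parameter groups in that style; the function name, the plain-list "parameters", and the ratio value are all illustrative, not LLaMA-Factory's actual implementation:

```python
def loraplus_param_groups(lora_A, lora_B, base_lr, lr_ratio=16.0):
    """Build optimizer parameter groups in the LoRA+ style:
    the B matrix gets a learning rate lr_ratio times larger than A.
    (Names and the default ratio are illustrative.)"""
    return [
        {"params": lora_A, "lr": base_lr},
        {"params": lora_B, "lr": base_lr * lr_ratio},
    ]

# Toy usage, with plain nested lists standing in for parameter tensors.
A = [[0.01, -0.02], [0.03, 0.00]]
B = [[0.0, 0.0], [0.0, 0.0]]
groups = loraplus_param_groups(A, B, base_lr=1e-4)
```

In a real trainer these groups would be handed to the optimizer constructor so the two matrices step at different rates.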

New models

  • Base models
    • OLMo (1B/7B)
    • StarCoder2 (3B/7B/15B)
    • Yi-9B
  • Instruct/Chat models
    • OLMo-7B-Instruct

New datasets

  • Supervised fine-tuning datasets
    • Cosmopedia (en)
  • Preference datasets
    • Orca DPO (en)

Bug fix

v0.5.3: DoRA and AWQ/AQLM QLoRA

28 Feb 17:01

New features

New models

  • Base models
    • Gemma (2B/7B)
  • Instruct/Chat models
    • Gemma-it (2B/7B)

Bug fix

v0.5.2: Block Expansion, Qwen1.5 Models

20 Feb 07:32

New features

  • Support block expansion in LLaMA Pro; see tests/llama_pro.py for usage
  • Add the use_rslora option for the LoRA method
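The use_rslora option refers to rank-stabilized LoRA, which changes the scaling applied to the low-rank update from alpha / r to alpha / sqrt(r), so the update magnitude does not shrink as the rank grows. A small sketch of just that scaling rule (the function is illustrative):

```python
import math

def lora_scaling(alpha: float, r: int, use_rslora: bool = False) -> float:
    """Scaling factor applied to the LoRA update B @ A.
    Standard LoRA uses alpha / r; rsLoRA uses alpha / sqrt(r),
    which keeps the update magnitude stable at higher ranks."""
    return alpha / math.sqrt(r) if use_rslora else alpha / r

print(lora_scaling(16, 64))                    # 0.25
print(lora_scaling(16, 64, use_rslora=True))   # 2.0
```

Note how at rank 64 the standard scaling nearly zeroes out the update while the rank-stabilized variant keeps it meaningful.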

New models

  • Base models
    • Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
    • DeepSeekMath-7B-Base
    • DeepSeekCoder-7B-Base-v1.5
    • Orion-14B-Base
  • Instruct/Chat models
    • Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
    • MiniCPM-2B-SFT/DPO
    • DeepSeekMath-7B-Instruct
    • DeepSeekCoder-7B-Instruct-v1.5
    • Orion-14B-Chat
    • Orion-14B-Long-Chat
    • Orion-14B-RAG-Chat
    • Orion-14B-Plugin-Chat

New datasets

  • Supervised fine-tuning datasets
    • SlimOrca (en)
    • Dolly (de)
    • Dolphin (de)
    • Airoboros (de)
  • Preference datasets
    • Orca DPO (de)

Bug fix

v0.5.0: Agent Tuning, Unsloth Integration

20 Jan 18:37

Congratulations on 10k stars 🎉 Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨

New features

  • Support agent tuning for most models; you can fine-tune any LLM for tool use with --dataset glaive_toolcall #2226
  • Support function calling in both API and web modes with fine-tuned models, following OpenAI's format
  • LLaMA Factory 🤝 Unsloth, enjoy 170% LoRA training speed with --use_unsloth, see the benchmark here
  • Support fine-tuning models on MPS devices #2090
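Since the API mirrors OpenAI's function-calling format, an assistant turn that triggers a tool call looks like the message below. This is a hypothetical fragment built by hand to show the shape; the tool name and arguments are invented:

```python
import json

# Hypothetical assistant message carrying a tool call, in the
# OpenAI function-calling shape the API follows: content is empty
# and the call payload lives under "function_call".
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_current_weather",
        "arguments": json.dumps({"location": "Paris", "unit": "celsius"}),
    },
}
print(json.dumps(message, indent=2))
```

A client would parse the "arguments" string back into JSON, run the named tool, and send the result back as the next message.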

New models

  • Base models
    • Phi-2 (2.7B)
    • InternLM2 (7B/20B)
    • SOLAR-10.7B
    • DeepseekMoE-16B-Base
    • XVERSE-65B-2
  • Instruct/Chat models
    • InternLM2-Chat (7B/20B)
    • SOLAR-10.7B-Instruct
    • DeepseekMoE-16B-Chat
    • Yuan (2B/51B/102B)

New datasets

  • Supervised fine-tuning datasets
    • deepctrl dataset
    • Glaive function calling dataset v2

Core updates

  • Refactor data engine: clearer dataset alignment, easier templating and tool formatting
  • Refactor saving logic for models with value head #1789
  • Use the ruff code formatter for a consistent code style

Bug fix

v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration

16 Dec 13:48

🚨🚨 Core refactor

  • Deprecate checkpoint_dir and use adapter_name_or_path instead
  • Replace resume_lora_training with create_new_adapter
  • Move the patches in model loading to llmtuner.model.patcher
  • Bump to Transformers 4.36.1 to adapt to the Mixtral models
  • Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
  • Temporarily disable LongLoRA due to breaking changes; it will be supported again later

The above changes were made by @hiyouga in #1864

New features

  • Add DPO-ftx: mixing fine-tuning gradients into DPO via the dpo_ftx argument, suggested by @lylcst in #1347 (comment)
  • Integrate AutoGPTQ into the model export via the export_quantization_bit and export_quantization_dataset arguments
  • Support loading datasets from ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
  • Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b
  • Support system column in both alpaca and sharegpt dataset formats
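DPO-ftx mixes a supervised fine-tuning signal into the DPO objective. A sketch of the combined loss under the assumption that the mix is a simple weighted sum; the function and argument names here are illustrative, not the actual implementation:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def dpo_ftx_loss(chosen_logratio: float, rejected_logratio: float,
                 sft_loss: float, beta: float = 0.1,
                 dpo_ftx: float = 1.0) -> float:
    """Sketch of mixing supervised fine-tuning into DPO (assumed form):
    the standard DPO term, -log sigmoid(beta * (r_chosen - r_rejected)),
    plus dpo_ftx times the SFT loss on the chosen responses."""
    dpo = -math.log(sigmoid(beta * (chosen_logratio - rejected_logratio)))
    return dpo + dpo_ftx * sft_loss

loss = dpo_ftx_loss(2.0, -1.0, sft_loss=0.5)
```

Setting dpo_ftx to zero recovers plain DPO; larger values pull the policy back toward the supervised targets.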

New models

  • Base models
    • Mixtral-8x7B-v0.1
  • Instruct/Chat models
    • Mixtral-8x7B-Instruct-v0.1
    • Mistral-7B-Instruct-v0.2
    • XVERSE-65B-Chat
    • Yi-6B-Chat

Bug fix

v0.3.3: ModelScope Integration, Reward Server

03 Dec 14:17

New features

  • Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
  • Support launching a reward model server in the demo API by specifying --stage=rm in api_demo.py
  • Support using a reward model server in PPO training by specifying --reward_model_type api
  • Support adjusting the shard size of exported models via the export_size argument
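The export_size argument caps how large each saved weight shard may be. A toy sketch of what such a cap typically controls, using greedy packing of tensor sizes into shards; this is illustrative, not the actual export code:

```python
def shard_tensors(sizes_gb, export_size=5.0):
    """Greedy sharding sketch: pack tensor sizes (in GB) into shards
    whose total stays under export_size, mirroring what a maximum
    shard-size argument typically controls. Illustrative only."""
    shards, current, total = [], [], 0.0
    for size in sizes_gb:
        # Start a new shard if adding this tensor would exceed the cap.
        if current and total + size > export_size:
            shards.append(current)
            current, total = [], 0.0
        current.append(size)
        total += size
    if current:
        shards.append(current)
    return shards

print(shard_tensors([3.0, 3.0, 3.0, 1.0], export_size=5.0))
# [[3.0], [3.0], [3.0, 1.0]]
```

A smaller export_size yields more, smaller files, which is handy when uploading to hubs with per-file size limits.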

New models

  • Base models
    • DeepseekLLM-Base (7B/67B)
    • Qwen (1.8B/72B)
  • Instruct/Chat models
    • DeepseekLLM-Chat (7B/67B)
    • Qwen-Chat (1.8B/72B)
    • Yi-34B-Chat

New datasets

  • Supervised fine-tuning datasets
  • Preference datasets

Bug fix

v0.3.2: Patch release

21 Nov 05:41

New features

  • Support training GPTQ-quantized models #729 #1481 #1545
  • Support resuming reward model training #1567

Bug fix

v0.3.0: Full-Parameter RLHF

16 Nov 08:24

New features

  • Support full-parameter RLHF training (RM & PPO)
  • Refactor llmtuner core in #1525 by @hiyouga
  • Better LLaMA Board: full-parameter RLHF and demo mode

New models

  • Base models
    • ChineseLLaMA-1.3B
    • LingoWhale-8B
  • Instruct/Chat models
    • ChineseAlpaca-1.3B
    • Zephyr-7B-Alpha/Beta

Bug fix

v0.2.2: Patch release

13 Nov 15:16

Bug fix

v0.2.1: Variant Models, NEFTune Trick

09 Nov 08:30

New features

  • Support the NEFTune trick for supervised fine-tuning by @anvie in #1252
  • Support loading datasets in the sharegpt format; read data/readme for details
  • Support generating multiple responses in the demo API via the n parameter
  • Support caching pre-processed dataset files via the cache_path argument
  • Better LLaMA Board (pagination, controls, etc.)
  • Support the push_to_hub argument #1088
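NEFTune improves supervised fine-tuning by adding uniform noise to the token embeddings during training, scaled by alpha / sqrt(seq_len * dim). A self-contained sketch of that perturbation on plain nested lists (the function name is illustrative):

```python
import math
import random

def neftune(embeddings, alpha=5.0):
    """NEFTune sketch: add uniform noise in [-1, 1] to each embedding
    entry, scaled by alpha / sqrt(seq_len * dim) as in the NEFTune paper.
    `embeddings` is a seq_len x dim nested list; a new list is returned."""
    seq_len, dim = len(embeddings), len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [x + random.uniform(-1.0, 1.0) * scale for x in row]
        for row in embeddings
    ]

# Toy usage: 4 tokens with 8-dimensional zero embeddings.
emb = [[0.0] * 8 for _ in range(4)]
noisy = neftune(emb)
```

The noise is only applied at training time; inference uses the clean embeddings.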

New models

  • Base models
    • ChatGLM3-6B-Base
    • Yi (6B/34B)
    • Mistral-7B
    • BlueLM-7B-Base
    • Skywork-13B-Base
    • XVERSE-65B
    • Falcon-180B
    • Deepseek-Coder-Base (1.3B/6.7B/33B)
  • Instruct/Chat models
    • ChatGLM3-6B
    • Mistral-7B-Instruct
    • BlueLM-7B-Chat
    • Zephyr-7B
    • OpenChat-3.5
    • Yayi (7B/13B)
    • Deepseek-Coder-Instruct (1.3B/6.7B/33B)

New datasets

  • Pre-training datasets
    • RedPajama V2
    • Pile
  • Supervised fine-tuning datasets
    • OpenPlatypus
    • ShareGPT Hyperfiltered
    • ShareGPT4
    • UltraChat 200k
    • AgentInstruct
    • LMSYS Chat 1M
    • Evol Instruct V2

Bug fix