Releases · hiyouga/LLaMA-Factory
v0.6.0: Paper Release, GaLore and FSDP+QLoRA
We released our paper on arXiv! Thanks to all co-authors, and to AK for the recommendation.
New features
- Support the GaLore algorithm, enabling full-parameter training of a 7B model with less than 24GB of VRAM
- Support FSDP+QLoRA, enabling QLoRA fine-tuning of a 70B model on 2x 24GB GPUs
- Support LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
- LLaMA Factory 🤝 vLLM, enjoy 270% inference speed with `--infer_backend vllm`
- Add a Colab notebook for getting started easily
- Support pushing fine-tuned models to Hugging Face Hub in web UI
- Support `apply_chat_template` by adding a chat template to the tokenizer after fine-tuning
- Add Docker support by @S3Studio in #2743 #2849
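With the chat template attached, transformers' `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` renders an inference prompt from an OpenAI-format message list. As a minimal illustration, here is a pure-Python sketch of what that rendering looks like for a ChatML-style template (the actual template written by the library depends on the model and is an assumption here):

```python
def render_chatml(messages, add_generation_prompt=True):
    # Renders a ChatML-style prompt from an OpenAI-format message list,
    # mirroring what tokenizer.apply_chat_template does once a template
    # is attached. Illustrative only: the real template varies per model.
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so generation continues from here.
        prompt += "<|im_start|>assistant\n"
    return prompt

print(render_chatml([{"role": "user", "content": "Hello!"}]))
```

In practice you would call `tokenizer.apply_chat_template` on the exported tokenizer rather than formatting prompts by hand.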
New models
- Base models
- OLMo (1B/7B)
- StarCoder2 (3B/7B/15B)
- Yi-9B
- Instruct/Chat models
- OLMo-7B-Instruct
New datasets
- Supervised fine-tuning datasets
- Cosmopedia (en)
- Preference datasets
- Orca DPO (en)
Bug fix
- Fix flash_attn in web UI by @cx2333-gt in #2730
- Fix deepspeed runtime error in PPO by @stephen-nju in #2746
- Fix readme ddp instruction by @khazic in #2903
- Fix environment variable in datasets by @SirlyDreamer in #2905
- Fix readme information by @0xez in #2919
- Fix generation config validation by @marko1616 in #2945
- Fix requirements by @rkinas in #2963
- Fix bitsandbytes windows version by @Tsumugii24 in #2967
- Fix #2346 #2642 #2649 #2732 #2735 #2756 #2766 #2775 #2777 #2782 #2798 #2802 #2803 #2817 #2895 #2928 #2936 #2941
v0.5.3: DoRA and AWQ/AQLM QLoRA
New features
- Support DoRA (Weight-Decomposed LoRA)
- Support QLoRA for the AWQ/AQLM quantized models, now 2-bit QLoRA is feasible
- Provide some example scripts in https://github.com/hiyouga/LLaMA-Factory/tree/main/examples
New models
- Base models
- Gemma (2B/7B)
- Instruct/Chat models
- Gemma-it (2B/7B)
Bug fix
v0.5.2: Block Expansion, Qwen1.5 Models
New features
- Support block expansion in LLaMA Pro; see `tests/llama_pro.py` for usage
- Add the `use_rslora` option for the LoRA method
New models
- Base models
- Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
- DeepSeekMath-7B-Base
- DeepSeekCoder-7B-Base-v1.5
- Orion-14B-Base
- Instruct/Chat models
- Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
- MiniCPM-2B-SFT/DPO
- DeepSeekMath-7B-Instruct
- DeepSeekCoder-7B-Instruct-v1.5
- Orion-14B-Chat
- Orion-14B-Long-Chat
- Orion-14B-RAG-Chat
- Orion-14B-Plugin-Chat
New datasets
- Supervised fine-tuning datasets
- SlimOrca (en)
- Dolly (de)
- Dolphin (de)
- Airoboros (de)
- Preference datasets
- Orca DPO (de)
Bug fix
- Fix `torch_dtype` check in export model by @fenglui in #2262
- Add Russian locale to LLaMA Board by @seoeaa in #2264
- Remove manually set `use_cache` in export model by @yhyu13 in #2266
- Fix DeepSpeed ZeRO-3 training with MoE models by @A-Cepheus in #2283
- Add a patch for full training of the Mixtral model using DeepSpeed Zero3 by @ftgreat in #2319
- Fix bug in data pre-processing by @lxsyz in #2411
- Add German sft and dpo datasets by @johannhartmann in #2423
- Add version checking in `test_toolcall.py` by @mini-tiger in #2435
- Enable parsing of the SlimOrca dataset by @mnmueller in #2462
- Add tags for models when pushing to hf hub by @younesbelkada in #2474
- Fix #2189 #2268 #2282 #2320 #2338 #2376 #2388 #2394 #2397 #2404 #2412 #2420 #2421 #2436 #2438 #2471 #2481
v0.5.0: Agent Tuning, Unsloth Integration
Congratulations on 10k stars 🎉 Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨
New features
- Support agent tuning for most models; fine-tune any LLM with `--dataset glaive_toolcall` for tool use #2226
- Support function calling in both API and web mode with fine-tuned models, following OpenAI's format
- LLaMA Factory 🤝 Unsloth, enjoy 170% LoRA training speed with `--use_unsloth` (see the benchmark)
- Support fine-tuning models on MPS devices #2090
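Since the function-calling API follows OpenAI's request shape, a client passes tool schemas alongside the messages and reads tool calls back from the response. A sketch of such a request body (the model name and the `get_weather` tool are made-up placeholders, not part of the release):

```python
import json

# OpenAI-format chat request with a tool definition. The endpoint,
# model name, and tool schema below are illustrative assumptions.
payload = {
    "model": "my-finetuned-model",
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

body = json.dumps(payload)
print(body[:60])
```

A fine-tuned model served in API mode would answer with a `tool_calls` entry naming the function and its JSON arguments, in the same shape an OpenAI client expects.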
New models
- Base models
- Phi-2 (2.7B)
- InternLM2 (7B/20B)
- SOLAR-10.7B
- DeepseekMoE-16B-Base
- XVERSE-65B-2
- Instruct/Chat models
- InternLM2-Chat (7B/20B)
- SOLAR-10.7B-Instruct
- DeepseekMoE-16B-Chat
- Yuan (2B/51B/102B)
New datasets
- Supervised fine-tuning datasets
- deepctrl dataset
- Glaive function calling dataset v2
Core updates
- Refactor data engine: clearer dataset alignment, easier templating and tool formatting
- Refactor saving logic for models with value head #1789
- Adopt the ruff code formatter for a consistent code style
Bug fix
- Bump transformers version to 4.36.2 by @ShaneTian in #1932
- Fix requirements by @dasdristanta13 in #2117
- Add Machine-Mindset project by @JessyTsui in #2163
- Fix typo in readme file by @junuMoon in #2194
- Support resize token embeddings with ZeRO3 by @liu-zichen in #2201
- Fix #1073 #1462 #1617 #1735 #1742 #1789 #1821 #1875 #1895 #1900 #1908 #1907 #1909 #1923 #2014 #2067 #2081 #2090 #2098 #2125 #2127 #2147 #2161 #2164 #2183 #2195 #2249 #2260
v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration
🚨🚨 Core refactor
- Deprecate `checkpoint_dir` in favor of `adapter_name_or_path`
- Replace `resume_lora_training` with `create_new_adapter`
- Move the model-loading patches to `llmtuner.model.patcher`
- Bump to Transformers 4.36.1 to adapt to the Mixtral models
- Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
- Temporarily disable LongLoRA due to breaking changes; support will be restored later
The above changes were made by @hiyouga in #1864
New features
- Add DPO-ftx: mix supervised fine-tuning gradients into DPO via the `dpo_ftx` argument, suggested by @lylcst in #1347 (comment)
- Integrate AutoGPTQ into model export via the `export_quantization_bit` and `export_quantization_dataset` arguments
- Support loading datasets from the ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
- Support resizing token embeddings with noisy mean initialization by @hiyouga in a66186b
- Support a system column in both alpaca and sharegpt dataset formats
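Noisy mean initialization starts each newly added embedding row at the mean of the existing rows, perturbed with small Gaussian noise so the new tokens are distinct. A NumPy sketch of the idea (the exact noise scale used by the library may differ):

```python
import numpy as np

def noisy_mean_resize(embeddings: np.ndarray, num_new: int,
                      noise_scale: float = 1e-3) -> np.ndarray:
    # New rows start at the column-wise mean of the existing embedding
    # matrix, plus small Gaussian noise so they are not identical.
    mean = embeddings.mean(axis=0, keepdims=True)
    rng = np.random.default_rng(0)
    noise = rng.normal(0.0, noise_scale, size=(num_new, embeddings.shape[1]))
    return np.concatenate([embeddings, mean + noise], axis=0)

vocab = np.random.default_rng(1).normal(size=(100, 8))
resized = noisy_mean_resize(vocab, 4)
print(resized.shape)  # (104, 8)
```

Starting new tokens near the center of the existing embedding distribution avoids the large, untrained logits that random initialization can produce.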
New models
- Base models
- Mixtral-8x7B-v0.1
- Instruct/Chat models
- Mixtral-8x7B-Instruct-v0.1
- Mistral-7B-Instruct-v0.2
- XVERSE-65B-Chat
- Yi-6B-Chat
Bug fix
v0.3.3: ModelScope Integration, Reward Server
New features
- Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
- Support launching a reward model server in the demo API by specifying `--stage=rm` in `api_demo.py`
- Support using a reward model server in PPO training by specifying `--reward_model_type api`
- Support adjusting the shard size of exported models via the `export_size` argument
New models
- Base models
- DeepseekLLM-Base (7B/67B)
- Qwen (1.8B/72B)
- Instruct/Chat models
- DeepseekLLM-Chat (7B/67B)
- Qwen-Chat (1.8B/72B)
- Yi-34B-Chat
New datasets
- Supervised fine-tuning datasets
- Preference datasets
Bug fix
v0.3.2: Patch release
v0.3.0: Full-Parameter RLHF
v0.2.2: Patch release
v0.2.1: Variant Models, NEFTune Trick
New features
- Support NEFTune trick for supervised fine-tuning by @anvie in #1252
- Support loading datasets in the sharegpt format; see data/readme for details
- Support generating multiple responses in the demo API via the `n` parameter
- Support caching pre-processed dataset files via the `cache_path` argument
- Better LLaMA Board (pagination, controls, etc.)
- Support the `push_to_hub` argument #1088
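The NEFTune trick perturbs token embeddings during supervised fine-tuning with uniform noise scaled by alpha / sqrt(seq_len * hidden_dim), which acts as a regularizer. A NumPy sketch of the noise step (the argument name exposed by the trainer is not shown here):

```python
import numpy as np

def add_neftune_noise(embeds: np.ndarray, alpha: float = 5.0,
                      seed: int = 0) -> np.ndarray:
    # NEFTune adds uniform noise in [-1, 1] scaled by
    # alpha / sqrt(seq_len * hidden_dim) to the token embeddings.
    # Applied during supervised fine-tuning only, never at inference.
    seq_len, hidden = embeds.shape
    scale = alpha / np.sqrt(seq_len * hidden)
    rng = np.random.default_rng(seed)
    return embeds + rng.uniform(-1.0, 1.0, size=embeds.shape) * scale

noisy = add_neftune_noise(np.zeros((16, 32)))
print(noisy.shape)  # (16, 32)
```

Because the scale shrinks with sequence length and hidden size, the perturbation stays small relative to the embeddings while still discouraging overfitting to the fine-tuning data.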
New models
- Base models
- ChatGLM3-6B-Base
- Yi (6B/34B)
- Mistral-7B
- BlueLM-7B-Base
- Skywork-13B-Base
- XVERSE-65B
- Falcon-180B
- Deepseek-Coder-Base (1.3B/6.7B/33B)
- Instruct/Chat models
- ChatGLM3-6B
- Mistral-7B-Instruct
- BlueLM-7B-Chat
- Zephyr-7B
- OpenChat-3.5
- Yayi (7B/13B)
- Deepseek-Coder-Instruct (1.3B/6.7B/33B)
New datasets
- Pre-training datasets
- RedPajama V2
- Pile
- Supervised fine-tuning datasets
- OpenPlatypus
- ShareGPT Hyperfiltered
- ShareGPT4
- UltraChat 200k
- AgentInstruct
- LMSYS Chat 1M
- Evol Instruct V2