
Releases: hiyouga/LLaMA-Factory

v0.6.0: Paper Release, GaLore and FSDP+QLoRA

25 Mar 15:50

We released our paper on arXiv! Thanks to all co-authors and to AK for the recommendation.

New features

  • Support the GaLore algorithm, allowing full-parameter learning of a 7B model with less than 24GB of VRAM
  • Support FSDP+QLoRA, which enables QLoRA fine-tuning of a 70B model on 2x 24GB GPUs
  • Support the LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
  • LLaMA Factory 🤝 vLLM, enjoy 270% inference speed with --infer_backend vllm
  • Add a Colab notebook for getting started easily
  • Support pushing fine-tuned models to the Hugging Face Hub from the web UI
  • Support apply_chat_template by adding a chat template to the tokenizer after fine-tuning
  • Add Docker support by @S3Studio in #2743 #2849
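The LoRA+ idea above is simply to train the LoRA B matrix with a larger learning rate than the A matrix. A minimal sketch of building optimizer parameter groups in that style; the function name, the plain-list "parameters", and the ratio value are all illustrative, not LLaMA-Factory's actual implementation:

```python
def loraplus_param_groups(lora_A, lora_B, base_lr, lr_ratio=16.0):
    """Build optimizer parameter groups in the LoRA+ style:
    the B matrix gets a learning rate lr_ratio times larger than A.
    (Names and the default ratio are illustrative.)"""
    return [
        {"params": lora_A, "lr": base_lr},
        {"params": lora_B, "lr": base_lr * lr_ratio},
    ]

# Toy usage, with plain nested lists standing in for parameter tensors.
A = [[0.01, -0.02], [0.03, 0.00]]
B = [[0.0, 0.0], [0.0, 0.0]]
groups = loraplus_param_groups(A, B, base_lr=1e-4)
```

In a real trainer these groups would be handed to the optimizer constructor so the two matrices step at different rates.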

New models

  • Base models
    • OLMo (1B/7B)
    • StarCoder2 (3B/7B/15B)
    • Yi-9B
  • Instruct/Chat models
    • OLMo-7B-Instruct

New datasets

  • Supervised fine-tuning datasets
    • Cosmopedia (en)
  • Preference datasets
    • Orca DPO (en)

Bug fix

v0.5.3: DoRA and AWQ/AQLM QLoRA

28 Feb 17:01

New features

New models

  • Base models
    • Gemma (2B/7B)
  • Instruct/Chat models
    • Gemma-it (2B/7B)

Bug fix

v0.5.2: Block Expansion, Qwen1.5 Models

20 Feb 07:32

New features

  • Support block expansion in LLaMA Pro; see tests/llama_pro.py for usage
  • Add the use_rslora option for the LoRA method
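The use_rslora option refers to rank-stabilized LoRA, which changes the scaling applied to the low-rank update from alpha / r to alpha / sqrt(r), so the update magnitude does not shrink as the rank grows. A small sketch of just that scaling rule (the function is illustrative):

```python
import math

def lora_scaling(alpha: float, r: int, use_rslora: bool = False) -> float:
    """Scaling factor applied to the LoRA update B @ A.
    Standard LoRA uses alpha / r; rsLoRA uses alpha / sqrt(r),
    which keeps the update magnitude stable at higher ranks."""
    return alpha / math.sqrt(r) if use_rslora else alpha / r

print(lora_scaling(16, 64))                    # 0.25
print(lora_scaling(16, 64, use_rslora=True))   # 2.0
```

Note how at rank 64 the standard scaling nearly zeroes out the update while the rank-stabilized variant keeps it meaningful.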

New models

  • Base models
    • Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
    • DeepSeekMath-7B-Base
    • DeepSeekCoder-7B-Base-v1.5
    • Orion-14B-Base
  • Instruct/Chat models
    • Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
    • MiniCPM-2B-SFT/DPO
    • DeepSeekMath-7B-Instruct
    • DeepSeekCoder-7B-Instruct-v1.5
    • Orion-14B-Chat
    • Orion-14B-Long-Chat
    • Orion-14B-RAG-Chat
    • Orion-14B-Plugin-Chat

New datasets

  • Supervised fine-tuning datasets
    • SlimOrca (en)
    • Dolly (de)
    • Dolphin (de)
    • Airoboros (de)
  • Preference datasets
    • Orca DPO (de)

Bug fix

v0.5.0: Agent Tuning, Unsloth Integration

20 Jan 18:37

Congratulations on 10k stars 🎉 Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨

New features

  • Support agent tuning for most models; you can fine-tune any LLM for tool use with --dataset glaive_toolcall #2226
  • Support function calling in both API and web modes with fine-tuned models, following OpenAI's format
  • LLaMA Factory 🤝 Unsloth, enjoy 170% LoRA training speed with --use_unsloth, see the benchmark here
  • Support fine-tuning models on MPS devices #2090
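Since the API mirrors OpenAI's function-calling format, an assistant turn that triggers a tool call looks like the message below. This is a hypothetical fragment built by hand to show the shape; the tool name and arguments are invented:

```python
import json

# Hypothetical assistant message carrying a tool call, in the
# OpenAI function-calling shape the API follows: content is empty
# and the call payload lives under "function_call".
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_current_weather",
        "arguments": json.dumps({"location": "Paris", "unit": "celsius"}),
    },
}
print(json.dumps(message, indent=2))
```

A client would parse the "arguments" string back into JSON, run the named tool, and send the result back as the next message.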

New models

  • Base models
    • Phi-2 (2.7B)
    • InternLM2 (7B/20B)
    • SOLAR-10.7B
    • DeepseekMoE-16B-Base
    • XVERSE-65B-2
  • Instruct/Chat models
    • InternLM2-Chat (7B/20B)
    • SOLAR-10.7B-Instruct
    • DeepseekMoE-16B-Chat
    • Yuan (2B/51B/102B)

New datasets

  • Supervised fine-tuning datasets
    • deepctrl dataset
    • Glaive function calling dataset v2

Core updates

  • Refactor data engine: clearer dataset alignment, easier templating and tool formatting
  • Refactor saving logic for models with value head #1789
  • Use the ruff code formatter for a consistent code style

Bug fix

v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration

16 Dec 13:48

🚨🚨 Core refactor

  • Deprecate checkpoint_dir and use adapter_name_or_path instead
  • Replace resume_lora_training with create_new_adapter
  • Move the patches in model loading to llmtuner.model.patcher
  • Bump to Transformers 4.36.1 to adapt to the Mixtral models
  • Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
  • Temporarily disable LongLoRA due to breaking changes; it will be supported again later

The above changes were made by @hiyouga in #1864

New features

  • Add DPO-ftx: mixing fine-tuning gradients into DPO via the dpo_ftx argument, suggested by @lylcst in #1347 (comment)
  • Integrate AutoGPTQ into the model export via the export_quantization_bit and export_quantization_dataset arguments
  • Support loading datasets from ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
  • Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b
  • Support system column in both alpaca and sharegpt dataset formats
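DPO-ftx mixes a supervised fine-tuning signal into the DPO objective. A sketch of the combined loss under the assumption that the mix is a simple weighted sum; the function and argument names here are illustrative, not the actual implementation:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def dpo_ftx_loss(chosen_logratio: float, rejected_logratio: float,
                 sft_loss: float, beta: float = 0.1,
                 dpo_ftx: float = 1.0) -> float:
    """Sketch of mixing supervised fine-tuning into DPO (assumed form):
    the standard DPO term, -log sigmoid(beta * (r_chosen - r_rejected)),
    plus dpo_ftx times the SFT loss on the chosen responses."""
    dpo = -math.log(sigmoid(beta * (chosen_logratio - rejected_logratio)))
    return dpo + dpo_ftx * sft_loss

loss = dpo_ftx_loss(2.0, -1.0, sft_loss=0.5)
```

Setting dpo_ftx to zero recovers plain DPO; larger values pull the policy back toward the supervised targets.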

New models

  • Base models
    • Mixtral-8x7B-v0.1
  • Instruct/Chat models
    • Mixtral-8x7B-Instruct-v0.1
    • Mistral-7B-Instruct-v0.2
    • XVERSE-65B-Chat
    • Yi-6B-Chat

Bug fix

v0.3.3: ModelScope Integration, Reward Server

03 Dec 14:17

New features

  • Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
  • Support launching a reward model server in the demo API by specifying --stage=rm in api_demo.py
  • Support using a reward model server in PPO training by specifying --reward_model_type api
  • Support adjusting the shard size of exported models via the export_size argument
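The export_size argument caps how large each saved weight shard may be. A toy sketch of what such a cap typically controls, using greedy packing of tensor sizes into shards; this is illustrative, not the actual export code:

```python
def shard_tensors(sizes_gb, export_size=5.0):
    """Greedy sharding sketch: pack tensor sizes (in GB) into shards
    whose total stays under export_size, mirroring what a maximum
    shard-size argument typically controls. Illustrative only."""
    shards, current, total = [], [], 0.0
    for size in sizes_gb:
        # Start a new shard if adding this tensor would exceed the cap.
        if current and total + size > export_size:
            shards.append(current)
            current, total = [], 0.0
        current.append(size)
        total += size
    if current:
        shards.append(current)
    return shards

print(shard_tensors([3.0, 3.0, 3.0, 1.0], export_size=5.0))
# [[3.0], [3.0], [3.0, 1.0]]
```

A smaller export_size yields more, smaller files, which is handy when uploading to hubs with per-file size limits.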

New models

  • Base models
    • DeepseekLLM-Base (7B/67B)
    • Qwen (1.8B/72B)
  • Instruct/Chat models
    • DeepseekLLM-Chat (7B/67B)
    • Qwen-Chat (1.8B/72B)
    • Yi-34B-Chat

New datasets

  • Supervised fine-tuning datasets
  • Preference datasets

Bug fix

v0.3.2: Patch release

21 Nov 05:41

New features

  • Support training GPTQ-quantized models #729 #1481 #1545
  • Support resuming reward model training #1567

Bug fix

v0.3.0: Full-Parameter RLHF

16 Nov 08:24

New features

  • Support full-parameter RLHF training (RM & PPO)
  • Refactor llmtuner core in #1525 by @hiyouga
  • Better LLaMA Board: full-parameter RLHF and demo mode

New models

  • Base models
    • ChineseLLaMA-1.3B
    • LingoWhale-8B
  • Instruct/Chat models
    • ChineseAlpaca-1.3B
    • Zephyr-7B-Alpha/Beta

Bug fix

v0.2.2: Patch release

13 Nov 15:16

Bug fix

v0.2.1: Variant Models, NEFTune Trick

09 Nov 08:30

New features

  • Support the NEFTune trick for supervised fine-tuning by @anvie in #1252
  • Support loading datasets in the sharegpt format; read data/readme for details
  • Support generating multiple responses in the demo API via the n parameter
  • Support caching pre-processed dataset files via the cache_path argument
  • Better LLaMA Board (pagination, controls, etc.)
  • Support the push_to_hub argument #1088
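NEFTune improves supervised fine-tuning by adding uniform noise to the token embeddings during training, scaled by alpha / sqrt(seq_len * dim). A self-contained sketch of that perturbation on plain nested lists (the function name is illustrative):

```python
import math
import random

def neftune(embeddings, alpha=5.0):
    """NEFTune sketch: add uniform noise in [-1, 1] to each embedding
    entry, scaled by alpha / sqrt(seq_len * dim) as in the NEFTune paper.
    `embeddings` is a seq_len x dim nested list; a new list is returned."""
    seq_len, dim = len(embeddings), len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [x + random.uniform(-1.0, 1.0) * scale for x in row]
        for row in embeddings
    ]

# Toy usage: 4 tokens with 8-dimensional zero embeddings.
emb = [[0.0] * 8 for _ in range(4)]
noisy = neftune(emb)
```

The noise is only applied at training time; inference uses the clean embeddings.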

New models

  • Base models
    • ChatGLM3-6B-Base
    • Yi (6B/34B)
    • Mistral-7B
    • BlueLM-7B-Base
    • Skywork-13B-Base
    • XVERSE-65B
    • Falcon-180B
    • Deepseek-Coder-Base (1.3B/6.7B/33B)
  • Instruct/Chat models
    • ChatGLM3-6B
    • Mistral-7B-Instruct
    • BlueLM-7B-Chat
    • Zephyr-7B
    • OpenChat-3.5
    • Yayi (7B/13B)
    • Deepseek-Coder-Instruct (1.3B/6.7B/33B)

New datasets

  • Pre-training datasets
    • RedPajama V2
    • Pile
  • Supervised fine-tuning datasets
    • OpenPlatypus
    • ShareGPT Hyperfiltered
    • ShareGPT4
    • UltraChat 200k
    • AgentInstruct
    • LMSYS Chat 1M
    • Evol Instruct V2

Bug fix