Releases: lm-sys/FastChat
Release v0.2.36
Highlights
- Added SGLang worker for vision language models, with lower latency and higher throughput #2928
- Vision language WebUI #2960
- OpenAI-compatible API server now supports image input #2928
- Added LightLLM worker for higher throughput https://github.com/lm-sys/FastChat/blob/main/docs/lightllm_integration.md
- Added Apple MLX worker #2940
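The image-input support in the OpenAI-compatible API server (#2928) follows the OpenAI vision message format, where an image is embedded as a data URL inside the message content. A minimal sketch of building such a request payload; the model name is an assumption for a local deployment:

```python
# Build an OpenAI-style chat-completion payload with an image, the format
# FastChat's OpenAI-compatible API server accepts for vision models (#2928).
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Encode an image as a base64 data URL inside a user message."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }


# "llava-v1.6-vicuna-7b" is an assumed model name, not prescribed by the release.
payload = {
    "model": "llava-v1.6-vicuna-7b",
    "messages": [build_image_message("What is in this image?", b"\x89PNG\r\n")],
}
```

The payload can then be POSTed to the server's `/v1/chat/completions` endpoint with any HTTP client.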
What's Changed
- fix local path issue when using models from www.modelscope.cn by @liuyhwangyh in #2934
- support openai embedding for topic clustering by @CodingWithTim in #2729
- Remove duplicate API endpoint by @surak in #2949
- Update Hermes Mixtral by @teknium1 in #2938
- Enablement of REST API Usage within Google Colab Free Tier by @ggcr in #2940
- Create a new worker implementation for Apple MLX by @aliasaria in #2937
- feat: support Model Yuan2.0, a new generation Fundamental Large Language Model developed by IEIT System by @cauwulixuan in #2936
- Fix the pooling method of BGE embedding model by @staoxiao in #2926
- SGLang Worker by @BabyChouSr in #2928
- Update mlx_worker to be async by @aliasaria in #2958
- Integrate LightLLM into serve worker by @zeyugao in #2888
- Copy button by @surak in #2963
- feat: train with template by @congchan in #2951
- fix content maybe a str by @zhouzaida in #2968
- Adding download folder information in README by @dheeraj-326 in #2972
- use cl100k_base as the default tiktoken encoding by @bjwswang in #2974
- Update README.md by @merrymercy in #2975
- Fix tokenizer for vllm worker by @Michaelvll in #2984
- update yuan2.0 generation by @wangpengfei1013 in #2989
- fix: tokenization mismatch when training with different templates by @congchan in #2996
- fix: inconsistent tokenization by llama tokenizer by @congchan in #3006
- Fix type hint for play_a_match_single by @MonkeyLeeT in #3008
- code update by @infwinston in #2997
- Update model_support.md by @infwinston in #3016
- Update lightllm_integration.md by @eltociear in #3014
- Upgrade gradio to 4.17 by @infwinston in #3027
- Update MLX integration to use new generate_step function signature by @aliasaria in #3021
- Update readme by @merrymercy in #3028
- Update gradio version in pyproject.toml and fix a bug by @merrymercy in #3029
- Update gradio demo and API model providers by @merrymercy in #3030
- Gradio Web Server for Multimodal Models by @BabyChouSr in #2960
- Migrate the gradio server to openai v1 by @merrymercy in #3032
- Update version to 0.2.36 by @merrymercy in #3033
New Contributors
- @teknium1 made their first contribution in #2938
- @ggcr made their first contribution in #2940
- @aliasaria made their first contribution in #2937
- @cauwulixuan made their first contribution in #2936
- @staoxiao made their first contribution in #2926
- @zhouzaida made their first contribution in #2968
- @dheeraj-326 made their first contribution in #2972
- @bjwswang made their first contribution in #2974
- @MonkeyLeeT made their first contribution in #3008
Full Changelog: v0.2.35...v0.2.36
Release v0.2.35
What's Changed
- add dolphin by @infwinston in #2794
- Fix tiny typo by @bofenghuang in #2805
- Add instructions for evaluating on MT bench using vLLM by @iojw in #2770
- fix missing op | for py3.8 by @dumpmemory in #2800
- Add SOLAR-10.7b Instruct Model by @BabyChouSr in #2826
- Update README.md by @eltociear in #2852
- fix: 'compeletion' typo by @congchan in #2847
- Add Tunnelmole as an open source alternative to ngrok and include usage instructions by @robbie-cahill in #2846
- Add support for CatPPT by @rishiraj in #2840
- Add functionality to ping AI2 InferD endpoints for tulu 2 by @natolambert in #2832
- add download models from www.modelscope.cn by @liuyhwangyh in #2830
- Fix conv_template of chinese alpaca 2 by @zollty in #2812
- add bagel model adapter by @jondurbin in #2814
- add root_path argument to gradio web server. by @stephanbertl in #2807
- Import accelerate locally to avoid it as a strong dependency by @chiragjn in #2820
- Replace dict merge with unpacking for compatibility of 3.8 in vLLM worker by @rudeigerc in #2824
- Format code by @merrymercy in #2854
- Openai API migrate by @andy-yang-1 in #2765
- Add new models (Perplexity, gemini) & Separate GPT versions by @merrymercy in #2856
- Clean error messages by @merrymercy in #2857
- Update docs by @Ying1123 in #2858
- Modify doc description by @zhangsibo1129 in #2859
- Fix the problem of not using the decoding method corresponding to the base model in peft mode by @Jingsong-Yan in #2865
- update a new SOTA model on MT-Bench which reaches an 8.8 score by @xiechengmude in #2864
- NPU needs to be initialized when starting a new process by @jq460494839 in #2843
- Fix the problem with "vllm + chatglm3" (#2845) by @yaofeng in #2876
- Update token spacing for mistral conversation.py by @thavens in #2872
- check if hm in models before deleting to avoid errors by @joshua-ne in #2870
- Add TinyLlama by @Gk-rohan in #2889
- Fix bug that model doesn't automatically switch peft adapter by @Jingsong-Yan in #2884
- Update web server commands by @merrymercy in #2869
- fix the tokenize process and prompt template of chatglm3 by @WHDY in #2883
- Add Notus support by @gabrielmbmb in #2813
- feat: support anthropic api with api_dict by @congchan in #2879
- Update model_adapter.py by @thavens in #2895
- leaderboard code update by @infwinston in #2867
- fix: change order of SEQUENCE_LENGTH_KEYS by @congchan in #2925
- fix baichuan:apply_prompt_template call args error by @Force1ess in #2921
- Fix a typo in openai_api_server.py by @jklj077 in #2905
- feat: use variables OPENAI_MODEL_LIST by @congchan in #2907
- Add TenyxChat-7B-v1 model by @sarath-shekkizhar in #2901
- add support for iei yuan2.0 (https://huggingface.co/IEITYuan) by @wangpengfei1013 in #2919
- nous-hermes-2-mixtral-dpo by @152334H in #2922
- Bump the version to 0.2.35 by @merrymercy in #2927
New Contributors
- @dumpmemory made their first contribution in #2800
- @robbie-cahill made their first contribution in #2846
- @rishiraj made their first contribution in #2840
- @natolambert made their first contribution in #2832
- @liuyhwangyh made their first contribution in #2830
- @stephanbertl made their first contribution in #2807
- @chiragjn made their first contribution in #2820
- @rudeigerc made their first contribution in #2824
- @jq460494839 made their first contribution in #2843
- @yaofeng made their first contribution in #2876
- @thavens made their first contribution in #2872
- @joshua-ne made their first contribution in #2870
- @WHDY made their first contribution in #2883
- @gabrielmbmb made their first contribution in #2813
- @jklj077 made their first contribution in #2905
- @sarath-shekkizhar made their first contribution in #2901
- @wangpengfei1013 made their first contribution in #2919
Full Changelog: v0.2.34...v0.2.35
Release v0.2.34
What's Changed
- fix tokenizer.pad_token attribute error by @wangshuai09 in #2710
- support stable-vicuna model by @hi-jin in #2696
- Exllama cache 8bit by @mjkaye in #2719
- Add Yi support by @infwinston in #2723
- Add Hermes 2.5 [fixed] by @152334H in #2725
- Fix Hermes2Adapter by @lewtun in #2727
- Fix YiAdapter by @Jingsong-Yan in #2730
- add trust_remote_code argument by @wangshuai09 in #2715
- Add revision arg to MT Bench answer generation by @lewtun in #2728
- Fix MPS backend 'index out of range' error by @suquark in #2737
- add starling support by @infwinston in #2738
- Add deepseek chat by @BabyChouSr in #2760
- a convenient script for spinning up the API with Model Workers by @ckgresla in #2790
- Prevent returning partial stop string in vllm worker by @pandada8 in #2780
- Update UI and new models by @infwinston in #2762
- Support MetaMath by @iojw in #2748
- Use common logging code in the OpenAI API server by @geekoftheweek in #2758
- Show how to turn on experiment tracking for fine-tuning by @morganmcg1 in #2742
- Support xDAN-L1-Chat Model by @xiechengmude in #2732
- Update the version to 0.2.34 by @merrymercy in #2793
New Contributors
- @mjkaye made their first contribution in #2719
- @152334H made their first contribution in #2725
- @Jingsong-Yan made their first contribution in #2730
- @ckgresla made their first contribution in #2790
- @pandada8 made their first contribution in #2780
- @iojw made their first contribution in #2748
- @geekoftheweek made their first contribution in #2758
- @morganmcg1 made their first contribution in #2742
- @xiechengmude made their first contribution in #2732
Full Changelog: v0.2.33...v0.2.34
Release v0.2.33
What's Changed
- fix: Fix for OpenOrcaAdapter to return correct conversation template by @vjsrinath in #2613
- Make fastchat.serve.model_worker to take debug argument by @uinone in #2628
- openchat 3.5 model support by @imoneoi in #2638
- xFastTransformer framework support by @a3213105 in #2615
- feat: support custom models vllm serving by @congchan in #2635
- kill only fastchat process by @scenaristeur in #2641
- Use conv.update_last_message api in mt-bench answer generation by @merrymercy in #2647
- Improve Azure OpenAI interface by @infwinston in #2651
- Add required_temp support in jsonl format to support flexible temperature setting for gen_api_answer by @CodingWithTim in #2653
- Pin openai version < 1 by @infwinston in #2658
- Remove exclude_unset parameter by @snapshotpl in #2654
- Revert "Remove exclude_unset parameter" by @merrymercy in #2666
- added support for CodeGeex(2) by @peterwilli in #2645
- add chatglm3 conv template support in conversation.py by @ZeyuTeng96 in #2622
- UI and model change by @infwinston in #2672
- train_flant5: fix typo by @Force1ess in #2673
- Fix gpt template by @infwinston in #2674
- Update README.md by @merrymercy in #2679
- feat: support template's stop_str as list by @congchan in #2678
- Update exllama_v2.md by @jm23jeffmorgan in #2680
- save model under deepspeed by @MrZhengXin in #2689
- Adding SSL support for model workers and huggingface worker by @lnguyen in #2687
- Check the max_new_tokens <= 0 in openai api server by @zeyugao in #2688
- Add Microsoft/Orca-2-7b and update model support docs by @BabyChouSr in #2714
- fix tokenizer of chatglm2 by @wangshuai09 in #2711
- Template for using Deepseek code models by @AmaleshV in #2705
- add support for Chinese-LLaMA-Alpaca by @zollty in #2700
- Make --load-8bit flag work with weights in safetensors format by @xuguodong1999 in #2698
- Format code and minor bug fix by @merrymercy in #2716
- Bump version to v0.2.33 by @merrymercy in #2717
New Contributors
- @vjsrinath made their first contribution in #2613
- @uinone made their first contribution in #2628
- @a3213105 made their first contribution in #2615
- @scenaristeur made their first contribution in #2641
- @snapshotpl made their first contribution in #2654
- @peterwilli made their first contribution in #2645
- @ZeyuTeng96 made their first contribution in #2622
- @Force1ess made their first contribution in #2673
- @jm23jeffmorgan made their first contribution in #2680
- @MrZhengXin made their first contribution in #2689
- @lnguyen made their first contribution in #2687
- @wangshuai09 made their first contribution in #2711
- @AmaleshV made their first contribution in #2705
- @zollty made their first contribution in #2700
- @xuguodong1999 made their first contribution in #2698
Full Changelog: v0.2.32...v0.2.33
Release v0.2.32
What's Changed
- Fix for single turn dataset by @toslunar in #2509
- replace os.getenv with os.path.expanduser because the first one doesn… by @khalil-Hennara in #2515
- Fix arena by @merrymercy in #2522
- Update Dockerfile by @dubaoquan404 in #2524
- add Llama2ChangAdapter by @lcw99 in #2510
- Add ExllamaV2 Inference Framework Support. by @leonxia1018 in #2455
- Improve docs by @merrymercy in #2534
- Fix warnings for new gradio versions by @merrymercy in #2538
- Improve chat templates by @merrymercy in #2539
- Add Zephyr 7B Alpha by @lewtun in #2535
- Improve Support for Mistral-Instruct by @Steve-Tech in #2547
- correct max_tokens by context_length instead of raising an exception by @liunux4odoo in #2544
- Revert "Improve Support for Mistral-Instruct" by @merrymercy in #2552
- Fix Mistral template by @normster in #2529
- Add additional Informations from the vllm worker by @SebastianBodza in #2550
- Make FastChat work with LMSYS-Chat-1M Code by @CodingWithTim in #2551
- Create tags attribute to fix MarkupError in rich CLI by @Steve-Tech in #2553
- move BaseModelWorker outside serve.model_worker to make it independent by @liunux4odoo in #2531
- Misc style and bug fixes by @merrymercy in #2559
- Fix README.md by @infwinston in #2561
- release v0.2.31 by @merrymercy in #2563
- resolves #2542 modify dockerfile to upgrade cuda to 12.2.0 and pydantic 1.10.13 by @alexdelapaz in #2565
- Add airoboros_v3 chat template (llama-2 format) by @jondurbin in #2564
- Add Xwin-LM V0.1, V0.2 support by @REIGN12 in #2566
- Fixed model_worker generate_gate may blocked main thread (#2540) by @lvxuan263 in #2562
- feat: add claude-v2 by @congchan in #2571
- Update vigogne template by @bofenghuang in #2580
- Fix issue #2568: --device mps led to TypeError: forward() got an unexpected keyword argument 'padding_mask'. by @Phil-U-U in #2579
- Add Mistral-7B-OpenOrca conversation_template by @waynespa in #2585
- docs: bit misspell comments model adapter default template name conversation by @guspan-tanadi in #2594
- Update Mistral template by @Gk-rohan in #2581
- Update README.md (vicuna-v1.3 -> vicuna-1.5) by @infwinston in #2592
- Update README.md to highlight chatbot arena by @infwinston in #2596
- Add Lemur model by @ugolotti in #2584
- add trust_remote_code=True in BaseModelAdapter by @edisonwd in #2583
- Openai interface add use beam search and best of 2 by @leiwen83 in #2442
- Update qwen and add pygmalion by @Trangle in #2607
- feat: Support model AquilaChat2 by @fangyinc in #2616
- Added settings vllm by @SebastianBodza in #2599
- [Logprobs] Support logprobs=1 by @comaniac in #2612
New Contributors
- @toslunar made their first contribution in #2509
- @khalil-Hennara made their first contribution in #2515
- @dubaoquan404 made their first contribution in #2524
- @leonxia1018 made their first contribution in #2455
- @lewtun made their first contribution in #2535
- @normster made their first contribution in #2529
- @SebastianBodza made their first contribution in #2550
- @alexdelapaz made their first contribution in #2565
- @REIGN12 made their first contribution in #2566
- @lvxuan263 made their first contribution in #2562
- @Phil-U-U made their first contribution in #2579
- @waynespa made their first contribution in #2585
- @guspan-tanadi made their first contribution in #2594
- @Gk-rohan made their first contribution in #2581
- @ugolotti made their first contribution in #2584
- @edisonwd made their first contribution in #2583
- @fangyinc made their first contribution in #2616
- @comaniac made their first contribution in #2612
Full Changelog: v0.2.30...v0.2.32
Release v0.2.30
What's Changed
- Support new models
- Bug fixes
New Contributors
- @wangxiyuan made their first contribution in #2404
- @wangzhen263 made their first contribution in #2402
- @karshPrime made their first contribution in #2406
- @obitolyz made their first contribution in #2408
- @Somezak1 made their first contribution in #2431
- @hi-jin made their first contribution in #2434
- @zhangsibo1129 made their first contribution in #2422
- @tobiabir made their first contribution in #2418
- @Btlmd made their first contribution in #2384
- @brandonbiggs made their first contribution in #2448
- @dongxiaolong made their first contribution in #2463
- @shuishu made their first contribution in #2482
- @asaiacai made their first contribution in #2469
- @hnyls2002 made their first contribution in #2456
- @enochlev made their first contribution in #2499
- @AlpinDale made their first contribution in #2500
- @lerela made their first contribution in #2483
Full Changelog: v0.2.28...v0.2.30
Release v0.2.28
What's Changed
- Multiple UI updates, performance improvements and bug fixes
- New model support (Spicyboros + airoboros 2.2, VMware's OpenLLaMa OpenInstruct)
- Add sponsors (Kaggle, MBZUAI, AnyScale, and Huggingface)
New Contributors
- @nicobasile made their first contribution in #2278
- @zeyugao made their first contribution in #2297
- @fan-chao made their first contribution in #2290
- @leiwen83 made their first contribution in #2273
- @siddartha-RE made their first contribution in #2263
- @renatz made their first contribution in #2296
- @so2liu made their first contribution in #2225
- @epec254 made their first contribution in #2306
- @woshiyyya made their first contribution in #2326
- @vaxilicaihouxian made their first contribution in #2328
- @nathanstitt made their first contribution in #2337
Full Changelog: v0.2.25...v0.2.28
Release v0.2.25
What's Changed
- Support new models (Qwen, WizardCoder, Llama2-Chinese, BAAI/AquilaChat-7B, OpenOrca, BGE)
- Improve performance and usability. Fix bugs.
- Reduce dependency by making some required packages optional
New Contributors
- @azshue made their first contribution in #2169
- @liunux4odoo made their first contribution in #2147
- @tmm1 made their first contribution in #2126
- @shibing624 made their first contribution in #2138
- @Tomorrowxxy made their first contribution in #2185
- @Extremys made their first contribution in #2166
- @gesanqiu made their first contribution in #2192
- @alongLFB made their first contribution in #2202
- @congchan made their first contribution in #2194
- @Rayrtfr made their first contribution in #2218
- @bofenghuang made their first contribution in #2236
- @Cyrilvallez made their first contribution in #2239
- @persistz made their first contribution in #2248
- @LeiZhou-97 made their first contribution in #2247
Full Changelog: v0.2.23...v0.2.25
Release v0.2.22
- Released Vicuna v1.5 based on Llama 2 with 4K and 16K context lengths. Download weights
- Released Chatbot Arena Conversations, a dataset containing 33k conversations with human preferences. Download it here.
- Serving
- Add a multi-model worker that can host multiple models on a single GPU and share base weights for PEFT models. #1866 #1905
- AWQ 4-bit quantization support. #2103
- Support more models (Llama 2, Claude 2, ChatGLM 2, StarChat, Baichuan-13B, InternLM, airoboros, PEFT adapters).
- Better support for AMD GPUs, Intel XPUs. #1954 #2052
- Training
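The multi-model worker above (#1866, #1905) hosts several models behind a single worker process, sharing base weights across PEFT adapters. A minimal launch sketch; the model paths and names are assumptions, not part of the release notes:

```shell
# Host two models on a single GPU with one worker process.
# Repeat --model-path/--model-names per model; paths and names here are examples.
python3 -m fastchat.serve.multi_model_worker \
    --model-path lmsys/vicuna-7b-v1.5 --model-names vicuna-7b-v1.5 \
    --model-path lmsys/longchat-7b-16k --model-names longchat-7b-16k
```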
Release v0.2.18
- Release MT-bench code and data
- Release new models
- Support more models (Falcon, Salesforce/xgen, Salesforce/codet5p-6b, Robin-7B/13B/33B, Baichuan-7B)
- Integrate vLLM worker for continuous batching and high-throughput serving. See doc.
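The vLLM worker above slots into the standard FastChat serving stack. A minimal sketch of a local deployment, with each process run in its own terminal; the model path and port are assumptions:

```shell
# Launch the controller, a vLLM worker, and the OpenAI-compatible API server.
# The model path and port below are example values.
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.vllm_worker --model-path lmsys/vicuna-7b-v1.5
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000
```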