Pull requests: vllm-project/vllm
#12185 [Kernel] add triton fused moe kernel for gptq/awq
  opened Jan 18, 2025 by jinzhen-lin

#12184 [Misc] Add BNB support to GLM4-V model
  labels: ready (ONLY add when PR is ready to merge/full CI is needed)
  opened Jan 18, 2025 by Isotr0py

#12182 [torch.compile] store inductor compiled Python file
  labels: ready
  opened Jan 18, 2025 by youkaichao

#12167 [Hardware][Gaudi][Bugfix] Fix HPU tensor parallelism, enable multiprocessing executor
  opened Jan 17, 2025 by kzawora-intel

#12158 [Quantization/Parameter] WIP: Another Implementation of the Quantization Parameter Subclass Substitution
  opened Jan 17, 2025 by cennn

#12156 [Core] Optimize topp/topk calculation in sampler
  opened Jan 17, 2025 by afierka-intel (Draft)

#12141 [WIP][Hardware][CPU] testing branch for mlperf
  labels: ci/build, documentation (Improvements or additions to documentation), needs-rebase
  opened Jan 17, 2025 by bigPYJ1151 (Draft)

#12128 [V1] Add V1 support of Qwen2-VL
  labels: documentation, ready
  opened Jan 16, 2025 by ywang96

#12120 [Misc] Update to Transformers 4.48
  labels: ci/build, ready
  opened Jan 16, 2025 by tlrmchlsmth

#12116 [BUILD] Add VLLM_BUILD_EXT to control custom op build
  labels: ci/build
  opened Jan 16, 2025 by MengqingCao

#12103 [Misc] add modules_to_not_convert attribute to gptq series
  opened Jan 16, 2025 by 1096125073

#12098 Use CUDA 12.4 as default for release and nightly wheels
  labels: ci/build, documentation
  opened Jan 15, 2025 by mgoin

#12097 Add: Support for Sparse24Bitmask Compressed Models
  opened Jan 15, 2025 by rahul-tuli (Draft)

#12094 [V1][Perf] Reduce scheduling overhead in model runner after cuda sync
  opened Jan 15, 2025 by youngkent

#12093 [WIP][Kernel] Flash Attention 3 Support
  labels: ci/build
  opened Jan 15, 2025 by LucasWilkinson (Draft)

#12086 [V1][WIP] Add KV cache group dimension to block table
  opened Jan 15, 2025 by heheda12345 (Draft)

#12081 [V1] Add notes on test_async_engine.py::test_abort
  opened Jan 15, 2025 by heheda12345

#12078 [V1] Optimize block table copy from CPU to GPU (take 2)
  labels: ci/build
  opened Jan 15, 2025 by WoosukKwon (Draft)

#12074 [Bugfix] Fix num_heads value for simple connector when tp enabled
  labels: ready
  opened Jan 15, 2025 by ShangmingCai