[MFM-20250115] Merge from ROCm/main to llama_fp8 #360
Merged
hongxiayang merged 537 commits into ROCm:llama_fp8_12062024 from EmbeddedLLM:main-to-llama-fp8 on Jan 15, 2025
+59,614 −29,878
Commits
This pull request is big! Only the most recent 250 commits are shown.
Commits on Dec 24, 2024
Commits on Dec 25, 2024
Commits on Dec 26, 2024
Commits on Dec 27, 2024
- [Misc] Improve BNB loader to handle mixture of sharded and merged weights with same suffix (vllm-project#11566)
Commits on Dec 28, 2024
Commits on Dec 29, 2024
Commits on Dec 30, 2024
Commits on Dec 31, 2024
- [Bugfix] Move the _touch(computed_blocks) call in the allocate_slots method to after the check for allocating new blocks. (vllm-project#11565)
Commits on Jan 1, 2025
Commits on Jan 2, 2025
- [Bugfix] Free cross attention block table for preempted-for-recompute sequence group. (vllm-project#10013)
Commits on Jan 3, 2025
Commits on Jan 4, 2025
- [Core][Bugfix] Use correct device to initialize GPU data during CUDA-graph-capture (vllm-project#11233)
Commits on Jan 5, 2025
Commits on Jan 6, 2025
- [V1] Extend beyond image modality and support mixed-modality inference with Llava-OneVision (vllm-project#11685)
Commits on Jan 7, 2025
Commits on Jan 8, 2025
- [Kernel][Triton][AMD] Use block size heuristic for avg 2.8x speedup for int8 models (vllm-project#11698)
Commits on Jan 9, 2025
Commits on Jan 10, 2025
- [Bugfix] Check that number of images matches number of <|image|> tokens with mllama (vllm-project#11939)
Commits on Jan 11, 2025
- [Bugfix][SpecDecode] Adjust Eagle model architecture to align with intended design (vllm-project#11672)
Commits on Jan 12, 2025
Commits on Jan 13, 2025
Commits on Jan 14, 2025
Commits on Jan 15, 2025