[Doc]: Update default max_num_batch_tokens for chunked prefill #11319

toslunar · 2024-12-19T04:29:52Z

#10544 only changed the default in the code. The value stated in the documentation (https://docs.vllm.ai/en/v0.6.5/usage/performance.html) should be updated to match.

simon-mo · 2024-12-20T21:35:58Z

PR welcomed. Thanks!

SachinVarghese · 2025-01-02T20:33:16Z

Created PR #11694. PTAL

toslunar added the documentation label Dec 19, 2024

SachinVarghese mentioned this issue Jan 2, 2025

Update default max_num_batch_tokens for chunked prefill #11694

Merged

mgoin closed this as completed in #11694 Jan 3, 2025
