# [Doc][4/N] Reorganize API Reference #11843

Merged · 2 commits · Jan 8, 2025
2 changes: 1 addition & 1 deletion .buildkite/test-pipeline.yaml
````diff
@@ -38,7 +38,7 @@ steps:
   - pip install -r requirements-docs.txt
   - SPHINXOPTS="-W" make html
   # Check API reference (if it fails, you may have missing mock imports)
-  - grep "sig sig-object py" build/html/dev/sampling_params.html
+  - grep "sig sig-object py" build/html/api/params.html

 - label: Async Engine, Inputs, Utils, Worker Test # 24min
   fast_check: true
````
4 changes: 2 additions & 2 deletions Dockerfile
````diff
@@ -2,8 +2,8 @@
 # to run the OpenAI compatible server.

 # Please update any changes made here to
-# docs/source/dev/dockerfile/dockerfile.md and
-# docs/source/assets/dev/dockerfile-stages-dependency.png
+# docs/source/contributing/dockerfile/dockerfile.md and
+# docs/source/assets/contributing/dockerfile-stages-dependency.png

 ARG CUDA_VERSION=12.4.1
 #################### BASE BUILD IMAGE ####################
````
File renamed without changes.
````diff
@@ -11,18 +11,8 @@ vLLM provides experimental support for multi-modal models through the {mod}`vllm
 Multi-modal inputs can be passed alongside text and token prompts to [supported models](#supported-mm-models)
 via the `multi_modal_data` field in {class}`vllm.inputs.PromptType`.

-Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities
-by following [this guide](#adding-multimodal-plugin).

 Looking to add your own multi-modal model? Please follow the instructions listed [here](#enabling-multimodal-inputs).

-## Guides
-
-```{toctree}
-:maxdepth: 1
-
-adding_multimodal_plugin
-```
-
 ## Module Contents
````
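For context, the `multi_modal_data` field mentioned in the kept text is used like this in offline inference. A minimal sketch; the model name and image file are illustrative, not prescribed by this PR:

```python
from vllm import LLM
from PIL import Image

# Illustrative vision-language model; any supported multi-modal model works.
llm = LLM(model="llava-hf/llava-1.5-7b-hf")

# The image placeholder token follows the model's own chat format.
prompt = "USER: <image>\nWhat is in this image?\nASSISTANT:"
image = Image.open("example.jpg")  # hypothetical local file

# multi_modal_data is passed alongside the text prompt.
outputs = llm.generate({
    "prompt": prompt,
    "multi_modal_data": {"image": image},
})
print(outputs[0].outputs[0].text)
```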
22 changes: 22 additions & 0 deletions docs/source/api/params.md
````diff
@@ -0,0 +1,22 @@
+# Optional Parameters
+
+Optional parameters for vLLM APIs.
+
+(sampling-params)=
+
+## Sampling Parameters
+
+```{eval-rst}
+.. autoclass:: vllm.SamplingParams
+    :members:
+```
+
+(pooling-params)=
+
+## Pooling Parameters
+
+```{eval-rst}
+.. autoclass:: vllm.PoolingParams
+    :members:
+```
+
````

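The two classes documented on this new page are constructed directly by users; a minimal sketch of both (the parameter values are illustrative, not defaults):

```python
from vllm import SamplingParams, PoolingParams

# Sampling parameters control text generation.
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

# Pooling parameters apply to pooling (e.g. embedding) models; the class is
# currently a placeholder, so it carries little beyond `additional_data`.
pooling = PoolingParams()
```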
2 changes: 1 addition & 1 deletion docs/source/contributing/dockerfile/dockerfile.md
````diff
@@ -17,7 +17,7 @@ The edges of the build graph represent:

 - `RUN --mount=(.\*)from=...` dependencies (with a dotted line and an empty diamond arrow head)

-> ```{figure} ../../assets/dev/dockerfile-stages-dependency.png
+> ```{figure} /assets/contributing/dockerfile-stages-dependency.png
 > :align: center
 > :alt: query
 > :width: 100%
````
2 changes: 1 addition & 1 deletion docs/source/design/arch_overview.md
````diff
@@ -53,7 +53,7 @@ for output in outputs:
 ```

 More API details can be found in the {doc}`Offline Inference
-</dev/offline_inference/offline_index>` section of the API docs.
+</api/offline_inference/index>` section of the API docs.

 The code for the `LLM` class can be found in <gh-file:vllm/entrypoints/llm.py>.
````
16 changes: 0 additions & 16 deletions docs/source/design/multimodal/adding_multimodal_plugin.md

This file was deleted.

6 changes: 0 additions & 6 deletions docs/source/dev/pooling_params.md

This file was deleted.

6 changes: 0 additions & 6 deletions docs/source/dev/sampling_params.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/source/getting_started/quickstart.md
````diff
@@ -42,7 +42,7 @@ The first line of this example imports the classes {class}`~vllm.LLM` and {class
 from vllm import LLM, SamplingParams
 ```

-The next section defines a list of input prompts and sampling parameters for text generation. The [sampling temperature](https://arxiv.org/html/2402.05201v1) is set to `0.8` and the [nucleus sampling probability](https://en.wikipedia.org/wiki/Top-p_sampling) is set to `0.95`. You can find more information about the sampling parameters [here](https://docs.vllm.ai/en/stable/dev/sampling_params.html).
+The next section defines a list of input prompts and sampling parameters for text generation. The [sampling temperature](https://arxiv.org/html/2402.05201v1) is set to `0.8` and the [nucleus sampling probability](https://en.wikipedia.org/wiki/Top-p_sampling) is set to `0.95`. You can find more information about the sampling parameters [here](#sampling-params).

 ```python
 prompts = [
````
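The quickstart snippet the hunk refers to looks roughly like this end to end (a sketch; the prompt list is abbreviated and the model name follows the quickstart's small default):

```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The future of AI is",
]
# Values match the prose above: temperature 0.8, nucleus sampling p = 0.95.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)
```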
9 changes: 4 additions & 5 deletions docs/source/index.md
````diff
@@ -137,10 +137,10 @@ community/sponsors
 :caption: API Reference
 :maxdepth: 2

-dev/sampling_params
-dev/pooling_params
-dev/offline_inference/offline_index
-dev/engine/engine_index
+api/offline_inference/index
+api/engine/index
+api/multimodal/index
+api/params
 ```

 % Design Documents: Details about vLLM internals
@@ -154,7 +154,6 @@ design/huggingface_integration
 design/plugin_system
 design/kernel/paged_attention
 design/input_processing/model_inputs_index
-design/multimodal/multimodal_index
 design/automatic_prefix_caching
 design/multiprocessing
 ```
````
2 changes: 1 addition & 1 deletion docs/source/serving/offline_inference.md
````diff
@@ -23,7 +23,7 @@ The available APIs depend on the type of model that is being run:

 Please refer to the above pages for more details about each API.

 ```{seealso}
-[API Reference](/dev/offline_inference/offline_index)
+[API Reference](/api/offline_inference/index)
 ```
````
8 changes: 4 additions & 4 deletions docs/source/serving/openai_compatible_server.md
````diff
@@ -195,7 +195,7 @@ Code example: <gh-file:examples/openai_completion_client.py>

 #### Extra parameters

-The following [sampling parameters (click through to see documentation)](../dev/sampling_params.md) are supported.
+The following [sampling parameters](#sampling-params) are supported.

 ```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
 :language: python
@@ -226,7 +226,7 @@ Code example: <gh-file:examples/openai_chat_completion_client.py>

 #### Extra parameters

-The following [sampling parameters (click through to see documentation)](../dev/sampling_params.md) are supported.
+The following [sampling parameters](#sampling-params) are supported.

 ```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
 :language: python
@@ -259,7 +259,7 @@ Code example: <gh-file:examples/openai_embedding_client.py>

 #### Extra parameters

-The following [pooling parameters (click through to see documentation)](../dev/pooling_params.md) are supported.
+The following [pooling parameters](#pooling-params) are supported.

 ```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
 :language: python
@@ -447,7 +447,7 @@ Response:

 #### Extra parameters

-The following [pooling parameters (click through to see documentation)](../dev/pooling_params.md) are supported.
+The following [pooling parameters](#pooling-params) are supported.

 ```{literalinclude} ../../../vllm/entrypoints/openai/protocol.py
 :language: python
````
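The extra sampling and pooling parameters referenced in these hunks are passed to the server through the OpenAI client's `extra_body` field. A minimal sketch, assuming a vLLM OpenAI-compatible server is already running; the URL, model name, and `top_k` value are illustrative:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="facebook/opt-125m",  # illustrative model name
    messages=[{"role": "user", "content": "Hello!"}],
    # vLLM-specific sampling parameters go in extra_body.
    extra_body={"top_k": 50},
)
print(completion.choices[0].message.content)
```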
3 changes: 0 additions & 3 deletions vllm/multimodal/base.py
````diff
@@ -49,9 +49,6 @@ class MultiModalPlugin(ABC):
     process the same data differently). This registry is in turn used by
     :class:`~MultiModalRegistry` which acts at a higher level
     (i.e., the modality of the data).
-
-    See also:
-        :ref:`adding-multimodal-plugin`
     """

     def __init__(self) -> None:
````
6 changes: 0 additions & 6 deletions vllm/multimodal/inputs.py
````diff
@@ -99,12 +99,6 @@ class MultiModalDataBuiltins(TypedDict, total=False):

 MultiModalDataDict: TypeAlias = Mapping[str, ModalityData[Any]]
 """
 A dictionary containing an entry for each modality type to input.
-
-Note:
-    This dictionary also accepts modality keys defined outside
-    :class:`MultiModalDataBuiltins` as long as a customized plugin
-    is registered through the :class:`~vllm.multimodal.MULTIMODAL_REGISTRY`.
-    Read more on that :ref:`here <adding-multimodal-plugin>`.
 """

````
3 changes: 0 additions & 3 deletions vllm/multimodal/registry.py
````diff
@@ -125,9 +125,6 @@ def __init__(
     def register_plugin(self, plugin: MultiModalPlugin) -> None:
         """
         Register a multi-modal plugin so it can be recognized by vLLM.
-
-        See also:
-            :ref:`adding-multimodal-plugin`
         """
         data_type_key = plugin.get_data_key()

````
2 changes: 1 addition & 1 deletion vllm/pooling_params.py
````diff
@@ -7,7 +7,7 @@ class PoolingParams(
         msgspec.Struct,
         omit_defaults=True,  # type: ignore[call-arg]
         array_like=True):  # type: ignore[call-arg]
-    """Pooling parameters for embeddings API.
+    """API parameters for pooling models. This is currently a placeholder.

     Attributes:
         additional_data: Any additional data needed for pooling.
````
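Since the new docstring describes the class as a placeholder, usage is correspondingly thin. A hedged sketch of how it might be passed to a pooling model via the offline `LLM.encode` API; the model name is illustrative:

```python
from vllm import LLM, PoolingParams

# Illustrative embedding model; any pooling-capable model would do.
llm = LLM(model="intfloat/e5-mistral-7b-instruct")

# additional_data is currently the only field PoolingParams carries.
outputs = llm.encode(["Hello, world!"], pooling_params=PoolingParams())
```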