[Doc] Add model development API Reference #11884

2 changes: 1 addition & 1 deletion .buildkite/test-pipeline.yaml
@@ -38,7 +38,7 @@ steps:
 - pip install -r requirements-docs.txt
 - SPHINXOPTS=\"-W\" make html
 # Check API reference (if it fails, you may have missing mock imports)
-- grep \"sig sig-object py\" build/html/api/params.html
+- grep \"sig sig-object py\" build/html/api/inference_params.html

- label: Async Engine, Inputs, Utils, Worker Test # 24min
fast_check: true
@@ -1,6 +1,6 @@
-# Optional Parameters
+# Inference Parameters

-Optional parameters for vLLM APIs.
+Inference parameters for vLLM APIs.

(sampling-params)=

@@ -19,4 +19,3 @@ Optional parameters for vLLM APIs.
.. autoclass:: vllm.PoolingParams
:members:
```

9 changes: 9 additions & 0 deletions docs/source/api/model/adapters.md
@@ -0,0 +1,9 @@
# Model Adapters

## Module Contents

```{eval-rst}
.. automodule:: vllm.model_executor.models.adapters
:members:
:member-order: bysource
```
12 changes: 12 additions & 0 deletions docs/source/api/model/index.md
@@ -0,0 +1,12 @@
# Model Development

## Submodules

```{toctree}
:maxdepth: 1

interfaces_base
interfaces
adapters
```

9 changes: 9 additions & 0 deletions docs/source/api/model/interfaces.md
@@ -0,0 +1,9 @@
# Optional Interfaces

## Module Contents

```{eval-rst}
.. automodule:: vllm.model_executor.models.interfaces
:members:
:member-order: bysource
```
9 changes: 9 additions & 0 deletions docs/source/api/model/interfaces_base.md
@@ -0,0 +1,9 @@
# Base Model Interfaces

## Module Contents

```{eval-rst}
.. automodule:: vllm.model_executor.models.interfaces_base
:members:
:member-order: bysource
```
3 changes: 2 additions & 1 deletion docs/source/index.md
@@ -139,8 +139,9 @@ community/sponsors

api/offline_inference/index
api/engine/index
+api/inference_params
api/multimodal/index
-api/params
+api/model/index
```

% Design Documents: Details about vLLM internals
11 changes: 7 additions & 4 deletions vllm/model_executor/models/interfaces.py
@@ -38,13 +38,15 @@ def get_multimodal_embeddings(self, **kwargs) -> Optional[T]:
to be merged with text embeddings.

 The output embeddings must be one of the following formats:
+
 - A list or tuple of 2D tensors, where each tensor corresponds to
-each input multimodal data item (e.g, image).
+  each input multimodal data item (e.g, image).
 - A single 3D tensor, with the batch dimension grouping the 2D tensors.

-NOTE: The returned multimodal embeddings must be in the same order as
-the appearances of their corresponding multimodal data item in the
-input prompt.
+Note:
+    The returned multimodal embeddings must be in the same order as
+    the appearances of their corresponding multimodal data item in the
+    input prompt.
"""
...

@@ -59,6 +61,7 @@ def get_input_embeddings(
) -> torch.Tensor:
...

+@overload
def get_input_embeddings(
self,
input_ids: torch.Tensor,
3 changes: 3 additions & 0 deletions vllm/model_executor/models/interfaces_base.py
@@ -35,6 +35,7 @@

@runtime_checkable
class VllmModel(Protocol[C_co, T_co]):
+"""The interface required for all models in vLLM."""

def __init__(
self,
@@ -97,6 +98,7 @@ def is_vllm_model(

@runtime_checkable
class VllmModelForTextGeneration(VllmModel[C_co, T], Protocol[C_co, T]):
+"""The interface required for all generative models in vLLM."""

def compute_logits(
self,
@@ -142,6 +144,7 @@ def is_text_generation_model(

@runtime_checkable
class VllmModelForPooling(VllmModel[C_co, T], Protocol[C_co, T]):
+"""The interface required for all pooling models in vLLM."""

def pooler(
self,
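
Review note: the three interfaces that gain docstrings in `interfaces_base.py` are declared as `@runtime_checkable` protocols, so a class can satisfy them structurally without inheriting from them. The following is a minimal sketch of that mechanism using only the standard library. `SupportsPooling`, `MeanPoolModel`, `NoPoolModel`, and `supports_pooling` are hypothetical names for illustration and are not part of vLLM, and the sketch does not attempt to reproduce how vLLM's own `is_vllm_model` or `is_text_generation_model` helpers are implemented.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsPooling(Protocol):
    """Hypothetical stand-in for an interface such as VllmModelForPooling."""

    def pooler(self, hidden_states: list[float]) -> float:
        ...


class MeanPoolModel:
    # Does not inherit from SupportsPooling; it only defines the method.
    def pooler(self, hidden_states: list[float]) -> float:
        return sum(hidden_states) / len(hidden_states)


class NoPoolModel:
    # Lacks a `pooler` method, so it does not satisfy the protocol.
    def forward(self, hidden_states: list[float]) -> list[float]:
        return hidden_states


def supports_pooling(model: object) -> bool:
    # isinstance() against a runtime-checkable Protocol only checks that
    # the named methods exist, not their signatures or return types.
    return isinstance(model, SupportsPooling)
```

Because the check is limited to method presence, a class whose `pooler` has the wrong signature would still pass, so any stricter validation has to be done separately.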