
Commit: Update
Signed-off-by: DarkLight1337 <[email protected]>
DarkLight1337 committed Jan 6, 2025
1 parent 22677bc commit b831cf3
Showing 4 changed files with 62 additions and 46 deletions.
16 changes: 5 additions & 11 deletions docs/source/contributing/model/basic.md
@@ -6,18 +6,12 @@ This guide walks you through the steps to implement a basic vLLM model.

 ## 1. Bring your model code

-Start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source](#build-from-source).
-This gives you the ability to modify the codebase and test your model.
-
-Clone the PyTorch model code from the HuggingFace Transformers repository and put it into the <gh-dir:vllm/model_executor/models> directory.
-For instance, vLLM's [OPT model](gh-file:vllm/model_executor/models/opt.py) was adapted from the HuggingFace's [modeling_opt.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/opt/modeling_opt.py) file.
+First, clone the PyTorch model code from the source repository.
+For instance, vLLM's [OPT model](gh-file:vllm/model_executor/models/opt.py) was adapted from
+HuggingFace's [modeling_opt.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/opt/modeling_opt.py) file.

 ```{warning}
-When copying the model code, make sure to review and adhere to the code's copyright and licensing terms.
-```
-
-```{tip}
-If you don't want to fork the repository and modify vLLM's codebase, please refer to [Out-of-Tree Model Integration](#new-model-oot).
+Make sure to review and adhere to the original code's copyright and licensing terms!
 ```

 ## 2. Make your code compatible with vLLM
@@ -105,4 +99,4 @@ This method should load the weights from the HuggingFace's checkpoint file and a

 ## 5. Register your model

-Finally, add your `*ForCausalLM` class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is available by default.
+See [this page](#new-model-registration) for instructions on how to register your new model to be used by vLLM.
2 changes: 1 addition & 1 deletion docs/source/contributing/model/index.md
@@ -10,7 +10,7 @@ This section provides more information on how to integrate a [HuggingFace Transf
 basic
 multimodal
-oot
+registration
 ```

```{note}
34 changes: 0 additions & 34 deletions docs/source/contributing/model/oot.md

This file was deleted.

56 changes: 56 additions & 0 deletions docs/source/contributing/model/registration.md
@@ -0,0 +1,56 @@
(new-model-registration)=

# Model Registration

vLLM relies on a model registry to determine how to run each model.
A list of pre-registered architectures can be found on the [Supported Models](#supported-models) page.

If your model is not on this list, you must register it with vLLM.
This page provides detailed instructions on how to do so.

## Built-in models

To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [build it from source](#build-from-source).
This gives you the ability to modify the codebase and test your model.

After you have implemented your model (see [tutorial](#new-model-basic)), put it into the <gh-dir:vllm/model_executor/models> directory.
Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
You should also include an example HuggingFace repository for this model in <gh-file:tests/models/registry.py> to run the unit tests.
Finally, update the [Supported Models](#supported-models) documentation page to promote your model!

```{important}
The list of models in each section should be maintained in alphabetical order.
```
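
To make that step concrete, here is a minimal sketch of what such an entry might look like. It assumes a hypothetical `your_model.py` module under <gh-dir:vllm/model_executor/models>; the exact layout of the registry dict varies between vLLM versions, so match the surrounding entries in your checkout:

```python
# Sketch of a hypothetical entry in vllm/model_executor/models/registry.py.
# The key is the architecture name from the model's config.json; the value
# points at the module and class that implement it.
_VLLM_MODELS = {
    # ... existing entries ...
    "YourModelForCausalLM": ("your_model", "YourModelForCausalLM"),
}
```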

## Out-of-tree models

You can load an external model using a plugin without modifying the vLLM codebase.

```{seealso}
[vLLM's Plugin System](#plugin-system)
```

To register the model, use the following code:

```python
from vllm import ModelRegistry
from your_code import YourModelForCausalLM

ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
```
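
Once registered, the model is resolved like any built-in architecture: vLLM matches the name against the `architectures` field in the checkpoint's `config.json`. A usage sketch, with a placeholder checkpoint path:

```python
from vllm import LLM

# "path/to/your_model" is a placeholder for a checkpoint whose
# config.json lists "YourModelForCausalLM" under "architectures".
llm = LLM(model="path/to/your_model")
outputs = llm.generate("Hello, my name is")
print(outputs[0].outputs[0].text)
```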

If your model imports modules that initialize CUDA, consider lazy-importing your model to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:

```python
from vllm import ModelRegistry

ModelRegistry.register_model("YourModelForCausalLM", "your_code:YourModelForCausalLM")
```

```{important}
If your model is a multimodal model, ensure the model class implements the {class}`~vllm.model_executor.models.interfaces.SupportsMultiModal` interface.
Read more about that [here](#enabling-multimodal-inputs).
```
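
As a rough sketch of what that declaration looks like (not a complete implementation; the interface also requires the multimodal processing hooks covered in the linked page):

```python
import torch.nn as nn

from vllm.model_executor.models.interfaces import SupportsMultiModal

class YourModelForCausalLM(nn.Module, SupportsMultiModal):
    # Implement the usual forward/load_weights methods plus the
    # multimodal hooks described in "Enabling Multimodal Inputs".
    ...
```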

```{note}
Although you can put these code snippets directly in a script that uses `vllm.LLM`, the recommended way is to place them in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
```
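
For reference, a minimal sketch of such a plugin, assuming a hypothetical package name and the `vllm.general_plugins` entry point group from [vLLM's Plugin System](#plugin-system):

```python
# setup.py of a hypothetical plugin package; all names are placeholders.
from setuptools import setup

setup(
    name="vllm_your_model_plugin",
    version="0.1",
    packages=["vllm_your_model_plugin"],
    entry_points={
        # vLLM discovers and calls the referenced function at startup.
        "vllm.general_plugins": [
            "register_your_model = vllm_your_model_plugin:register"
        ]
    },
)
```

The referenced function then performs the registration shown above:

```python
# vllm_your_model_plugin/__init__.py
def register():
    from vllm import ModelRegistry

    # Lazy string reference, as recommended above for CUDA safety.
    ModelRegistry.register_model(
        "YourModelForCausalLM", "your_code:YourModelForCausalLM")
```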
