(new-model-registration)=

# Model Registration

vLLM relies on a model registry to determine how to run each model.
A list of pre-registered architectures can be found on the [Supported Models](#supported-mm-models) page.

If your model is not on this list, you must register it with vLLM.
This page provides detailed instructions on how to do so.

## Built-in models

To add a model directly to the vLLM library, start by forking our [GitHub repository](https://github.com/vllm-project/vllm) and then [building it from source](#build-from-source).
This gives you the ability to modify the codebase and test your model.

After you have implemented your model (see [tutorial](#new-model-basic)), put it into the <gh-dir:vllm/model_executor/models> directory.
Then, add your model class to `_VLLM_MODELS` in <gh-file:vllm/model_executor/models/registry.py> so that it is automatically registered upon importing vLLM.
You should also include an example HuggingFace repository for this model in <gh-file:tests/models/registry.py> to run the unit tests.
Finally, update the [Supported Models](#supported-mm-models) documentation page to promote your model!

```{important}
The list of models in each section should be maintained in alphabetical order.
```
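As a rough sketch of what the `_VLLM_MODELS` entry involves: the registry maps an architecture name (as reported in a checkpoint's HuggingFace `config.json`) to the module and class that implement it. The snippet below is a hypothetical, stand-alone illustration of that shape, not the contents of vLLM's actual registry file:

```python
# Illustrative sketch only -- not the real vllm/model_executor/models/registry.py.
# Each entry maps an architecture name to (module name under
# vllm/model_executor/models, class name). The entry shown is hypothetical.
_VLLM_MODELS = {
    # ... existing entries, kept in alphabetical order ...
    "YourModelForCausalLM": ("your_model", "YourModelForCausalLM"),
}

# From this pair, vLLM can later import the implementing class lazily.
module_name, class_name = _VLLM_MODELS["YourModelForCausalLM"]
```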

## Out-of-tree models

You can load an external model using a plugin without modifying the vLLM codebase.

```{seealso}
[vLLM's Plugin System](#plugin-system)
```

To register the model, use the following code:

```python
from vllm import ModelRegistry
from your_code import YourModelForCausalLM

ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
```

If your model imports modules that initialize CUDA, consider lazy-importing it to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:

```python
from vllm import ModelRegistry

ModelRegistry.register_model("YourModelForCausalLM", "your_code:YourModelForCausalLM")
```
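The string form `"your_code:YourModelForCausalLM"` works because the import is deferred until the class is actually needed. A minimal sketch of that mechanism (using stdlib classes as stand-ins, since `your_code` is a hypothetical module):

```python
import importlib


def resolve_lazy(qualname: str):
    """Resolve a "module:ClassName" string; the module is imported only now."""
    module_name, _, class_name = qualname.partition(":")
    module = importlib.import_module(module_name)  # import happens here, not at registration
    return getattr(module, class_name)


# Stand-in for resolve_lazy("your_code:YourModelForCausalLM"):
cls = resolve_lazy("collections:OrderedDict")
```

Because nothing is imported until `resolve_lazy` runs, modules that initialize CUDA stay out of the parent process that later forks its workers.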

```{important}
If your model is a multimodal model, ensure that the model class implements the {class}`~vllm.model_executor.models.interfaces.SupportsMultiModal` interface.
Read more about that [here](#enabling-multimodal-inputs).
```

```{note}
Although you can place these snippets directly in the script that uses `vllm.LLM`, the recommended approach is to put them in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
```
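A minimal sketch of what such a plugin package might look like, assuming vLLM's `vllm.general_plugins` entry-point group; the package, module, and function names below are hypothetical:

```python
# setup.py of a hypothetical out-of-tree plugin package.
from setuptools import setup

setup(
    name="vllm-add-your-model",          # hypothetical package name
    version="0.1",
    packages=["vllm_add_your_model"],
    entry_points={
        # vLLM discovers plugins through this entry-point group and calls
        # the referenced function at startup.
        "vllm.general_plugins": [
            "register_your_model = vllm_add_your_model:register",
        ],
    },
)

# vllm_add_your_model/__init__.py would then contain:
#
#     def register():
#         from vllm import ModelRegistry
#         ModelRegistry.register_model(
#             "YourModelForCausalLM",
#             "vllm_add_your_model.model:YourModelForCausalLM",
#         )
```

Installing the package makes the registration run in every vLLM process, including the API server and distributed workers.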