
Ascend NPU plugin for vLLM

Use Docker

1. Clone vllm and vllm-ascend

git clone https://github.com/cosdt/vllm-ascend
cd vllm-ascend
git clone https://github.com/cosdt/vllm -b apply_plugin

2. Build Docker Image

Note

Modify the base image in the Dockerfile according to your device. More options can be found at https://hub.docker.com/r/ascendai/cann.

docker build -t vllm-npu .
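For reference, the base image is selected by the first line of the Dockerfile; pick a tag from the ascendai/cann repository that matches your device and OS (the tag below is only a placeholder, not a real tag):

FROM ascendai/cann:&lt;tag-for-your-device&gt;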

3. Run docker container

Note

Modify --device /dev/davinci0 according to your device.

docker run -dit \
  -v /usr/local/dcmi:/usr/local/dcmi \
  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
  -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  --device /dev/davinci0 \
  --device /dev/davinci_manager \
  --device /dev/devmm_svm \
  --device /dev/hisi_hdc \
  --shm-size 16G \
  --name vllm \
  vllm-npu:latest bash
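If you are unsure which device to pass, list the NPU device nodes on the host first; each /dev/davinciN corresponds to one NPU:

ls /dev/davinci*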

4. Enter the container

docker exec -it vllm bash
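Once inside the container, npu-smi (mounted from the host above) is a quick way to confirm the NPU is visible:

npu-smi info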

Install from source

1. Prepare CANN env

Before installing vllm_ascend, you need to install the Ascend CANN Toolkit and Kernels. Follow the official installation tutorial, or use the following commands for a quick installation:

# replace the URLs below according to your CANN version and device
# install CANN Toolkit
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C17SPC701/Ascend-cann-toolkit_8.0.RC3.alpha003_linux-"$(uname -i)".run
bash Ascend-cann-toolkit_8.0.RC3.alpha003_linux-"$(uname -i)".run --install

# install CANN Kernels
wget https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Milan-ASL/Milan-ASL%20V100R001C17SPC701/Ascend-cann-kernels-910b_8.0.RC3.alpha003_linux.run
bash Ascend-cann-kernels-910b_8.0.RC3.alpha003_linux.run --install

# set env variables
source /usr/local/Ascend/ascend-toolkit/set_env.sh
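As a quick sanity check that the CANN environment is active, confirm the toolkit binaries are now on PATH (atc, the Ascend compiler, ships with the toolkit):

which atc
# should print a path under /usr/local/Ascend/ascend-toolkit/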

2. Install vLLM (CPU build)

git clone https://github.com/cosdt/vllm -b apply_plugin
cd vllm

sudo apt-get update  -y
sudo apt-get install -y gcc-12 g++-12 libnuma-dev
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12

pip install "cmake>=3.26" wheel packaging ninja "setuptools-scm>=8" numpy
pip install -r requirements-cpu.txt

VLLM_TARGET_DEVICE=cpu python setup.py install
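To verify that the CPU build installed correctly, import vllm and print its version:

python -c "import vllm; print(vllm.__version__)"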

Note

Ubuntu 22.04 is highly recommended, as the installation on Ubuntu 20.04 may run into errors.

3. Install vllm_ascend

git clone https://github.com/cosdt/vllm-ascend
cd vllm-ascend
pip install -e .
vllm_ascend has the following version requirements:

Requirement   Minimum   Recommended
CANN          8.0.RC2   8.0.RC3
torch         2.4.0     2.5.1
torch-npu     2.4.0     2.5.1rc3

Note

torch 2.5.1 is highly recommended because vLLM depends on it heavily.
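With everything above installed, a minimal offline-inference smoke test looks like the sketch below. It assumes the plugin registers the NPU platform automatically once vllm_ascend is installed; the model name is only an example and can be any supported HuggingFace model id:

python - <<'EOF'
from vllm import LLM, SamplingParams

# Example model id; replace with any model you have access to.
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
params = SamplingParams(temperature=0.8, max_tokens=32)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
EOF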

Supported Devices

  • Atlas 800I A2 Inference Server
  • Atlas 800T A2 Training Server
  • Atlas 300T A2 Training Card

Contributing

Linting and formatting:

pip install -r requirements-lint.txt

# 1. Make and commit your changes.
# 2. Format files that differ from origin/main.
bash format.sh
# 3. Commit changed files with message 'Run yapf and ruff'
git commit -m "Run yapf and ruff"
