Add docker-npu #4355

Merged: 8 commits, Jun 24, 2024
61 changes: 50 additions & 11 deletions README.md
@@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |

Docker image:

- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)

Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use.

If you cannot run inference on NPU devices, try setting `do_sample: false` in the configurations.
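
For example, a minimal sketch of selecting devices (the training config path is illustrative, not part of this PR):

```bash
# Select the first two NPUs, analogous to CUDA_VISIBLE_DEVICES=0,1
ASCEND_RT_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```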
@@ -424,17 +419,33 @@ llamafactory-cli webui

### Build Docker

#### Use Docker
For CUDA users:

```bash
docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
docker-compose exec llamafactory bash
```

For Ascend NPU users:

```bash
docker build -f ./Dockerfile \
docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
docker-compose exec llamafactory bash
```

<details><summary>Build without Docker Compose</summary>

For CUDA users:

```bash
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg INSTALL_BNB=false \
--build-arg INSTALL_VLLM=false \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .

docker run -it --gpus=all \
docker run -dit --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
--shm-size 16G \
--name llamafactory \
llamafactory:latest

docker exec -it llamafactory bash
```
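
As a quick sanity check (assuming the NVIDIA Container Toolkit is set up on the host), you can confirm the GPUs are visible inside the container:

```bash
# Should list the host GPUs if --gpus=all took effect
docker exec -it llamafactory nvidia-smi
```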

#### Use Docker Compose
For Ascend NPU users:

```bash
docker-compose up -d
docker-compose exec llamafactory bash
# Choose the Docker image according to your environment
docker build -f ./docker/docker-npu/Dockerfile \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .

# Adjust the `device` entries according to your resources
docker run -dit \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/davinci0 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--shm-size 16G \
--name llamafactory \
llamafactory:latest

docker exec -it llamafactory bash
```
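
Likewise, since `npu-smi` is bind-mounted from the host above, a quick sanity check that the NPU is visible inside the container:

```bash
# Should report the /dev/davinci0 device mapped into the container
docker exec -it llamafactory npu-smi info
```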

</details>

<details><summary>Details about volumes</summary>

- hf_cache: Uses the Hugging Face cache on the host machine. Can be reassigned if a cache already exists in a different directory.
63 changes: 51 additions & 12 deletions README_zh.md
@@ -360,7 +360,7 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl

<details><summary>Guide for Ascend NPU users</summary>

To install LLaMA Factory on Ascend NPU devices, specify the extra dependencies with `pip install -e '.[torch-npu,metrics]'`. You also need to install **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**; follow the [installation tutorial](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html) or use the following commands:
To install LLaMA Factory on Ascend NPU devices, specify the extra dependencies with `pip install -e ".[torch-npu,metrics]"`. You also need to install **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**; follow the [installation tutorial](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html) or use the following commands:

```bash
# Replace the URL with the one matching your CANN version and device model
@@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |

Docker image:

- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)

Use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the compute device.

If inference does not work properly, try setting `do_sample: false`.
@@ -424,17 +419,33 @@ llamafactory-cli webui

### Build Docker

#### Use Docker
For CUDA users:

```bash
docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
docker-compose exec llamafactory bash
```

For Ascend NPU users:

```bash
docker build -f ./Dockerfile \
docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
docker-compose exec llamafactory bash
```

<details><summary>Build without Docker Compose</summary>

For CUDA users:

```bash
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg INSTALL_BNB=false \
--build-arg INSTALL_VLLM=false \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .

docker run -it --gpus=all \
docker run -dit --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
--shm-size 16G \
--name llamafactory \
llamafactory:latest

docker exec -it llamafactory bash
```

#### Use Docker Compose
For Ascend NPU users:

```bash
docker-compose up -d
docker-compose exec llamafactory bash
# Choose the Docker image according to your environment
docker build -f ./docker/docker-npu/Dockerfile \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .

# Adjust the `device` entries according to your resources
docker run -dit \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/davinci0 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--shm-size 16G \
--name llamafactory \
llamafactory:latest

docker exec -it llamafactory bash
```

</details>

<details><summary>Details about volumes</summary>

- hf_cache: Uses the Hugging Face cache on the host machine; it can be remapped to a new directory.
10 changes: 5 additions & 5 deletions Dockerfile → docker/docker-cuda/Dockerfile
@@ -12,13 +12,14 @@ ARG PIP_INDEX=https://pypi.org/simple
WORKDIR /app

# Install the requirements
COPY requirements.txt /app/
COPY requirements.txt /app
RUN pip config set global.index-url $PIP_INDEX
RUN pip config set global.extra-index-url $PIP_INDEX
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements.txt

# Copy the rest of the application into the image
COPY . /app/
COPY . /app

# Install the LLaMA Factory
RUN EXTRA_PACKAGES="metrics"; \
@@ -38,10 +39,9 @@ RUN EXTRA_PACKAGES="metrics"; \
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]

# Expose port 7860 for the LLaMA Board
ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860

# Expose port 8000 for the API service
ENV API_PORT 8000
EXPOSE 8000

# Launch LLaMA Board
CMD [ "llamafactory-cli", "webui" ]
4 changes: 2 additions & 2 deletions docker-compose.yml → docker/docker-cuda/docker-compose.yml
@@ -1,8 +1,8 @@
services:
llamafactory:
build:
dockerfile: Dockerfile
context: .
dockerfile: ./docker/docker-cuda/Dockerfile
context: ../..
args:
INSTALL_BNB: false
INSTALL_VLLM: false
41 changes: 41 additions & 0 deletions docker/docker-npu/Dockerfile
@@ -0,0 +1,41 @@
# Use the Ubuntu 22.04 image with CANN 8.0.rc1
# More versions can be found at https://hub.docker.com/r/cosdt/cann/tags
FROM cosdt/cann:8.0.rc1-910b-ubuntu22.04

ENV DEBIAN_FRONTEND=noninteractive

# Define installation arguments
ARG INSTALL_DEEPSPEED=false
ARG PIP_INDEX=https://pypi.org/simple

# Set the working directory
WORKDIR /app

# Install the requirements
COPY requirements.txt /app
RUN pip config set global.index-url $PIP_INDEX
RUN pip config set global.extra-index-url $PIP_INDEX
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements.txt

# Copy the rest of the application into the image
COPY . /app

# Install the LLaMA Factory
RUN EXTRA_PACKAGES="torch-npu,metrics"; \
if [ "$INSTALL_DEEPSPEED" = "true" ]; then \
EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
fi; \
pip install -e .[$EXTRA_PACKAGES] && \
pip uninstall -y transformer-engine flash-attn

# Set up volumes
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]

# Expose port 7860 for the LLaMA Board
ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860

# Expose port 8000 for the API service
ENV API_PORT 8000
EXPOSE 8000
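
Note that, unlike the CUDA image, this Dockerfile defines no `CMD`; the compose file below keeps the container alive with `command: bash`, so LLaMA Board has to be launched by hand, for example:

```bash
# Start LLaMA Board inside the already-running NPU container
docker exec -it llamafactory llamafactory-cli webui
```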
30 changes: 30 additions & 0 deletions docker/docker-npu/docker-compose.yml
@@ -0,0 +1,30 @@
services:
llamafactory:
build:
dockerfile: ./docker/docker-npu/Dockerfile
context: ../..
args:
INSTALL_DEEPSPEED: false
PIP_INDEX: https://pypi.org/simple
container_name: llamafactory
volumes:
- ./hf_cache:/root/.cache/huggingface/
- ./data:/app/data
- ./output:/app/output
- /usr/local/dcmi:/usr/local/dcmi
- /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
- /usr/local/Ascend/driver:/usr/local/Ascend/driver
- /etc/ascend_install.info:/etc/ascend_install.info
ports:
- "7860:7860"
- "8000:8000"
ipc: host
tty: true
stdin_open: true
command: bash
devices:
- /dev/davinci0
- /dev/davinci_manager
- /dev/devmm_svm
- /dev/hisi_hdc
restart: unless-stopped