
Add ROCm support #4970

Merged: 1 commit merged on Jul 26, 2024
Conversation

@HardAndHeavy (Contributor) commented Jul 25, 2024

Added a Docker build demonstrating how to run the project on AMD GPUs using ROCm.

Tested on AMD Radeon RX 7900 XTX.

@hiyouga hiyouga self-requested a review July 26, 2024 03:33
@hiyouga (Owner) left a comment:

LGTM!

@hiyouga hiyouga merged commit b8896b9 into hiyouga:main Jul 26, 2024
@hiyouga hiyouga added the solved This problem has been already solved label Jul 26, 2024
@fyr233 commented Sep 5, 2024

Hi, I'm trying this Docker Compose setup on an AMD Instinct MI100, with INSTALL_VLLM: true set in docker-compose.yml:

services:
  llamafactory:
    build:
      dockerfile: ./docker/docker-rocm/Dockerfile
      context: ../..
      args:
        INSTALL_BNB: true
        INSTALL_VLLM: true
        INSTALL_DEEPSPEED: true
        INSTALL_FLASHATTN: true

When installing vLLM, pip automatically pulls in NVIDIA-specific packages (CUDA wheels), which don't work on ROCm:

#10 33.33 Collecting transformers<=4.45.0,>=4.41.2 (from llamafactory==0.8.4.dev0)
#10 33.40   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/75/35/07c9879163b603f0e464b0f6e6e628a2340cfc7cdc5ca8e7d52d776710d4/transformers-4.44.2-py3-none-any.whl (9.5 MB)
#10 34.76      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.5/9.5 MB 7.1 MB/s eta 0:00:00
#10 34.95 Collecting openai>=1.0 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 35.01   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/5e/4d/affea11bd85ca69d9fdd15567495bb9088ac1c37498c95cb42d9ecd984ed/openai-1.43.0-py3-none-any.whl (365 kB)
#10 35.14 Collecting prometheus-client>=0.18.0 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 35.21   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c7/98/745b810d822103adca2df8decd4c0bbe839ba7ad3511af3f0d09692fc0f0/prometheus_client-0.20.0-py3-none-any.whl (54 kB)
#10 35.27 Collecting prometheus-fastapi-instrumentator>=7.0.0 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 35.30   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/59/66/2e93a8f56adb51ede41d0ef5f4f0277522acc4adc87937f5457b7b5692a8/prometheus_fastapi_instrumentator-7.0.0-py3-none-any.whl (19 kB)
#10 35.42 Collecting lm-format-enforcer==0.10.6 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 35.45   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4f/6e/d140b5eb41541afebea1c27013bc19b5a1cafd0cd330d9aa3458833ee44a/lm_format_enforcer-0.10.6-py3-none-any.whl (43 kB)
#10 35.61 Collecting outlines<0.1,>=0.0.43 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 35.68   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/fd/2c/1ce85b81b2f6835720db1fda9f2d4439a69458b119d2ad5fcf7cae573923/outlines-0.0.46-py3-none-any.whl (101 kB)
#10 36.17 Collecting typing-extensions~=4.0 (from gradio>=4.0.0->llamafactory==0.8.4.dev0)
#10 36.26   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/26/9f/ad63fc0248c5379346306f8668cda6e2e2e9c95e01216d2b8ffd9ff037d0/typing_extensions-4.12.2-py3-none-any.whl (37 kB)
#10 36.31 Collecting partial-json-parser (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 36.35   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9c/93/791cbea58b8dc27dc76621438d7b030741a4ad3bb5c222363dd01057175a/partial_json_parser-0.2.1.1.post4-py3-none-any.whl (9.9 kB)
#10 37.99 Collecting pyzmq (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 38.06   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ab/68/6fb6ae5551846ad5beca295b7bca32bf0a7ce19f135cb30e55fa2314e6b6/pyzmq-26.2.0-cp311-cp311-manylinux_2_28_x86_64.whl (869 kB)
#10 38.44      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 869.2/869.2 kB 2.4 MB/s eta 0:00:00
#10 38.66 Collecting msgspec (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 38.70   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6a/73/1b2f991dc26899d2f999c938cbc82c858b3cb7e3ccaad317b32760dbe1da/msgspec-0.18.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (209 kB)
#10 38.90 Collecting gguf==0.9.1 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 38.94   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/07/d6/3da29842de7ecd98978779fc07fa1547f8549fdf701cc0d9f84ce1ce0ad8/gguf-0.9.1-py3-none-any.whl (49 kB)
#10 39.10 Collecting importlib-metadata (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 39.13   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c0/14/362d31bf1076b21e1bcdcb0dc61944822ff263937b804a79231df2774d28/importlib_metadata-8.4.0-py3-none-any.whl (26 kB)
#10 39.19 Collecting mistral-common>=1.3.4 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 39.26   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/0c/e2/11d7223b4f898ff96f1bcf4ca29765e3e5ff9d9c049a2c640c59534450f1/mistral_common-1.3.4-py3-none-any.whl (3.3 MB)
#10 40.51      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 2.7 MB/s eta 0:00:00
#10 40.83 Collecting ray>=2.9 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 40.89   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/cd/6a/0416f5f81be11502325623c7adae844d0c518f17cefb6ad0881c4c8a995d/ray-2.35.0-cp311-cp311-manylinux2014_x86_64.whl (65.1 MB)
#10 49.17      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.1/65.1 MB 8.0 MB/s eta 0:00:00
#10 49.34 Collecting nvidia-ml-py (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 49.41   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b7/f3/a69ce0b1a1e12fbf6b2ad9f4c14c9999fdbdf15f2478d210f0fd501ddc98/nvidia_ml_py-12.560.30-py3-none-any.whl (40 kB)
#10 49.56 Collecting torch>=1.10.0 (from accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0)
#10 49.60   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/80/83/9b7681e41e59adb6c2b042f7e8eb716515665a6eed3dda4215c6b3385b90/torch-2.4.0-cp311-cp311-manylinux1_x86_64.whl (797.3 MB)
#10 98.02      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 797.3/797.3 MB 17.0 MB/s eta 0:00:00
#10 98.89 Collecting torchvision==0.19 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 98.92   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/42/1d/efde76f826682ebe6ec97c2874f3c7e4833eb84497c521ce6cfac406ef34/torchvision-0.19.0-cp311-cp311-manylinux1_x86_64.whl (7.0 MB)
#10 99.42      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 16.3 MB/s eta 0:00:00
#10 99.58 Collecting xformers==0.0.27.post2 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 99.62   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/2e/f1/f4f860c193c1c6d3456b176479097fd9c9890384c1b1f6a15afd9c3f1645/xformers-0.0.27.post2-cp311-cp311-manylinux2014_x86_64.whl (20.8 MB)
#10 101.1      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.8/20.8 MB 15.1 MB/s eta 0:00:00
#10 101.1 Collecting vllm-flash-attn==2.6.1 (from vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 101.2   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ae/07/a15428870bdda702d5df49743c5b48eddc6b83faa3171aee16d11a6c133f/vllm_flash_attn-2.6.1-cp311-cp311-manylinux1_x86_64.whl (75.9 MB)
#10 106.6      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.9/75.9 MB 14.2 MB/s eta 0:00:00
#10 106.7 Collecting interegular>=0.3.2 (from lm-format-enforcer==0.10.6->vllm>=0.4.3->llamafactory==0.8.4.dev0)
#10 106.7   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c4/01/72d6472f80651673716d1deda2a5bbb633e563ecf94f4479da5519d69d25/interegular-0.3.3-py37-none-any.whl (23 kB)
#10 106.7 Requirement already satisfied: sympy in /opt/miniconda/lib/python3.11/site-packages (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0) (1.12)
#10 106.7 Requirement already satisfied: networkx in /opt/miniconda/lib/python3.11/site-packages (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0) (3.2.1)
#10 106.8 Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0)
#10 106.8   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
#10 108.8      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 11.8 MB/s eta 0:00:00
#10 108.9 Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0)
#10 108.9   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
#10 109.1      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 21.0 MB/s eta 0:00:00
#10 109.1 Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0)
#10 109.2   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
#10 110.4      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 11.2 MB/s eta 0:00:00
#10 110.5 Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0)
#10 110.5   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9f/fd/713452cd72343f682b1c7b9321e23829f00b842ceaedcda96e742ea0b0b3/nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)
#10 146.9      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 20.0 MB/s eta 0:00:00
#10 147.8 Collecting nvidia-cublas-cu12==12.1.3.1 (from torch>=1.10.0->accelerate<=0.33.0,>=0.30.1->llamafactory==0.8.4.dev0)
#10 147.8   Downloading https://pypi.tuna.tsinghua.edu.cn/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
#10 CANCELED
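As an aside, the CUDA-only dependencies are easy to spot in a log like the one above by their wheel names. A minimal sketch (not from this thread; the name heuristics are an assumption based on the package names in the log) that filters a list of installed package names for NVIDIA CUDA wheels:

```python
from importlib.metadata import distributions


def cuda_wheels(names=None):
    """Return package names that look like NVIDIA CUDA wheels.

    With no argument, scans the current environment via importlib.metadata.
    The "nvidia-" / "-cu12" patterns are heuristics taken from the wheel
    names seen in the build log, not an official naming rule.
    """
    if names is None:
        names = (dist.metadata["Name"] for dist in distributions())
    return sorted({n for n in names
                   if n and (n.startswith("nvidia-") or n.endswith("-cu12"))})


# Package names taken from the build log above:
log_packages = ["transformers", "ray", "nvidia-ml-py",
                "nvidia-cuda-nvrtc-cu12", "nvidia-cudnn-cu12", "torch"]
print(cuda_wheels(log_packages))
# ['nvidia-cuda-nvrtc-cu12', 'nvidia-cudnn-cu12', 'nvidia-ml-py']
```

Run inside the built container with no argument, this would show whether any CUDA wheels actually made it into the image.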

What should I do?

@HardAndHeavy (Contributor, Author) commented:
I have not tried installing vLLM. I think it will require a driver update to run. I will try to create a base image for the new drivers within a month. The image currently runs on ROCm 6.1, but there is a newer version, 6.2.

@fyr233 commented Sep 9, 2024

> I have not tried installing vLLM. I think it will require a driver update to run. I will try to create a base image for the new drivers within a month. The image currently runs on ROCm 6.1, but there is a newer version, 6.2.

OK, thanks for the reply.

@HardAndHeavy (Contributor, Author) commented:
@fyr233 the ROCm drivers have been updated to version 6.2, and the vLLM installation error has now been fixed. Tested on AMD Radeon RX 7900 XTX.

You need to change the version of the base image.
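For illustration, the kind of change meant here is bumping the tag on the FROM line at the top of docker/docker-rocm/Dockerfile. The registry and image name below are placeholders, not the repository's actual ones; check the Dockerfile for the real base image:

```dockerfile
# Illustrative only: the real base image name and tags may differ.
# Old base image built against ROCm 6.1:
#   FROM some-registry/rocm-base:6.1
# Updated base image built against ROCm 6.2:
FROM some-registry/rocm-base:6.2
```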

Labels: solved (This problem has been already solved)

3 participants