-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use CUDA 12.4 as default for release and nightly wheels #12098
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: mgoin <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree! there are several pieces we pick up going from 12.1 -> 12.4 (Lovelace fp8 kernels, 2:4 sparse kernels, some cuda graph stuff)
+1 to this as well |
@@ -37,7 +49,7 @@ steps: | |||
queue: cpu_queue_postmerge | |||
commands: | |||
- "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7" | |||
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.1.0 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT --target vllm-openai --progress plain ." | |||
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.4.0 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT --target vllm-openai --progress plain ." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both "Build wheel - CUDA 12.4" and "Build wheel - CUDA 12.1" build with cuda 12.4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ this should stay 12.1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this the release image? I have not changed the "Build wheel - CUDA 12.1" case. Why shouldn't this also be 12.4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please note that, in .buildkite/upload-wheels.sh
, we do not upload cuda 11.8 wheels, but will upload cuda 12 wheels.
if you build for both 12.1 and 12.4, make sure only one version is uploaded to avoid upload race condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Triggered a test run here: https://buildkite.com/vllm/release/builds/2643
and maybe it's time we stop building cu118 wheels?
@@ -37,7 +49,7 @@ steps: | |||
queue: cpu_queue_postmerge | |||
commands: | |||
- "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7" | |||
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.1.0 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT --target vllm-openai --progress plain ." | |||
- "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.4.0 --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT --target vllm-openai --progress plain ." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ this should stay 12.1.0
I thought that wheels built that don't match the Line 49 in 54cacf0
So I think would prevent wheel collision. It would be nice to have explicit wheel versions like this to make it more transparent what is available |
@mgoin the upload wheel only checks if cu118 is present 🤕 vllm/.buildkite/upload-wheels.sh Line 50 in b5b57e3
|
Looks like PyTorch has 11.8, 12.1, and 12.4. So to confirm, we will publish 12.4 to PyPI, still build for 12.1 and 11.8 for nightly, then push them to artifacts? |
Signed-off-by: mgoin <[email protected]>
@khluu could you help give this a test again? |
@@ -1,3 +1,15 @@ | |||
steps: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there should only be 1 steps:
can we move this whole step to under the steps:
key at line 13
I think it is time to match pytorch's default cuda version. We should still keep wheels built with 12.1 around.