
Add more support for Intel Gaudi accelerators #2357

Merged
merged 5 commits into sgl-project:main on Dec 6, 2024

Conversation

YangQun1 (Contributor) commented on Dec 5, 2024

Motivation

We already have initial support for Intel Gaudi accelerators in sglang.
This PR adds more support (an HPU memory capacity getter and some minor changes that go through the generic torch device module APIs) so that the offline_batch_inference.py example runs end-to-end on one or two Gaudi2 cards.

# 1 Gaudi2 card
python examples/runtime/engine/offline_batch_inference.py --device hpu --model-path meta-llama/Meta-Llama-3.1-8B-Instruct

# 2 Gaudi2 cards (tensor parallel)
python examples/runtime/engine/offline_batch_inference.py --device hpu --tp-size=2 --model-path meta-llama/Meta-Llama-3.1-8B-Instruct
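
For readers unfamiliar with the device-generic pattern mentioned above, here is a minimal sketch of how a memory capacity getter can avoid hard-coding torch.cuda: the device module is looked up by name and queried through a shared API. This is an illustration, not the PR's actual diff; the function name get_device_memory_capacity is made up, and torch.hpu.mem_get_info is assumed to mirror torch.cuda.mem_get_info once habana_frameworks.torch is imported.

```python
# Illustrative sketch only (not this PR's code): resolve a torch device module
# by name and query memory through it, so one call site serves "cuda" and "hpu".
import torch


def get_device_memory_capacity(device_type: str = "hpu", device_id: int = 0) -> float:
    """Return total device memory in GiB for the given device type.

    Assumption: the device module (torch.cuda, or torch.hpu after importing
    habana_frameworks.torch) exposes a CUDA-like mem_get_info(); for HPU this
    is an assumption, not a documented guarantee.
    """
    device_module = getattr(torch, device_type)  # e.g. torch.cuda or torch.hpu
    free_bytes, total_bytes = device_module.mem_get_info(device_id)
    return total_bytes / (1 << 30)


if __name__ == "__main__":
    if torch.cuda.is_available():
        print(f"CUDA total memory: {get_device_memory_capacity('cuda'):.1f} GiB")
```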

Modifications

  • Add an HPU memory capacity getter.
  • Replace several device-specific calls with the generic torch device module APIs so the same code paths run on HPU.

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

YangQun1 marked this pull request as ready for review on December 5, 2024 at 06:57
YangQun1 (Contributor, Author) commented on Dec 5, 2024

Hi @merrymercy, could you help review this?

merrymercy (Contributor) left a comment

This looks good!

merrymercy enabled auto-merge (squash) on December 6, 2024 at 09:15
merrymercy disabled auto-merge on December 6, 2024 at 09:15
merrymercy merged commit 37ee906 into sgl-project:main on Dec 6, 2024
0 of 14 checks passed