Remove cuda graph batch size adjustment for dp attention #2484

Merged

ispobock merged 1 commit into sgl-project:main from ispobock:remove-cg-adjust

Dec 14, 2024

Collaborator

ispobock commented Dec 14, 2024

Motivation

CUDA graphs now take less CUDA memory with the Triton attention backend, so a CUDA graph batch size of 128 works fine for DP attention and the batch size adjustment is no longer needed.


remove cuda graph bs adjust

42ea9d0

ispobock requested review from merrymercy, Ying1123, hnyls2002, zhyncs and ByronHsu as code owners

December 14, 2024 15:04

ispobock enabled auto-merge (squash)

December 14, 2024 15:16

zhyncs approved these changes

ispobock merged commit 0ba2c58 into sgl-project:main

15 checks passed

Reviewers

zhyncs: approved these changes

merrymercy: awaiting requested review (code owner)

Ying1123: awaiting requested review (code owner)

hnyls2002: awaiting requested review (code owner)

ByronHsu: awaiting requested review (code owner)
