Why FMHA is not supported in V100 and T4 #320

Closed
jiangsongHW opened this issue Nov 8, 2023 · 2 comments
Labels: feature request (New feature or request); triaged (Issue has been triaged by maintainers)

Comments


jiangsongHW commented Nov 8, 2023

I'm running TensorRT-LLM on a V100. When I enabled FMHA with --enable_context_fmha,
I got this error message:
[TensorRT-LLM][ERROR] Assertion failed: Unsupported architecture (/home/build/TensorRT_LLM/TensorRT-LLM-master/cpp/tensorrt_llm/kernels/contextFusedMultiHeadAttention/fmhaRunner.cpp:87)

I checked the code of FusedMHARunnerV2, and it seems sm70 and sm75 are not supported.

May I know why V100 is not supported for FMHA? Is there any plan to support it?

Thanks!

@juney-nvidia juney-nvidia added triaged Issue has been triaged by maintainers feature request New feature or request labels Nov 8, 2023
@ncomly-nvidia ncomly-nvidia removed the triaged Issue has been triaged by maintainers label Nov 14, 2023
@ncomly-nvidia ncomly-nvidia added the triaged Issue has been triaged by maintainers label Nov 27, 2023
@ncomly-nvidia (Collaborator)

Hi @jiangsongHW! Thanks for the request! Some of the techniques in our fMHA aren't supported below sm80, so we would need a different kernel for V100. There are some complexities in porting the kernel to V100, particularly with FP32 accumulation to preserve accuracy. We may implement a custom <sm80 fMHA in the future, but that is unlikely in the near term.

We know HW access is an issue with A100 & H100. The good news is that when you do get access, the perf / $ will be better than on V100!
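
For anyone scripting builds across mixed GPUs, here is a minimal sketch of guarding the flag based on the reply above. It is not TensorRT-LLM code: the helper names and the `MIN_FMHA_SM` constant are hypothetical. The sm80 threshold is taken only from this thread, so treating every sm80+ architecture as supported is an assumption. The `(major, minor)` tuple can come from `torch.cuda.get_device_capability()` if PyTorch is installed.

```python
# Hypothetical helpers for illustration; none of these names exist in TensorRT-LLM.

# Assumption from this thread: the fused MHA kernels need sm80 or newer.
MIN_FMHA_SM = 80


def sm_version(capability):
    """Turn a (major, minor) compute capability into an sm number, e.g. (7, 5) -> 75."""
    major, minor = capability
    return major * 10 + minor


def supports_context_fmha(capability):
    """Return True if the GPU meets the assumed sm80 minimum."""
    return sm_version(capability) >= MIN_FMHA_SM


def fmha_build_flags(capability):
    """Return the context-FMHA flag only when the GPU is assumed to support it."""
    if supports_context_fmha(capability):
        return ["--enable_context_fmha"]
    return []


if __name__ == "__main__":
    # V100 is sm70, T4 is sm75, A100 is sm80, H100 is sm90.
    for name, cap in [("V100", (7, 0)), ("T4", (7, 5)), ("A100", (8, 0)), ("H100", (9, 0))]:
        print(name, fmha_build_flags(cap))
```

With this guard, V100 and T4 builds simply omit the flag instead of hitting the "Unsupported architecture" assertion at runtime.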

@nv-guomingz (Collaborator)

Hi @jiangsongHW, please feel free to reopen this ticket if needed.


5 participants