I'm running TensorRT-LLM on a V100. When I enabled FMHA with --enable_context_fmha, I got this error message:
[TensorRT-LLM][ERROR] Assertion failed: Unsupported architecture (/home/build/TensorRT_LLM/TensorRT-LLM-master/cpp/tensorrt_llm/kernels/contextFusedMultiHeadAttention/fmhaRunner.cpp:87)
I checked the code of FusedMHARunnerV2, and it seems SM70 and SM75 are not supported.
May I know why V100 is not supported for FMHA? Is support for it planned?
Thanks!
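For reference, here is a minimal sketch of the kind of architecture gate that trips this assertion, assuming the runner simply compares the device's SM version against a minimum. The function and variable names are illustrative, not TensorRT-LLM's actual API; only the SM70/SM75 exclusion and the error text come from the thread:

```python
# Illustrative sketch of the SM gate behind the assertion above
# (names here are hypothetical, not TensorRT-LLM's internals).
import torch

def context_fmha_supported() -> bool:
    major, minor = torch.cuda.get_device_capability()
    sm = major * 10 + minor
    # The fused context MHA kernels require SM80 (Ampere) or newer;
    # SM70 (V100) and SM75 (T4) fall through to the assertion.
    return sm >= 80

if not context_fmha_supported():
    raise RuntimeError("Unsupported architecture")  # analogous to fmhaRunner.cpp:87
```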
Hi @jiangsongHW! Thanks for the request! Some of the techniques in our fMHA aren't supported below SM80, so we would need a different kernel for V100. There are some complexities in porting the kernel to V100, particularly around FP32 accumulation to preserve accuracy. We may implement a custom pre-SM80 fMHA in the future, but it's unlikely in the near term.
We know HW access is an issue with A100 & H100; the good news is that once you do get access, the perf/$ will be better than on V100!
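Until then, a hedged workaround sketch: only pass --enable_context_fmha when the GPU is SM80 or newer, so V100 (SM70) builds fall back to the unfused attention path. Only the flag name comes from this thread; the flag-assembly code is illustrative:

```python
# Append --enable_context_fmha only on SM80+ GPUs; on older parts
# (e.g., V100 at SM70) omit it and use the unfused path instead.
import torch

def build_flags() -> list:
    flags = []
    major, minor = torch.cuda.get_device_capability()
    if major * 10 + minor >= 80:
        flags.append("--enable_context_fmha")
    return flags

print(build_flags())  # [] on V100, ['--enable_context_fmha'] on A100/H100
```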