Segmentation fault when running tritonbench flash attention with --causal #18

Open
yjk21 opened this issue Dec 23, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@yjk21

yjk21 commented Dec 23, 2024

Describe the bug

I'm running the benchmarking command from the ws branch, with the --causal flag added, i.e.:

TORCH_CUDA_ARCH_LIST=9.0a cuda-gdb --args python run.py --op flash_attention --only triton_tutorial_flash_v2_ws,triton_tutorial_flash_v2_tma_ws,triton_tutorial_flash_v2 --num-inputs 1 --seq-len 4096 --metrics tflops --batch 8 --n-heads 16 --d-head 128 --causal

I'm seeing a segfault here:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007d54b9f6b32e in mlir::detail::IROperandBase::insertInto<mlir::IRObjectWithUseList<mlir::OpOperand> > (useList=0x5b34c5d18750, this=0x5b34c5cef190) at /root/.triton/llvm/llvm-b5cc222d-ubuntu-x64/include/mlir/IR/UseDefLists.h:101
101           nextUse->back = &nextUse;

Without the flag it works as intended (WAI).
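
For reference, a minimal PyTorch sketch of what the causal variant computes (illustrative only, not the tritonbench/Triton tutorial kernel; shapes assumed from the command above):

```python
import torch

# Illustrative only: --causal means query position i may only attend to
# key positions j <= i (lower-triangular mask). Shapes assumed from the
# command above: batch=8, n_heads=16, seq_len=4096, d_head=128.
def causal_attention_reference(q, k, v):
    d_head = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d_head ** 0.5        # (B, H, S, S)
    seq_len = scores.shape[-1]
    mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool, device=q.device))
    scores = scores.masked_fill(~mask, float("-inf"))        # mask out future positions
    return torch.softmax(scores, dim=-1) @ v
```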

Environment details

Tritonbench at 3a5dccb159834968567a2e45e561dc1aeaa8f8a8
Meta triton at 67f51cc1420cabeb6bf4d28c1813e38ea9a92e20

yjk21 added the bug label on Dec 23, 2024
@htyu
Contributor

htyu commented Jan 7, 2025

Thanks for reporting the issue. We will take a look.
