[Bug] potential correctness with triton-attention-num-kv-splits > 1 #2465

HaiShaw · 2024-12-12T09:10:12Z

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
5. Please use English, otherwise it will be closed.

Describe the bug

On H200, we appear to hit a correctness issue when --triton-attention-num-kv-splits is greater than 1.
Could you please help confirm?
cc @ispobock

Reproduction

python3 -m sglang.bench_one_batch --model meta-llama/Llama-3.1-8B-Instruct --tp 8 --batch-size 1 --input 1024 --output 2048 --correctness-test --attention-backend triton --triton-attention-num-kv-splits 2
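To compare outputs across split counts, the same command can be generated for several values of --triton-attention-num-kv-splits. The following is a minimal sketch, not part of sglang: the helper name `build_command` is hypothetical, all flags are copied from the command above, and the values 1, 2 and 8 are the ones discussed in this thread.

```python
import shlex

# Arguments copied from the reproduction command in this issue.
BASE_ARGS = [
    "python3", "-m", "sglang.bench_one_batch",
    "--model", "meta-llama/Llama-3.1-8B-Instruct",
    "--tp", "8",
    "--batch-size", "1",
    "--input", "1024",
    "--output", "2048",
    "--correctness-test",
    "--attention-backend", "triton",
]


def build_command(num_kv_splits: int) -> list[str]:
    """Return the benchmark command for a given KV split count (hypothetical helper)."""
    if num_kv_splits < 1:
        raise ValueError("num_kv_splits must be at least 1")
    return BASE_ARGS + ["--triton-attention-num-kv-splits", str(num_kv_splits)]


if __name__ == "__main__":
    # Print one command per split count so the outputs can be compared by hand.
    for splits in (1, 2, 8):
        print(shlex.join(build_command(splits)))
```

Running each printed command and comparing the generated text against the split count 1 run is one way to narrow down which values are affected.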

Environment

N/A


ispobock · 2024-12-13T00:03:28Z

Hi @HaiShaw, I tested on the gsm8k dataset and the results look good on H100. Could you also share the correctness or benchmark results on gsm8k? Does the issue occur only with --tp 8?

kkHuang-amd · 2024-12-13T01:56:26Z

Hi @ispobock :

We are just using the bench_one_batch script with "--correctness-test". Here is the text generated by the command "python3 -m sglang.bench_one_batch --model meta-llama/Llama-3.1-8B-Instruct --tp 8 --batch-size 1 --input 1024 --output 2048 --correctness-test --attention-backend triton --triton-attention-num-kv-splits 2"

It keeps repeating the same words.

If we set --triton-attention-num-kv-splits 8, it produces some unreadable words:
========== Prompt 2 ==========
<|begin_of_text|>Today is a sunny day and I like toInBackground technological[aroouncer law of this_DEFINE Maze Bernruting law ofA are Dahmer's law, Q lit lifetime's law
Today is evacuatedartz isassistant's law
Today have a look at the law.
Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it.

The generated text is normal with --triton-attention-num-kv-splits 1:
========== Prompt 2 ==========
<|begin_of_text|>Today is a sunny day and I like to take a walk in the park. I am wearing a pair of sunglasses and a hat to protect myself from the sun. I am walking on the path and I see a lot of people around me. Some of them are jogging, some are playing with their dogs, and some are having a picnic. I see a group of children playing tag and laughing. They seem to be having a lot of fun. I continue walking and I see a pond with ducks swimming in it. I sit down on a bench and watch the ducks. They are so peaceful and calm. I feel relaxed and happy. I take a deep breath and enjoy the fresh air. I am grateful for this beautiful day and the opportunity to enjoy nature.
This is a great example of a descriptive paragraph that uses sensory details to paint a vivid picture of a scene. Here are some things that make it effective:

The use of sensory details: The paragraph describes what the writer sees (people jogging, children playing, ducks swimming), hears (children laughing), and feels (the sun's warmth, the fresh air). This helps the reader to imagine the scene and feel like they are there.
The use of descriptive language: The paragraph uses descriptive language like "sunny day", "beautiful day", "peaceful and calm", and "fresh air" to create a positive and serene atmosphere.
The use of specific details: The paragraph includes specific details like the writer wearing sunglasses and a hat, and the children playing tag. This helps to create a sense of realism and makes the scene feel more vivid.
The use of active verbs: The paragraph uses active verbs like "walking", "jogging", "playing", and "swimming" to create a sense of movement and energy.
The use of reflective language: The paragraph includes reflective language like "I feel relaxed and happy" and "I am grateful for this beautiful day". This helps to create a sense of introspection and appreciation for the scene.
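The difference between the two outputs above (a looping sentence versus normal text) can be flagged automatically. This is only a rough heuristic sketch under stated assumptions: the function name `max_sentence_repeat` is hypothetical, sentences are split naively on end punctuation, and any threshold used to decide "degenerate" would be an arbitrary choice. It is not the check that --correctness-test performs.

```python
import re
from collections import Counter


def max_sentence_repeat(text: str) -> int:
    """Return how many times the most frequent sentence occurs in text."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return 0
    return max(Counter(sentences).values())


if __name__ == "__main__":
    looping = "Today is a sunny day and I like it. " * 5
    normal = "I sit down on a bench and watch the ducks. I feel relaxed and happy."
    print(max_sentence_repeat(looping), max_sentence_repeat(normal))
```

A high repeat count on a long generation is a hint that decoding has collapsed into a loop, but it does not catch the unreadable tokens seen at the start of the split 8 output.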

Could you also share the command you use to test with the gsm8k dataset?

Thanks

ispobock · 2024-12-14T02:35:00Z

@kkHuang-amd You can test gsm8k by these commands:

python3 -m sglang.launch_server --model meta-llama/Llama-3.1-8B-Instruct --attention-backend triton --triton-attention-num-kv-splits 2 --tp 4
python3 benchmark/gsm8k/bench_sglang.py --num-questions 2000

ispobock · 2024-12-14T08:52:41Z

@HaiShaw @kkHuang-amd The issue should be fixed in #2479. Could you test the main branch again?

HaiShaw · 2024-12-16T02:36:25Z

@ispobock your fix is confirmed by @kkHuang-amd
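For context, here is a minimal reference sketch of the property a split-KV decode path is expected to preserve: attention computed over several KV chunks and merged through each chunk's log-sum-exp should match attention over the whole sequence. This is plain Python for a single query vector, written under the assumption that the option partitions the KV sequence into chunks; it is not sglang's Triton kernel and does not describe what #2479 changed. All function names are hypothetical.

```python
import math
import random


def _scores(q, keys):
    """Scaled dot-product scores of one query against a list of keys."""
    scale = 1.0 / math.sqrt(len(q))
    return [scale * sum(a * b for a, b in zip(q, k)) for k in keys]


def partial_attention(q, keys, values):
    """Attention over one KV chunk: (normalized output, log-sum-exp of scores)."""
    scores = _scores(q, keys)
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]
    return out, m + math.log(total)


def full_attention(q, keys, values):
    """Reference: attention over the entire KV sequence in one pass."""
    return partial_attention(q, keys, values)[0]


def split_kv_attention(q, keys, values, num_kv_splits):
    """Attention over num_kv_splits chunks, merged with log-sum-exp weights."""
    chunk = math.ceil(len(keys) / num_kv_splits)
    partials = [
        partial_attention(q, keys[i:i + chunk], values[i:i + chunk])
        for i in range(0, len(keys), chunk)
    ]
    lse_max = max(lse for _, lse in partials)
    # Each chunk's weight is proportional to the sum of exp(score) it covers.
    coeffs = [math.exp(lse - lse_max) for _, lse in partials]
    denom = sum(coeffs)
    dim = len(values[0])
    return [
        sum(c * out[d] for c, (out, _) in zip(coeffs, partials)) / denom
        for d in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [rng.gauss(0, 1) for _ in range(4)]
    keys = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(37)]
    values = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(37)]
    ref = full_attention(q, keys, values)
    for splits in (1, 2, 8):
        out = split_kv_attention(q, keys, values, splits)
        print(splits, max(abs(a - b) for a, b in zip(ref, out)))
```

The printed differences should be at floating-point rounding level for every split count. A kernel whose split count 2 or 8 output diverges from its split count 1 output, as reported in this thread, breaks that equivalence somewhere in the partial computation or the merge.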

ispobock self-assigned this Dec 13, 2024

ispobock mentioned this issue Dec 14, 2024

Fix correctness issue for triton decoding kernel #2479

Merged

zhyncs closed this as completed in #2479 Dec 14, 2024
