[Bug] potential correctness with triton-attention-num-kv-splits > 1 #2465

Closed
5 tasks done
HaiShaw opened this issue Dec 12, 2024 · 5 comments · Fixed by #2479

@HaiShaw
Collaborator

HaiShaw commented Dec 12, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

On H200, we appear to hit a correctness issue when --triton-attention-num-kv-splits is greater than 1.
Could you please help confirm?
cc @ispobock

Reproduction

python3 -m sglang.bench_one_batch --model meta-llama/Llama-3.1-8B-Instruct --tp 8 --batch-size 1 --input 1024 --output 2048 --correctness-test --attention-backend triton --triton-attention-num-kv-splits 2
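
For context on why the output with more than one KV split is expected to match the single-split output: split-KV decode attention is commonly implemented by attending to each chunk of the KV sequence separately and then merging the partial results using each chunk's log-sum-exp. The sketch below illustrates that general technique in pure Python. It is not SGLang's Triton kernel, and the function names (attention_full, attention_split_kv) are hypothetical names chosen for illustration.

```python
import math
import random


def _scores(q, keys):
    # Scaled dot-product scores of one query against a block of keys.
    scale = 1.0 / math.sqrt(len(q))
    return [scale * sum(a * b for a, b in zip(q, k)) for k in keys]


def _softmax_weighted_sum(scores, values):
    # Returns (locally normalized output, log-sum-exp) for one block.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    denom = sum(weights)
    dim_v = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) / denom
           for j in range(dim_v)]
    return out, m + math.log(denom)


def attention_full(q, keys, values):
    # Reference: softmax attention over the whole KV sequence at once.
    out, _ = _softmax_weighted_sum(_scores(q, keys), values)
    return out


def attention_split_kv(q, keys, values, num_kv_splits):
    # Attend to each KV chunk independently, then merge the partial outputs.
    n = len(keys)
    chunk = math.ceil(n / num_kv_splits)
    partials = [
        _softmax_weighted_sum(_scores(q, keys[s:s + chunk]), values[s:s + chunk])
        for s in range(0, n, chunk)
    ]
    # Rescale each chunk by exp(lse_chunk - lse_max) to undo its local
    # normalization; subtracting lse_max keeps the exponentials stable.
    lse_max = max(lse for _, lse in partials)
    merge_w = [math.exp(lse - lse_max) for _, lse in partials]
    total = sum(merge_w)
    dim_v = len(values[0])
    return [
        sum(w * out[j] for w, (out, _) in zip(merge_w, partials)) / total
        for j in range(dim_v)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, seq_len = 16, 1024
    q = [rng.gauss(0, 1) for _ in range(dim)]
    keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
    values = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
    ref = attention_full(q, keys, values)
    for splits in (1, 2, 8):
        got = attention_split_kv(q, keys, values, splits)
        print(splits, max(abs(a - b) for a, b in zip(ref, got)))
```

Under these assumptions the split result should differ from the reference only by floating-point rounding for any split count, so output that changes with the number of splits points to a problem in the split or merge step rather than to expected numerical variation.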

Environment

N/A

@ispobock
Collaborator

Hi @HaiShaw, I tested on the gsm8k dataset and the results look fine on H100. Could you also share the correctness results or benchmark results on gsm8k? Does the issue occur only with --tp 8?

@ispobock ispobock self-assigned this Dec 13, 2024
@kkHuang-amd
Contributor

kkHuang-amd commented Dec 13, 2024

Hi @ispobock,

We only used the bench_one_batch script with "--correctness-test". Here is the text generated by the command "python3 -m sglang.bench_one_batch --model meta-llama/Llama-3.1-8B-Instruct --tp 8 --batch-size 1 --input 1024 --output 2048 --correctness-test --attention-backend triton --triton-attention-num-kv-splits 2".

The output keeps repeating the same sentences:

I think we've wrapped up our conversation nicely. It was great chatting with you about your day in the park. If you want to chat again sometime, feel free to start a new conversation anytime. Have a great day!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I think we've said all we need to say for now. It was great chatting with you about your lovely day in the park. I hope you have a wonderful day and enjoy the rest of your time in the sunshine!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I think we've reached the end of our conversation. It was great chatting with you about your day in the park. I hope you have a wonderful day and enjoy the rest of your time in the sunshine!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I think we've said all we need to say for now. It was great chatting with you about your lovely day in the park. I hope you have a wonderful day and enjoy the rest of your time in the sunshine!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I think we've reached the end of our conversation. It was great chatting with you about your day in the park. I hope you have a wonderful day and enjoy the rest of your time in the sunshine!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

If we set --triton-attention-num-kv-splits 8, the output contains unreadable text:
========== Prompt 2 ==========
<|begin_of_text|>Today is a sunny day and I like toInBackground technological[aroouncer law of this_DEFINE Maze Bernruting law ofA are Dahmer's law, Q lit lifetime's law
Today is evacuatedartz isassistant's law
Today have a look at the law.
Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it. Today is a sunny day and I like it.

The generated text is normal with --triton-attention-num-kv-splits 1:
========== Prompt 2 ==========
<|begin_of_text|>Today is a sunny day and I like to take a walk in the park. I am wearing a pair of sunglasses and a hat to protect myself from the sun. I am walking on the path and I see a lot of people around me. Some of them are jogging, some are playing with their dogs, and some are having a picnic. I see a group of children playing tag and laughing. They seem to be having a lot of fun. I continue walking and I see a pond with ducks swimming in it. I sit down on a bench and watch the ducks. They are so peaceful and calm. I feel relaxed and happy. I take a deep breath and enjoy the fresh air. I am grateful for this beautiful day and the opportunity to enjoy nature.
This is a great example of a descriptive paragraph that uses sensory details to paint a vivid picture of a scene. Here are some things that make it effective:

  • The use of sensory details: The paragraph describes what the writer sees (people jogging, children playing, ducks swimming), hears (children laughing), and feels (the sun's warmth, the fresh air). This helps the reader to imagine the scene and feel like they are there.
  • The use of descriptive language: The paragraph uses descriptive language like "sunny day", "beautiful day", "peaceful and calm", and "fresh air" to create a positive and serene atmosphere.
  • The use of specific details: The paragraph includes specific details like the writer wearing sunglasses and a hat, and the children playing tag. This helps to create a sense of realism and makes the scene feel more vivid.
  • The use of active verbs: The paragraph uses active verbs like "walking", "jogging", "playing", and "swimming" to create a sense of movement and energy.
  • The use of reflective language: The paragraph includes reflective language like "I feel relaxed and happy" and "I am grateful for this beautiful day". This helps to create a sense of introspection and appreciation for the scene.

Could you also share the command you used to test on the gsm8k dataset?

Thanks

@ispobock
Collaborator

@kkHuang-amd You can test gsm8k by these commands:

python3 -m sglang.launch_server --model meta-llama/Llama-3.1-8B-Instruct --attention-backend triton --triton-attention-num-kv-splits 2 --tp 4
python3 benchmark/gsm8k/bench_sglang.py --num-questions 2000

@ispobock
Collaborator

@HaiShaw @kkHuang-amd The issue should be fixed in #2479. Could you test the main branch again?

@HaiShaw
Collaborator Author

HaiShaw commented Dec 16, 2024

@ispobock your fix is confirmed by @kkHuang-amd
