[Bug] Accuracy is abnormal when EP MoE is enabled #2482

ispobock · 2024-12-14T14:13:15Z

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
5. Please use English, otherwise it will be closed.

Describe the bug

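Accuracy on the gsm8k dataset drops sharply when EP MoE is enabled (`--enable-ep-moe`): 0.540 with EP versus 0.930 with plain TP, using the same model and benchmark settings.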
cc: @xiaobochen123

Reproduction

EP:

python3 -m sglang.launch_server --model-path neuralmagic/DeepSeek-Coder-V2-Instruct-FP8 --disable-radix-cache --trust-remote-code --tp 8 --enable-ep-moe --disable-cuda-graph
python3 benchmark/gsm8k/bench_sglang.py --num-questions 1400 --parallel 1400

Accuracy: 0.540
Invalid: 0.005
Latency: 205.758 s
Output throughput: 1017.681 token/s

TP:

python3 -m sglang.launch_server --model-path neuralmagic/DeepSeek-Coder-V2-Instruct-FP8 --disable-radix-cache --trust-remote-code --tp 8 --disable-cuda-graph
python3 benchmark/gsm8k/bench_sglang.py --num-questions 1400 --parallel 1400

Accuracy: 0.930
Invalid: 0.000
Latency: 196.344 s
Output throughput: 1011.191 token/s
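
For anyone comparing the two runs programmatically, here is a minimal sketch that parses the benchmark summaries shown above and reports the accuracy gap. The helper names (`parse_metrics`, `accuracy_drop`) are hypothetical and not part of sglang; the sketch assumes the benchmark prints `Name: value` lines exactly as pasted in this issue.

```python
import re

# Benchmark summaries copied verbatim from the two runs in this issue.
EP_OUTPUT = """\
Accuracy: 0.540
Invalid: 0.005
Latency: 205.758 s
Output throughput: 1017.681 token/s
"""

TP_OUTPUT = """\
Accuracy: 0.930
Invalid: 0.000
Latency: 196.344 s
Output throughput: 1011.191 token/s
"""


def parse_metrics(text):
    """Hypothetical helper: turn 'Name: 1.23 unit' lines into {name: float}."""
    metrics = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s*([0-9.]+)", line)
        if match:
            metrics[match.group(1).strip()] = float(match.group(2))
    return metrics


def accuracy_drop(baseline, candidate):
    """Hypothetical helper: absolute accuracy lost relative to the baseline."""
    return baseline["Accuracy"] - candidate["Accuracy"]


ep = parse_metrics(EP_OUTPUT)
tp = parse_metrics(TP_OUTPUT)
drop = accuracy_drop(tp, ep)
print(f"Accuracy drop with EP MoE: {drop:.3f}")
```

Latency and throughput are nearly identical between the two runs, so the comparison isolates accuracy as the only metric that regresses.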

Environment

sglang: main branch (0.4.0.post1)
torch: 2.5.1
triton: 3.1.0


xiaobochen123 mentioned this issue Dec 16, 2024: fix moe-ep accuracy issue for fp8 #2489 (Merged)

ispobock closed this as completed Dec 16, 2024
