
fix moe-ep accuracy issue for fp8 #2489

Merged · 1 commit merged into sgl-project:main on Dec 16, 2024

Conversation

xiaobochen123 (Contributor)

Motivation

Fix a bug in MoE expert parallelism (EP) when loading FP8 models.

Test model: neuralmagic/DeepSeek-Coder-V2-Instruct-FP8

  • Accuracy: 0.932
  • Invalid: 0.000
  • Latency: 243.824 s
  • Output throughput: 1027.530 token/s

cc: @ispobock
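For context, a minimal sketch of the kind of issue this class of fix addresses (this is an illustration under assumed names, not the actual patch from this PR): under expert parallelism, each rank owns only a slice of the experts, so both the FP8 expert weights and their quantization scales from the checkpoint must be remapped from global to local expert indices; indexing by the global id silently corrupts dequantization and shows up as an accuracy drop rather than a crash.

```python
# Illustrative sketch only; function and parameter names are hypothetical.
import torch

def load_expert_weight(
    param: torch.Tensor,          # local shard, shape [num_local_experts, ...]
    loaded_weight: torch.Tensor,  # checkpoint tensor for one expert (weight or FP8 scale)
    expert_id: int,               # global expert id from the checkpoint
    ep_rank: int,
    num_local_experts: int,
) -> None:
    """Copy a single expert's weight (or FP8 scale) into this rank's local shard."""
    first = ep_rank * num_local_experts
    last = first + num_local_experts
    if not (first <= expert_id < last):
        return  # this rank does not own the expert; skip it
    local_id = expert_id - first  # remap global -> local index
    param.data[local_id].copy_(loaded_weight)
```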

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@zhyncs changed the title from "fix moe-ep bug" to "fix moe-ep accuracy issue for fp8" on Dec 16, 2024
@zhyncs merged commit b532a5f into sgl-project:main on Dec 16, 2024
15 checks passed