
Fix gptq for moe layers #2300

Merged: 5 commits from pr-fix-gptq-moe into main on Dec 3, 2024

Conversation

merrymercy (Contributor) commented Dec 1, 2024

Try to fix #2117 and #2270.

We can run `python3 -m sglang.launch_server --model TheBloke/Mixtral-8x7B-v0.1-GPTQ` with vllm's fused MoE layer, but not with sglang's fused MoE layer. We should probably add this model to the nightly eval.

Test cases

python3 -m sglang.launch_server --model TheBloke/Mixtral-8x7B-v0.1-GPTQ
python3 -m sglang.launch_server --model casperhansen/deepseek-coder-v2-instruct-awq --trust-remote-code --tp 2
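A quick way to confirm the first test case end to end is to send a request to the server's OpenAI-compatible completions endpoint once it is up. This is a minimal sketch, assuming the server runs on the default port 30000 (adjust the URL if `--port` differs); the prompt and generation parameters are arbitrary and not part of this PR.

```python
# Minimal smoke check for the first test case above.
# Assumes the server was launched with:
#   python3 -m sglang.launch_server --model TheBloke/Mixtral-8x7B-v0.1-GPTQ
# and is serving the OpenAI-compatible API on the default port 30000.
import requests

resp = requests.post(
    "http://localhost:30000/v1/completions",
    json={
        "model": "TheBloke/Mixtral-8x7B-v0.1-GPTQ",
        "prompt": "The capital of France is",
        "max_tokens": 16,
        "temperature": 0,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```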

zhyncs (Member) commented Dec 2, 2024

also cc @HandH1998 @ispobock

zhyncs (Member) commented Dec 3, 2024

Updated test cases:

python3 -m sglang.launch_server --model TheBloke/Mixtral-8x7B-v0.1-GPTQ
python3 -m sglang.launch_server --model casperhansen/deepseek-coder-v2-instruct-awq --trust-remote-code --tp 2 --disable-mla
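Following the earlier suggestion to add the GPTQ Mixtral model to the nightly eval, below is a sketch of what such a smoke test could look like, built only around the first launch command from this PR. It uses the standard library plus `requests`; the default port 30000, the `/health` endpoint, the timeouts, and the `wait_until_ready` helper are assumptions for illustration, not sglang test-suite utilities.

```python
# Sketch of a nightly-eval style smoke test for the GPTQ Mixtral test case.
# The port, /health endpoint, timeouts, and wait_until_ready helper are
# assumptions for illustration; they are not part of this PR.
import subprocess
import time

import requests

BASE_URL = "http://localhost:30000"


def wait_until_ready(base_url: str, timeout_s: int = 1800) -> None:
    """Poll the health endpoint until the server is ready or time runs out."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            if requests.get(f"{base_url}/health", timeout=5).status_code == 200:
                return
        except requests.RequestException:
            pass
        time.sleep(10)
    raise TimeoutError("server did not become ready in time")


# Launch the server with the same command as the test case above.
proc = subprocess.Popen(
    [
        "python3", "-m", "sglang.launch_server",
        "--model", "TheBloke/Mixtral-8x7B-v0.1-GPTQ",
    ]
)
try:
    wait_until_ready(BASE_URL)
    resp = requests.post(
        f"{BASE_URL}/v1/completions",
        json={
            "model": "TheBloke/Mixtral-8x7B-v0.1-GPTQ",
            "prompt": "1 + 1 =",
            "max_tokens": 4,
        },
        timeout=60,
    )
    assert resp.status_code == 200, resp.text
finally:
    proc.terminate()
```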

zhyncs (Member) commented Dec 3, 2024

Follow-up PRs (possibly after v0.4)

zhyncs merged commit 1228f7c into main on Dec 3, 2024 (17 of 18 checks passed).
zhyncs deleted the pr-fix-gptq-moe branch on Dec 3, 2024 at 15:12.
Successfully merging this pull request may close these issues:

[Bug] Unable to load GPTQ Mixtral 8x7 v0.1 with SGLang