fix moe-ep accuracy issue for fp8 (#2489)
xiaobochen123 authored Dec 16, 2024
1 parent a0592c0 · commit b532a5f
Showing 1 changed file with 4 additions and 0 deletions.
python/sglang/srt/layers/ep_moe/layer.py (4 additions, 0 deletions)
@@ -644,6 +644,10 @@ def process_weights_after_loading(self, layer: Module) -> None:
                 "QuantConfig has static quantization, but found "
                 "activation scales are None."
             )
+        layer.w13_weight_scale = torch.nn.Parameter(
+            torch.max(layer.w13_weight_scale, dim=1).values,
+            requires_grad=False,
+        )
         return
 
     def apply(
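Why the one-line fix works, as a minimal sketch (the shapes below are assumptions for illustration, not taken from the commit): in the fp8 EP MoE path, w13_weight_scale plausibly holds one quantization scale per expert per fused shard (gate_proj and up_proj), e.g. shape [num_experts, 2], while the downstream kernel presumably consumes a single scale per expert. torch.max(..., dim=1) returns a (values, indices) named tuple, and .values keeps the larger of the two shard scales for each expert.

import torch

# Hypothetical shapes for illustration: 4 experts, 2 shards (gate and up proj).
num_experts = 4
w13_weight_scale = torch.rand(num_experts, 2)  # per-expert, per-shard fp8 scales

# torch.max along dim=1 returns (values, indices); .values collapses the two
# shard scales into one scale per expert, matching a kernel that expects a
# single per-expert scale.
per_expert_scale = torch.max(w13_weight_scale, dim=1).values
assert per_expert_scale.shape == (num_experts,)

Taking the max rather than, say, the mean is the conservative choice when fused fp8 weights must share a scale: dequantizing both shards with the larger scale loses a little precision on the smaller-scaled shard but cannot underestimate either shard's dynamic range.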
