finish #1808 cherry-pick, adjust interface
Summary:
# Why

- some kernels were missed in the previous cherry-pick
- the interface has changed slightly, so `intermediate_size` must now be extracted from `down_weight` instead of `gate_up_weight`

# What

- get remaining files from ROCm/composable_kernel@1ff50e7
- adjust interface
- explicitly pass silu as the activation function

Differential Revision: D68792360
coconutruben authored and facebook-github-bot committed Jan 28, 2025
1 parent 73f6cee commit 9d90119
Showing 1 changed file with 5 additions and 1 deletion.
@@ -40,7 +40,10 @@ at::Tensor fused_moe_impl(
auto tokens = input.size(0);
auto hidden_size = input.size(1);
auto experts = gate_up_weight.size(0);
-auto intermediate_size = gate_up_weight.size(1);
+// Interface requires that you pass intermediate size. On |gate_only| = False,
+// |gate_up_weight| might be 2 * intermediate size, so extract the size from
+// |down_weight|
+auto intermediate_size = down_weight.size(2);
auto topk = topk_ids.size(1);
auto stride = input.stride(0);

@@ -81,6 +84,7 @@ at::Tensor fused_moe_impl(
"fp32", // prec_sq (smooth quant)
"fp32", // prec_kw (topk weight)
static_cast<int>(block_m),
+1,
static_cast<int>(gate_only),
static_cast<int>(fused_quant)};
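
The first hunk's shape reasoning can be sketched in plain C++ without ATen. This is a minimal sketch, not the code from the commit: `Shape`, `intermediate_size_from_down`, and `expected_gate_up_rows` are hypothetical names, and the weight layouts (`down_weight` as `[experts, hidden_size, intermediate_size]`, `gate_up_weight` as `[experts, rows, hidden_size]`) are assumptions inferred from the `size(...)` calls in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor shape; the real code calls at::Tensor::size().
using Shape = std::vector<std::int64_t>;

// Assumed layout: down_weight is [experts, hidden_size, intermediate_size],
// so the last dimension is the intermediate size regardless of gate_only.
std::int64_t intermediate_size_from_down(const Shape& down_weight) {
  if (down_weight.size() != 3) {
    throw std::invalid_argument("down_weight must be 3-D");
  }
  return down_weight[2];
}

// Assumed layout: gate_up_weight is [experts, rows, hidden_size], where rows is
// intermediate_size when gate_only is true and 2 * intermediate_size otherwise.
// This is why gate_up_weight.size(1) is not a reliable source for the size.
std::int64_t expected_gate_up_rows(std::int64_t intermediate_size, bool gate_only) {
  assert(intermediate_size > 0);
  return gate_only ? intermediate_size : 2 * intermediate_size;
}
```

With a `down_weight` shape of `{8, 4096, 1024}`, the helper returns 1024 in both modes, while the expected `gate_up_weight` row count is 2048 when `gate_only` is false.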
