vllm_omni.diffusion.models.hunyuan_image3.hunyuan_fused_moe
HunyuanFusedMoE
HunyuanFusedMoEDefault
Adapter that configures the upstream FusedMoE MoERunner for HunyuanImage3.
Upstream commit dc68bd8c41 refactored FusedMoE from a class (nn.Module) into a factory function that returns a MoERunner instance, whose expert weights live in a routed_experts submodule (...experts.routed_experts.w13_weight / ...w2_weight).
This adapter builds that runner, installs the omni-specific forward-context setup and one-shot kernel-initialisation hook, and returns the runner directly from __new__ so the parent MoE block registers it as a real nn.Module submodule.
Returning the runner itself, rather than a plain object that holds it in an attribute, is required for correctness. A non-Module wrapper hides the runner's parameters from named_parameters(), so load_weights cannot find ...experts.routed_experts.w13_weight and raises KeyError during weight loading.
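The "return the runner from __new__" pattern described above can be sketched in plain Python. This is a minimal illustration, not the adapter's actual code: Runner, build_runner, AdapterSketch and the hook names are hypothetical stand-ins for the upstream MoERunner, the upstream factory function, and the omni-specific setup. It relies only on standard Python behaviour: when __new__ returns an object that is not an instance of the class, that object is handed to the caller unchanged and __init__ is skipped.

```python
class Runner:
    """Hypothetical stand-in for the upstream MoERunner."""

    def __init__(self, num_experts):
        self.num_experts = num_experts
        self.hooks = []  # names of the hooks installed on this runner


def build_runner(num_experts):
    """Hypothetical stand-in for the upstream factory function."""
    return Runner(num_experts)


class AdapterSketch:
    """Illustrative adapter: constructing it yields the runner itself."""

    def __new__(cls, num_experts):
        runner = build_runner(num_experts)
        # Placeholder names for the omni-specific setup steps.
        runner.hooks.append("forward_context_setup")
        runner.hooks.append("one_shot_kernel_init")
        # The returned object is not an AdapterSketch instance, so Python
        # does not call AdapterSketch.__init__ and the caller gets the runner.
        return runner


experts = AdapterSketch(num_experts=4)
print(type(experts).__name__)
```

Because the caller receives the runner itself, assigning the result to an attribute of a parent nn.Module would register it like any other submodule, which is the property the weight loader depends on.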
make_expert_params_mapping staticmethod
make_expert_params_mapping(
    model: Any,
    ckpt_gate_proj_name: str,
    ckpt_down_proj_name: str,
    ckpt_up_proj_name: str,
    num_experts: int,
    num_redundant_experts: int = 0,
) -> list[tuple[str, str, int, str]]
Delegate to the upstream standalone function.
Upstream vLLM refactored FusedMoE from a class, on which make_expert_params_mapping was a classmethod, into a factory function. The method now lives in the standalone function fused_moe_make_expert_params_mapping in vllm.model_executor.layers.fused_moe.
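The delegation can be sketched as a staticmethod that forwards its arguments unchanged. The standalone function below is a simplified, hypothetical stand-in so that the example runs without vLLM installed. Its tuple layout (parameter prefix, checkpoint weight prefix, expert id, shard id) and the shard names "w1", "w2", "w3" are assumptions chosen to match the return annotation, not the upstream implementation, and it ignores model and num_redundant_experts.

```python
from typing import Any


def standalone_make_expert_params_mapping(
    model: Any,
    ckpt_gate_proj_name: str,
    ckpt_down_proj_name: str,
    ckpt_up_proj_name: str,
    num_experts: int,
    num_redundant_experts: int = 0,
) -> list[tuple[str, str, int, str]]:
    """Hypothetical stand-in for the upstream standalone function.

    `model` and `num_redundant_experts` are accepted but ignored here.
    """
    shards = [
        ("w1", ckpt_gate_proj_name),
        ("w2", ckpt_down_proj_name),
        ("w3", ckpt_up_proj_name),
    ]
    return [
        (
            # Assumed layout: gate and up projections share the fused w13 tensor.
            "experts.w2_" if shard_id == "w2" else "experts.w13_",
            f"experts.{expert_id}.{ckpt_name}.",
            expert_id,
            shard_id,
        )
        for expert_id in range(num_experts)
        for shard_id, ckpt_name in shards
    ]


class AdapterSketch:
    """Illustrative adapter exposing the old classmethod-style entry point."""

    @staticmethod
    def make_expert_params_mapping(
        model: Any,
        ckpt_gate_proj_name: str,
        ckpt_down_proj_name: str,
        ckpt_up_proj_name: str,
        num_experts: int,
        num_redundant_experts: int = 0,
    ) -> list[tuple[str, str, int, str]]:
        # Forward everything unchanged to the standalone function.
        return standalone_make_expert_params_mapping(
            model,
            ckpt_gate_proj_name,
            ckpt_down_proj_name,
            ckpt_up_proj_name,
            num_experts,
            num_redundant_experts,
        )


mapping = AdapterSketch.make_expert_params_mapping(
    None, "gate_proj", "down_proj", "up_proj", num_experts=2
)
```

Keeping the method on the adapter lets existing call sites that used FusedMoE.make_expert_params_mapping continue to work with only the class name changed.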