`vllm.model_executor.layers.fused_moe.experts`

Modules:

aiter_mxfp4_w4a8_moe
aiter_mxfp8_moe – MXFP8 (1x32 block, E8M0) MoE via AITER's FlyDSL two-stage grouped GEMM
batched_deep_gemm_moe
cpu_int4_moe – CPU INT4 W4A8 dynamic quantized fused MoE experts.
cpu_moe – CPU quantized fused MoE experts.
cutlass_moe – CUTLASS based Fused MoE kernels.
deep_gemm_moe
fallback
flashinfer_b12x_moe
flashinfer_cutedsl_batched_moe
flashinfer_cutedsl_moe
flashinfer_cutlass_moe
fused_batched_moe – Fused batched MoE kernel.
fused_humming_moe – Fused MoE utilities for Humming.
gpt_oss_triton_kernels_moe
lora_context
lora_experts_mixin
marlin_moe – Fused MoE utilities for GPTQ.
mxfp8_emulation_moe – MXFP8 (1x32 block, E8M0 scale) MoE experts on Triton.
mxfp8_native_moe – Native MXFP8 (1x32 block, E8M0 scale) MoE for AMD CDNA4 (gfx950) via Triton
nvfp4_emulation_moe – NVFP4 quantization emulation for MoE.
ocp_mx_emulation_moe – OCP MX quantization emulation for MoE.
rocm_aiter_moe
triton_cutlass_moe
triton_deep_gemm_moe
triton_moe – Triton-based MoE expert implementations.
trtllm_bf16_moe
trtllm_fp8_moe
trtllm_mxfp4_moe
trtllm_mxint4_moe
trtllm_nvfp4_moe
xpu_moe
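
Several entries above describe MXFP8 as "1x32 block, E8M0 scale". As a rough illustration of what that phrase means, the pure-Python sketch below picks one power-of-two scale per 32 contiguous values, following the scale selection suggested in the OCP Microscaling (MX) specification. It is not taken from these modules and may differ from what the actual kernels do. The helper names (`e8m0_scale_byte`, `decode_e8m0`, `quantize_row`) are hypothetical, the handling of all-zero blocks is a convention chosen for this sketch, and the final rounding of each element to an FP8 E4M3 bit pattern is left out.

```python
import math

BLOCK_SIZE = 32    # "1x32 block": one shared scale per 32 contiguous values
E8M0_BIAS = 127    # E8M0 stores only a biased power-of-two exponent
E4M3_EMAX = 8      # largest FP8 E4M3 normal is 448 = 1.75 * 2**8
E4M3_MAX = 448.0


def e8m0_scale_byte(block):
    """Return the biased E8M0 exponent byte for one block (hypothetical helper)."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        # Sketch-only convention: an all-zero block gets a scale of 1.0.
        return E8M0_BIAS
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) == e - 1.
    floor_log2 = math.frexp(amax)[1] - 1
    # Shift so the block maximum lands in the top binade of E4M3.
    # Byte 255 encodes NaN in E8M0, so clamp to 0..254.
    return min(max(floor_log2 - E4M3_EMAX + E8M0_BIAS, 0), 254)


def decode_e8m0(byte):
    """Turn an E8M0 byte back into its power-of-two scale."""
    return 2.0 ** (byte - E8M0_BIAS)


def quantize_row(row):
    """Split a row into 1x32 blocks; return (scale bytes, scaled and clamped values)."""
    if len(row) % BLOCK_SIZE != 0:
        raise ValueError("row length must be a multiple of 32")
    scale_bytes, scaled = [], []
    for start in range(0, len(row), BLOCK_SIZE):
        block = row[start:start + BLOCK_SIZE]
        byte = e8m0_scale_byte(block)
        scale = decode_e8m0(byte)
        scale_bytes.append(byte)
        # Values beyond the E4M3 range are clamped; mantissa rounding is omitted.
        scaled.extend(max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in block)
    return scale_bytes, scaled
```

For a 64-element row, `quantize_row` returns two scale bytes, one per block, alongside the 64 scaled values. Because the scale is a pure power of two, dividing by it only shifts the exponent, which is why a single 8-bit exponent per block is enough to store it.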
