vllm.model_executor.layers.fused_moe.experts

Modules:

| Name | Description |
| --- | --- |
| aiter_mxfp4_w4a8_moe | |
| batched_deep_gemm_moe | |
| cpu_moe | CPU FP8 W8A16 block-quantized fused MoE experts. |
| cutlass_moe | CUTLASS based Fused MoE kernels. |
| deep_gemm_moe | |
| flashinfer_cutedsl_batched_moe | |
| flashinfer_cutedsl_moe | |
| flashinfer_cutlass_moe | |
| gpt_oss_triton_kernels_moe | |
| marlin_moe | Fused MoE utilities for GPTQ. |
| nvfp4_emulation_moe | NVFP4 quantization emulation for MoE. |
| ocp_mx_emulation_moe | OCP MX quantization emulation for MoE. |
| rocm_aiter_moe | |
| triton_moe | Triton-based MoE expert implementations. |
| trtllm_bf16_moe | |
| trtllm_fp8_moe | |
| trtllm_mxfp4_moe | |
| trtllm_nvfp4_moe | |