vllm.model_executor.layers.fused_moe.experts.aiter_mxfp8_moe
MXFP8 (1x32 block, E8M0) MoE via AITER's FlyDSL two-stage grouped GEMM (gfx950); alternative to Mxfp8NativeTritonExperts. Routes through aiter.fused_moe (per_1x32, gate_mode=INTERLEAVE); weights are preshuffled in convert_to_fp8_moe_kernel_format.
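To make the "1x32 block, E8M0" wording concrete, here is a minimal sketch of how one shared power-of-two scale per 32 elements can be derived. It assumes the OCP Microscaling convention (shared exponent = floor(log2(amax)) minus the E4M3 maximum exponent of 8, stored with a bias of 127). The function names are hypothetical, and the source does not say whether AITER's kernels compute scales exactly this way.

```python
import math

BLOCK = 32        # elements sharing one scale (the "1x32" block)
E4M3_EMAX = 8     # exponent of the largest E4M3 normal: 448 = 1.75 * 2**8
E8M0_BIAS = 127   # E8M0 stores only a biased power-of-two exponent


def e8m0_block_scales(row):
    """Return one biased E8M0 exponent per 32-element block of `row`.

    Hypothetical helper: follows the OCP MX convention, not necessarily
    the exact rounding used by any particular kernel.
    """
    if len(row) % BLOCK != 0:
        raise ValueError("row length must be a multiple of 32")
    scales = []
    for start in range(0, len(row), BLOCK):
        amax = max(abs(v) for v in row[start:start + BLOCK])
        if amax == 0.0:
            shared_exp = -E8M0_BIAS  # all-zero block: use the smallest scale
        else:
            # frexp gives amax = m * 2**e with 0.5 <= m < 1,
            # so floor(log2(amax)) == e - 1 exactly.
            _, e = math.frexp(amax)
            shared_exp = (e - 1) - E4M3_EMAX
        # Clamp to the exponent range used in this sketch.
        shared_exp = max(-E8M0_BIAS, min(E8M0_BIAS, shared_exp))
        scales.append(shared_exp + E8M0_BIAS)
    return scales


def decode_e8m0(byte):
    """Turn a biased E8M0 exponent back into its power-of-two scale."""
    return 2.0 ** (byte - E8M0_BIAS)
```

Dividing each block by its decoded scale brings the largest magnitude into the E4M3 range; the actual FP8 rounding of the elements is left out of this sketch.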
Classes:
- AiterMxfp8Experts – MXFP8 MoE through AITER's FlyDSL two-stage grouped GEMM (gfx950).
Functions:
- is_aiter_mxfp8_moe_available – True when the FlyDSL MXFP8 MoE can run here: gfx950, the flydsl package is importable, and the installed aiter carries the mxfp8 FlyDSL 2-stage support.
AiterMxfp8Experts
Bases: Mxfp8TritonExpertsBase
MXFP8 MoE through AITER's FlyDSL two-stage grouped GEMM (gfx950).
Source code in vllm/model_executor/layers/fused_moe/experts/aiter_mxfp8_moe.py
is_aiter_mxfp8_moe_available()
True when the FlyDSL MXFP8 MoE can run here: gfx950, the flydsl package is importable, AND the installed aiter carries the mxfp8 FlyDSL 2-stage support from ROCm/aiter#3811.
flydsl and aiter are separate packages, so is_flydsl_available() (flydsl package + arch) is necessary but not sufficient: an older aiter without #3811 still ships the flydsl package and the aiter.ops.flydsl module, but with a broken or missing per_1x32 + fp8 2-stage path. Without this extra gate, a nightly lacking #3811 would wrongly select FlyDSL instead of falling back to the native Triton dot_scaled path. #3811 added no probe-able public symbol, so the check detects the minimax_m3_mxfp8 tuned config it shipped. Every check fails closed (returns False -> Triton dot_scaled), which is always safe.
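The fail-closed gating described above might be sketched roughly as follows. This is not the vLLM implementation. The function names and the `arch` string format are hypothetical, and so is the idea of finding the tuned config by scanning the installed aiter directory for a file whose name contains `minimax_m3_mxfp8`; the source only says that this config is what gets detected.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str) -> Optional[Path]:
    """Locate an installed package without importing it; None if absent."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def mxfp8_flydsl_available_sketch(
    arch: str,
    flydsl_pkg: str = "flydsl",
    aiter_pkg: str = "aiter",
    marker: str = "minimax_m3_mxfp8",  # tuned config named in the text
) -> bool:
    """Hypothetical three-gate probe; every failure returns False."""
    # Gate 1: architecture (assumes a string such as "gfx950" or "gfx950:...").
    if arch.split(":")[0] != "gfx950":
        return False
    # Gate 2: the flydsl package must be present.
    if package_dir(flydsl_pkg) is None:
        return False
    # Gate 3: the installed aiter must ship the marker config.
    # Assumption: the config is a file whose name contains `marker`.
    aiter_dir = package_dir(aiter_pkg)
    if aiter_dir is None:
        return False
    try:
        return any(aiter_dir.rglob(f"*{marker}*"))
    except OSError:
        return False  # unreadable install: fall back rather than guess
```

A caller would select the FlyDSL path only when the probe returns True and use the Triton dot_scaled path otherwise, so a wrong negative costs some performance but never correctness.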