llmcompressor.modeling.glm_moe_dsa
Classes:
- CalibrationGlmMoeDsaMoE – Calibration version of GlmMoeDsaMoE that unpacks experts for sequential processing.
CalibrationGlmMoeDsaMoE
CalibrationGlmMoeDsaMoE(
    original: GlmMoeDsaMoE,
    config: GlmMoeDsaConfig,
    calibrate_all_experts: bool = True,
)
Bases: MoECalibrationModule
Calibration version of GlmMoeDsaMoE that unpacks experts for sequential processing.
This module:
1. Unpacks the packed expert weights (3D -> 2D) for calibration
2. Optionally sends all tokens to all experts during calibration
3. Stays in unpacked form (permanent) for vLLM compatibility
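The unpacking and all-expert routing described above can be illustrated with a dependency-free toy. This is only a sketch, not the module's implementation: nested lists stand in for tensors, and `unpack_experts`, `matvec`, and `moe_forward` are hypothetical names. It also assumes that, when all tokens are sent to all experts, only the routed tokens contribute to the output; the text above does not state this.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]               # 2D weight: one row per output feature
Routes = List[List[Tuple[int, float]]]  # per token: (expert index, routing weight)


def unpack_experts(packed: List[Matrix]) -> List[Matrix]:
    """Split a packed [num_experts, out, in] weight into independent 2D copies."""
    return [[row[:] for row in expert] for expert in packed]


def matvec(weight: Matrix, x: Vector) -> Vector:
    """Multiply a 2D weight by a single token vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def moe_forward(
    tokens: List[Vector],
    experts: List[Matrix],
    routes: Routes,
    calibrate_all_experts: bool = True,
) -> Tuple[List[Vector], List[int]]:
    """Process experts one at a time; return outputs and tokens seen per expert."""
    seen = [0] * len(experts)
    outputs = [[0.0] * len(experts[0]) for _ in tokens]
    for e, weight in enumerate(experts):
        # Tokens the router actually assigned to expert e, with their weights.
        routed = {t: w for t, pairs in enumerate(routes) for idx, w in pairs if idx == e}
        # Assumption: in calibration mode the expert runs on every token,
        # but only routed tokens are accumulated into the output.
        run_on = range(len(tokens)) if calibrate_all_experts else sorted(routed)
        for t in run_on:
            y = matvec(weight, tokens[t])
            seen[e] += 1
            if t in routed:
                for i, value in enumerate(y):
                    outputs[t][i] += routed[t] * value
    return outputs, seen


packed = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]  # 2 experts, 2x2 each
experts = unpack_experts(packed)
tokens = [[1.0, 2.0], [3.0, 4.0]]
routes = [[(0, 1.0)], [(1, 0.5)]]  # token 0 -> expert 0, token 1 -> expert 1

outputs_all, seen_all = moe_forward(tokens, experts, routes, calibrate_all_experts=True)
outputs_routed, seen_routed = moe_forward(tokens, experts, routes, calibrate_all_experts=False)
```

In this toy, both modes produce the same outputs; the difference is that each expert sees every token in calibration mode, which is what lets per-expert statistics be collected even for rarely selected experts.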
Subclasses (e.g. CalibrationGlm4MoeLiteMoE) override
_get_num_experts and _make_experts to handle
model-specific config fields and MLP classes, while inheriting the
shared routing and forward logic.
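The override pattern can be sketched with toy classes. Everything here is hypothetical: `ToyCalibrationMoE`, `ToyLiteCalibrationMoE`, the config keys, and the method signatures are illustrative stand-ins, since the real signatures of `_get_num_experts` and `_make_experts` are not shown in this reference. The point is only that the subclass swaps the two hooks and reuses `forward` unchanged.

```python
from typing import Any, Callable, Dict, List

Expert = Callable[[float], float]


class ToyCalibrationMoE:
    """Toy base class: shared construction and forward logic."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config
        # Both hooks are resolved on the concrete subclass.
        self.experts = self._make_experts(self._get_num_experts())

    def _get_num_experts(self) -> int:
        # Hypothetical config field name for the base model.
        return self.config["n_routed_experts"]

    def _make_experts(self, num_experts: int) -> List[Expert]:
        # Hypothetical "MLP": expert i scales its input by (i + 1).
        return [lambda x, s=i + 1: x * s for i in range(num_experts)]

    def forward(self, x: float) -> float:
        # Shared logic inherited by subclasses: average over all experts.
        return sum(expert(x) for expert in self.experts) / len(self.experts)


class ToyLiteCalibrationMoE(ToyCalibrationMoE):
    """Toy subclass: different config field and expert type, same forward."""

    def _get_num_experts(self) -> int:
        # Hypothetical: this variant stores the count under another key.
        return self.config["num_experts"]

    def _make_experts(self, num_experts: int) -> List[Expert]:
        # Hypothetical "MLP": expert i adds i to its input.
        return [lambda x, b=i: x + b for i in range(num_experts)]


base = ToyCalibrationMoE({"n_routed_experts": 2})
lite = ToyLiteCalibrationMoE({"num_experts": 3})
base_out = base.forward(3.0)
lite_out = lite.forward(1.0)
```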