llmcompressor.modeling.gemma4
Classes:
- Gemma4TextExpertsList – Unpacks 3D expert parameter tensors into individual Gemma4TextMLP modules.
- SequentialGemma4TextExperts – Calibration version of Gemma4TextExperts that unpacks experts.
Gemma4TextExpertsList
Bases: ModuleList
Unpacks 3D expert parameter tensors into individual Gemma4TextMLP modules, so that each expert's weights live in nn.Linear layers and can be targeted for quantization with targets="Linear".
Source code in src/llmcompressor/modeling/gemma4.py
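The unpacking idea described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name `unpack_experts`, the packed layout `(num_experts, out_features, in_features)`, and the tensor sizes are all assumptions chosen for illustration, and a real Gemma4TextMLP contains several projections rather than a single layer.

```python
import torch
from torch import nn


def unpack_experts(packed: torch.Tensor) -> nn.ModuleList:
    """Hypothetical helper: split a packed 3D weight tensor of assumed shape
    (num_experts, out_features, in_features) into one bias-free nn.Linear
    per expert."""
    num_experts, out_features, in_features = packed.shape
    experts = nn.ModuleList()
    for idx in range(num_experts):
        linear = nn.Linear(in_features, out_features, bias=False)
        # nn.Linear stores its weight as (out_features, in_features),
        # which matches one 2D slice of the assumed packed layout.
        with torch.no_grad():
            linear.weight.copy_(packed[idx])
        experts.append(linear)
    return experts


torch.manual_seed(0)
packed = torch.randn(4, 8, 16)  # assumed sizes: 4 experts, 16 -> 8 features
experts = unpack_experts(packed)

x = torch.randn(2, 16)
# Each unpacked expert reproduces the matmul against its slice of the packed tensor.
matches = [
    torch.allclose(experts[i](x), x @ packed[i].T, atol=1e-5)
    for i in range(len(experts))
]
# After unpacking, every expert is reachable as an nn.Linear submodule,
# which is what allows a type-based target such as "Linear" to match it.
linear_names = [
    name for name, module in experts.named_modules()
    if isinstance(module, nn.Linear)
]
```

Before unpacking, the weights exist only as a single 3D parameter, so there is no nn.Linear module for a type-based target to match. After unpacking, each expert appears as its own submodule.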
SequentialGemma4TextExperts
SequentialGemma4TextExperts(
    original: Gemma4TextExperts,
    config: Gemma4Config,
    calibrate_all_experts: bool = True,
)
Bases: MoECalibrationModule
Calibration version of Gemma4TextExperts that unpacks experts.
This module unpacks the packed expert weights (3D -> 2D) for calibration. The unpacking is permanent: the module stays in unpacked form afterwards for vLLM compatibility.