llmcompressor.modifiers.transform.smoothquant.dynamic_mappings
Dynamic SmoothQuant mapping builders for architectures that need model-aware logic.
Functions:
- get_layer_mappings_from_model – Infer SmoothQuant mappings from a model.
build_qwen3_5_dense_smoothquant_mappings
Build SmoothQuant mappings for dense Qwen3.5 hybrid-attention models.
Dense Qwen3.5 variants expose a regular mlp.gate_proj/mlp.up_proj pair instead of the MoE shared_expert submodule.
Source code in src/llmcompressor/modifiers/transform/smoothquant/dynamic_mappings.py
build_qwen3_5_moe_smoothquant_mappings
Build SmoothQuant mappings for Qwen3.5 MoE hybrid-attention models.
Only full-attention layers expose self_attn q/k/v projections, so the input layernorm regex must be restricted to those layer indices. The shared expert MLP remains safe to smooth with the standard post-attention layernorm mapping.
Source code in src/llmcompressor/modifiers/transform/smoothquant/dynamic_mappings.py
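The layer-index restriction described for the MoE builder can be illustrated with a small standalone sketch. It does not reproduce the library's implementation: the helper name `full_attention_layernorm_pattern`, the layer indices, and the module names are all hypothetical, chosen only to show how a regex can be limited to specific layers.

```python
import re


def full_attention_layernorm_pattern(full_attention_layers):
    """Build a regex that matches input_layernorm only in the given layers.

    Hypothetical helper for illustration; not part of llmcompressor.
    """
    if not full_attention_layers:
        raise ValueError("expected at least one full-attention layer index")
    # Deduplicate and sort so the pattern is deterministic.
    alternation = "|".join(str(i) for i in sorted(set(full_attention_layers)))
    # The literal dots around the index prevent "3" from matching layer 13.
    return rf".*layers\.({alternation})\.input_layernorm$"


# Assumed layout: layers 3, 7 and 11 use full attention (illustrative only).
pattern = full_attention_layernorm_pattern([3, 7, 11])

names = [
    "model.layers.3.input_layernorm",
    "model.layers.4.input_layernorm",
    "model.layers.11.input_layernorm",
    "model.layers.13.input_layernorm",
]
matched = [name for name in names if re.match(pattern, name)]
```

Anchoring the index between literal dots is the important detail here: a looser pattern such as `layers.*3` would also select layers like 13 or 30, which in this scenario would have no q/k/v projections to balance against.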
get_layer_mappings_from_model
Infer SmoothQuant mappings from a model.
Checks the dynamic mapping registry first for model-aware builders, then falls back to the static architecture registry, then to the default mappings.
Parameters:
- model (Module) – model instance used to infer mappings
Returns:
- list[LayerMap] – list of SmoothQuant LayerMap entries for the model
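The lookup order described for get_layer_mappings_from_model (dynamic builders first, then the static architecture registry, then the defaults) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the library's code: the registry names, the use of the class name as the lookup key, the placeholder string mappings, and the model classes are all hypothetical.

```python
from typing import Callable

# Placeholder mappings; the real function returns LayerMap entries.
DEFAULT_MAPPINGS = ["default"]

# Hypothetical static registry keyed by architecture (class) name.
STATIC_REGISTRY = {"LlamaLikeModel": ["llama-like"]}

# Hypothetical dynamic registry: builders that inspect the model itself.
DYNAMIC_REGISTRY: dict[str, Callable[[object], list[str]]] = {
    "HybridModel": lambda model: [
        f"hybrid:{len(model.full_attention_layers)}"
    ],
}


def resolve_mappings(model):
    """Resolve mappings: dynamic builder, then static entry, then default."""
    arch = type(model).__name__
    builder = DYNAMIC_REGISTRY.get(arch)
    if builder is not None:
        # Model-aware builders win because they can read the model's layout.
        return builder(model)
    return STATIC_REGISTRY.get(arch, DEFAULT_MAPPINGS)


class HybridModel:
    full_attention_layers = [3, 7, 11]  # illustrative indices


class LlamaLikeModel:
    pass


class UnknownModel:
    pass


hybrid_result = resolve_mappings(HybridModel())
static_result = resolve_mappings(LlamaLikeModel())
default_result = resolve_mappings(UnknownModel())
```

Checking the dynamic registry first lets an architecture with a per-model layout override any static entry that may exist for the same name, while unknown architectures still receive a usable default.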