llmcompressor.modifiers.transform.imatrix.base

Classes:

IMatrixGatherer – Lifecycle trigger for iMatrix importance collection.

IMatrixGatherer

Bases: Modifier

Lifecycle trigger for iMatrix importance collection.

Triggers a calibration pass so that IMatrixMSEObserver can collect E[x²] via its attach() hook. This modifier does not quantize weights; the actual quantization is performed by the subsequent QuantizationModifier or GPTQModifier.

The observer's detach() method leaves the raw _imatrix_sum and _imatrix_count accumulators on the module, so that the observer created in the subsequent quantization pass can pick them up via attach().
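The accumulation described above can be sketched in plain Python. This is only an illustration of keeping a running sum of squares and a token count and dividing them to get E[x²]; it is not the IMatrixMSEObserver implementation. The accumulate() and importance() helpers and the list-based layout are assumptions made for this sketch; only the names _imatrix_sum and _imatrix_count come from the text.

```python
# Illustrative only: pure-Python stand-in for per-input-channel E[x²] accumulation.
# accumulate() and importance() are hypothetical helpers, not library functions.


def accumulate(state, batch):
    """Add one batch of activations (rows = tokens, columns = input channels)."""
    if state["_imatrix_sum"] is None:
        state["_imatrix_sum"] = [0.0] * len(batch[0])
    for row in batch:
        for j, x in enumerate(row):
            state["_imatrix_sum"][j] += x * x
    state["_imatrix_count"] += len(batch)


def importance(state):
    """Per-channel E[x²]: running sum of squares divided by tokens seen."""
    return [s / state["_imatrix_count"] for s in state["_imatrix_sum"]]


# Two calibration batches with two input channels each.
state = {"_imatrix_sum": None, "_imatrix_count": 0}
accumulate(state, [[1.0, 2.0], [3.0, 0.0]])
accumulate(state, [[1.0, 2.0]])
imatrix = importance(state)
```

Keeping the raw sum and count, rather than the finished mean, is what would let a later pass continue or reuse the statistics without re-running calibration.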

Example recipe:

recipe:
  - IMatrixGatherer:
      ignore: ["lm_head"]
  - QuantizationModifier:
      config_groups:
        group_0:
          targets: ["Linear"]
          weights:
            observer: imatrix_mse

Or composed with GPTQ:

recipe:
  - IMatrixGatherer:
      ignore: ["lm_head"]
  - GPTQModifier:
      config_groups:
        group_0:
          targets: ["Linear"]
          weights:
            observer: imatrix_mse

Note: Auto-prepend (automatically inserting the gatherer when imatrix_mse is detected in a recipe) is planned for a follow-up PR.
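Until auto-prepend exists, the gatherer has to be placed in the recipe by hand, ahead of the modifier that quantizes. A minimal sketch of a sanity check for that ordering follows, assuming the recipe has already been loaded into a list of single-key dicts that mirrors the YAML above. The helper gatherer_precedes_quantizer and the QUANT_MODIFIERS tuple are hypothetical and not part of llmcompressor.

```python
# Hypothetical helper, not part of llmcompressor: checks that IMatrixGatherer
# appears before the first quantizing modifier in a recipe.
QUANT_MODIFIERS = ("QuantizationModifier", "GPTQModifier")


def gatherer_precedes_quantizer(recipe):
    """recipe: list of single-key dicts, one per modifier, in recipe order."""
    names = [next(iter(entry)) for entry in recipe]
    if "IMatrixGatherer" not in names:
        return False
    gatherer_idx = names.index("IMatrixGatherer")
    quant_idxs = [i for i, name in enumerate(names) if name in QUANT_MODIFIERS]
    return bool(quant_idxs) and gatherer_idx < min(quant_idxs)


weights = {"observer": "imatrix_mse"}
groups = {"group_0": {"targets": ["Linear"], "weights": weights}}

good = [
    {"IMatrixGatherer": {"ignore": ["lm_head"]}},
    {"GPTQModifier": {"config_groups": groups}},
]
bad = list(reversed(good))  # quantizer first, gatherer second

ok_good = gatherer_precedes_quantizer(good)
ok_bad = gatherer_precedes_quantizer(bad)
```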

Parameters:

targets – module types to instrument (default: ["Linear"])

ignore – layer name patterns to skip (default: ["lm_head"])

weight_observer – observer to attach during calibration. Must be "imatrix_mse" (default).

Methods:

on_finalize – Clean up any remaining accumulators so they don't end up in the checkpoint

on_initialize – Attach iMatrix observers to target modules for E[x²] collection

on_finalize

on_finalize(state: State, **kwargs) -> bool

Clean up any remaining accumulators so they don't end up in the checkpoint

Source code in src/llmcompressor/modifiers/transform/imatrix/base.py

def on_finalize(self, state: State, **kwargs) -> bool:
    """
    Clean up any remaining accumulators so they don't end up in the checkpoint
    """
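    # End the modifier first if the lifecycle has not already done so.
    if not self.ended_:
        self.on_end(state, None)

    # Remove leftover accumulators from every matched module.
    for _, module in match_named_modules(
        state.model, self._resolved_targets, self.ignore
    ):
        for attr in ("_imatrix_sum", "_imatrix_count"):
            if hasattr(module, attr):
                delattr(module, attr)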

    return True
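The cleanup loop in on_finalize can be tried in isolation with plain objects. The sketch below assumes nothing about torch or llmcompressor: DummyModule and strip_imatrix_state are hypothetical stand-ins that only reproduce the hasattr/delattr pattern from the listing.

```python
# Stand-alone illustration of the hasattr/delattr cleanup pattern.
# DummyModule and strip_imatrix_state are hypothetical names for this sketch.


class DummyModule:
    """Stand-in for a model layer; any object that accepts attributes works."""


def strip_imatrix_state(modules, attrs=("_imatrix_sum", "_imatrix_count")):
    """Delete the given attributes wherever present; return how many were removed."""
    removed = 0
    for module in modules:
        for attr in attrs:
            if hasattr(module, attr):
                delattr(module, attr)
                removed += 1
    return removed


calibrated = DummyModule()
calibrated._imatrix_sum = [11.0, 8.0]
calibrated._imatrix_count = 3
untouched = DummyModule()  # never calibrated, so it carries no accumulators

removed_first = strip_imatrix_state([calibrated, untouched])
removed_second = strip_imatrix_state([calibrated, untouched])  # nothing left
```

Guarding each delattr with hasattr makes the cleanup safe to run on modules that were never instrumented and safe to run more than once.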

on_initialize

on_initialize(state: State, **kwargs) -> bool

Attach iMatrix observers to target modules for E[x²] collection

Source code in src/llmcompressor/modifiers/transform/imatrix/base.py

def on_initialize(self, state: State, **kwargs) -> bool:
    """
    Attach iMatrix observers to target modules for E[x²] collection
    """
    self._resolved_targets = (
        self.targets if isinstance(self.targets, list) else [self.targets]
    )

    # Minimal QuantizationArgs — only used to instantiate the observer,
    # no quantization config is applied to the model.
    observer_args = QuantizationArgs(observer=self.weight_observer)

    for _, module in match_named_modules(
        state.model, self._resolved_targets, self.ignore
    ):
        observer = Observer.load_from_registry(
            self.weight_observer,
            base_name="weight",
            args=observer_args,
        )
        module.register_module("weight_observer", observer)
        observer.attach(module)

    return True
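The first step of on_initialize, normalizing targets to a list before selecting modules, can be sketched as follows. The selection logic here is a deliberately simplified assumption (class-name equality for targets, exact-name equality for ignore) and does not attempt to reproduce whatever matching rules match_named_modules actually applies. The names resolve_targets, select_modules, Linear and Embedding are all hypothetical stand-ins.

```python
# Simplified stand-in for target resolution and module selection.
# Not the library's matching logic: targets are compared to class names and
# ignore entries to exact module names only.


class Linear:
    """Dummy layer type used only for this sketch."""


class Embedding:
    """Dummy layer type used only for this sketch."""


def resolve_targets(targets):
    """Accept a single target or a list of targets; always return a list."""
    return targets if isinstance(targets, list) else [targets]


def select_modules(named_modules, targets, ignore):
    """Return names of modules whose class name is targeted and not ignored."""
    resolved = resolve_targets(targets)
    return [
        name
        for name, module in named_modules
        if type(module).__name__ in resolved and name not in ignore
    ]


named = [
    ("model.layers.0.q_proj", Linear()),
    ("model.embed_tokens", Embedding()),
    ("lm_head", Linear()),
]
selected = select_modules(named, "Linear", ["lm_head"])
```

With the defaults from the parameter list, only the Linear layers outside lm_head would receive an observer in this simplified model.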