llmcompressor.utils.pytorch

Modules:

module –

Utility / helper functions

Functions:

build_parameterized_layers –

Build ModelParameterizedLayer objects for modules matching the given targets.
expand_special_targets –

Expand special target constants to explicit class names with backward compatibility.
get_no_split_params –

Get list of module classes that shouldn't be split when sharding.
infer_sequential_targets –

Infer or validate sequential targets for layer-wise processing.
qat_active –

Determines if any layers in the model have quantization enabled.

build_parameterized_layers

build_parameterized_layers(
    model: Module,
    targets: str | list[str],
    param_name: str = "weight",
) -> dict[str, ModelParameterizedLayer]

Build ModelParameterizedLayer objects for modules matching the given targets.

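This function replaces get_layers_params(). It uses compressed-tensors' match_named_modules() to find matching modules, reads the parameter named by param_name from each one, and constructs a ModelParameterizedLayer per layer. Modules that lack the parameter are skipped, and a layer matched more than once is recorded only once.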

Parameters:

model (Module) –

The model to search for matching modules
targets (str | list[str]) –

Target patterns to match (supports class names, regex with "re:", and special constants for backward compatibility)
param_name (str, default: 'weight' ) –

Name of the parameter to extract from each layer (default: "weight")

Returns:

dict[str, ModelParameterizedLayer] –

Dictionary mapping layer names to ModelParameterizedLayer objects

Source code in src/llmcompressor/utils/pytorch/module.py

def build_parameterized_layers(
    model: Module,
    targets: str | list[str],
    param_name: str = "weight",
) -> dict[str, ModelParameterizedLayer]:
    """
    Build ModelParameterizedLayer objects for modules matching the given targets.

    This function replaces get_layers_params() by using compressed-tensors'
    match_named_modules() to find matching modules and their parameters,
    then constructing ModelParameterizedLayer objects.

    :param model: The model to search for matching modules
    :param targets: Target patterns to match (supports class names, regex with "re:",
                    and special constants for backward compatibility)
    :param param_name: Name of the parameter to extract from each layer
        (default: "weight")
    :return: Dictionary mapping layer names to ModelParameterizedLayer objects
    """
    # Expand special constants if present
    targets = expand_special_targets(targets)

    parameterized_layers = {}
    for layer_name, module in match_named_modules(model, targets):
        # Get the parameter from the module
        param = getattr(module, param_name, None)
        if param is None:
            continue

        # Avoid duplicate entries (same layer can be matched multiple times)
        if layer_name not in parameterized_layers:
            parameterized_layers[layer_name] = ModelParameterizedLayer(
                layer_name=layer_name,
                layer=module,
                param_name=f"{layer_name}.{param_name}",
                param=param,
            )

    return parameterized_layers
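
The skip and de-duplication behavior of the loop above can be sketched without PyTorch. This is a minimal illustration, not the library code: ParameterizedLayerSketch, FakeLinear, FakeActivation, and collect_layers are hypothetical stand-ins, and the list of (layer_name, module) pairs is an assumed substitute for the output of match_named_modules().

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ParameterizedLayerSketch:
    """Hypothetical stand-in for ModelParameterizedLayer (same four fields)."""

    layer_name: str
    layer: Any
    param_name: str
    param: Any


class FakeLinear:
    """Hypothetical module stand-in that carries weight and bias attributes."""

    def __init__(self):
        self.weight = [[0.0, 1.0]]
        self.bias = [0.0]


class FakeActivation:
    """Hypothetical module stand-in with no parameters at all."""


def collect_layers(matches, param_name="weight"):
    """Mirror the loop above over an iterable of (layer_name, module) pairs."""
    layers = {}
    for layer_name, module in matches:
        param = getattr(module, param_name, None)
        if param is None:
            continue  # module has no such parameter, so it is skipped
        if layer_name not in layers:  # keep only the first match per layer
            layers[layer_name] = ParameterizedLayerSketch(
                layer_name=layer_name,
                layer=module,
                param_name=f"{layer_name}.{param_name}",
                param=param,
            )
    return layers


proj = FakeLinear()
# Assumed matcher output: "proj" appears twice and "act" has no weight.
matches = [("proj", proj), ("act", FakeActivation()), ("proj", proj)]
layers = collect_layers(matches)
print(list(layers))               # ['proj']
print(layers["proj"].param_name)  # proj.weight
```

Note that the stored param_name is the fully qualified name (layer name plus parameter name), while the dictionary key is the layer name alone.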

expand_special_targets

expand_special_targets(
    targets: str | list[str],
) -> list[str]

Expand special target constants to explicit class names with backward compatibility.

Special constants like __ALL_PRUNABLE__ and __ALL_QUANTIZABLE__ are deprecated in favor of explicit class name lists. This function provides backward compatibility by expanding these constants while issuing deprecation warnings.

Parameters:

targets (str | list[str]) –

Target strings which may include special constants

Returns:

list[str] –

List of expanded target strings

Raises:

ValueError –

If the __ALL__ constant is used (no longer supported)

Source code in src/llmcompressor/utils/pytorch/module.py

def expand_special_targets(targets: str | list[str]) -> list[str]:
    """
    Expand special target constants to explicit class names with backward compatibility.

    Special constants like __ALL_PRUNABLE__ and __ALL_QUANTIZABLE__ are deprecated
    in favor of explicit class name lists. This function provides backward compatibility
    by expanding these constants while issuing deprecation warnings.

    :param targets: Target strings which may include special constants
    :return: List of expanded target strings
    :raises ValueError: If __ALL__ constant is used (no longer supported)
    """
    if isinstance(targets, str):
        targets = [targets]

    expanded = []
    for target in targets:
        if target == ALL_PRUNABLE_TARGET:
            warnings.warn(
                f"{ALL_PRUNABLE_TARGET} is deprecated. "
                "Use explicit targets: ['Linear', 'Conv1d', 'Conv2d', 'Conv3d']",
                DeprecationWarning,
                stacklevel=3,
            )
            expanded.extend(["Linear", "Conv1d", "Conv2d", "Conv3d"])
        elif target == ALL_QUANTIZABLE_TARGET:
            warnings.warn(
                f"{ALL_QUANTIZABLE_TARGET} is deprecated. "
                "Use explicit targets: ['Linear', 'Conv2d', 'Conv3d']",
                DeprecationWarning,
                stacklevel=3,
            )
            expanded.extend(["Linear", "Conv2d", "Conv3d"])
        elif target == ALL_TARGET:
            raise ValueError(
                f"{ALL_TARGET} is no longer supported. "
                "Use explicit layer types or patterns instead."
            )
        else:
            expanded.append(target)

    return expanded
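
To see the expansion and the deprecation warning together, the logic above can be approximated with a table-driven sketch. The names expand_targets_sketch and DEPRECATED_EXPANSIONS are hypothetical, and the string values of the three constants are assumptions taken from the docstring, since their definitions are not shown in this section.

```python
import warnings

# Assumed values: the docstring names these constants, but their definitions
# are not shown in this section.
ALL_PRUNABLE_TARGET = "__ALL_PRUNABLE__"
ALL_QUANTIZABLE_TARGET = "__ALL_QUANTIZABLE__"
ALL_TARGET = "__ALL__"

# Deprecated constant -> explicit class names, as listed in the source above.
DEPRECATED_EXPANSIONS = {
    ALL_PRUNABLE_TARGET: ["Linear", "Conv1d", "Conv2d", "Conv3d"],
    ALL_QUANTIZABLE_TARGET: ["Linear", "Conv2d", "Conv3d"],
}


def expand_targets_sketch(targets):
    """Table-driven sketch of the expansion logic; not the library function."""
    if isinstance(targets, str):
        targets = [targets]
    expanded = []
    for target in targets:
        if target == ALL_TARGET:
            raise ValueError(f"{ALL_TARGET} is no longer supported.")
        if target in DEPRECATED_EXPANSIONS:
            names = DEPRECATED_EXPANSIONS[target]
            warnings.warn(
                f"{target} is deprecated. Use explicit targets: {names}",
                DeprecationWarning,
                stacklevel=2,
            )
            expanded.extend(names)
        else:
            expanded.append(target)  # ordinary targets pass through unchanged
    return expanded


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # DeprecationWarning is often hidden by default
    result = expand_targets_sketch(["re:.*q_proj$", ALL_QUANTIZABLE_TARGET])

print(result)                                 # ['re:.*q_proj$', 'Linear', 'Conv2d', 'Conv3d']
print([w.category.__name__ for w in caught])  # ['DeprecationWarning']
```

Input order is preserved: each deprecated constant is replaced in place by its class names, and regex or class-name targets are left untouched.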

get_no_split_params

get_no_split_params(
    model: PreTrainedModel,
) -> str | list[str]

Get list of module classes that shouldn't be split when sharding. For Hugging Face Transformer models, this is the decoder layer type. For models that declare no such modules, this returns the ALL_TARGET constant (a single string) instead of a list.

Returns:

str | list[str] –

List of class names that shouldn't be split, or the ALL_TARGET string when the model declares none

Source code in src/llmcompressor/utils/pytorch/module.py

def get_no_split_params(model: PreTrainedModel) -> str | list[str]:
    """
    Get list of module classes that shouldn't be split when sharding. For
    Hugging Face Transformer models, this is the decoder layer type. For other
    types of models, this just returns all module names.

    :return: list of class names that shouldn't be split
    """
    try:
        # Transformers < v5 support
        no_split_modules = model._get_no_split_modules("auto")
    except AttributeError:
        # Transformers v5 support
        no_split_modules = model._no_split_modules
    if len(no_split_modules) <= 0:
        return ALL_TARGET

    return no_split_modules
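
The try/except fallback above can be exercised with plain Python objects. This is a sketch under stated assumptions: OldStyleModel, NewStyleModel, PlainModel, and no_split_sketch are hypothetical, the "LlamaDecoderLayer" entry is only an illustrative class name, and the value of ALL_TARGET is assumed from the docstrings in this module.

```python
ALL_TARGET = "__ALL__"  # assumed value of the constant used in the source


class OldStyleModel:
    """Hypothetical model exposing the private method used before v5."""

    def _get_no_split_modules(self, device_map):
        return ["LlamaDecoderLayer"]


class NewStyleModel:
    """Hypothetical model exposing only the attribute."""

    _no_split_modules = ["LlamaDecoderLayer"]


class PlainModel:
    """Hypothetical model that declares no unsplittable modules."""

    _no_split_modules = []


def no_split_sketch(model):
    """Same control flow as the source above, on duck-typed models."""
    try:
        modules = model._get_no_split_modules("auto")
    except AttributeError:
        # The method is missing, so fall back to the attribute.
        modules = model._no_split_modules
    return modules if len(modules) > 0 else ALL_TARGET


print(no_split_sketch(OldStyleModel()))  # ['LlamaDecoderLayer']
print(no_split_sketch(NewStyleModel()))  # ['LlamaDecoderLayer']
print(no_split_sketch(PlainModel()))     # __ALL__
```

The last call shows why the return type is str | list[str]: the empty case yields a bare string, not a one-element list.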

infer_sequential_targets

infer_sequential_targets(
    model: Module,
    sequential_targets: str | list[str] | None = None,
) -> str | list[str]

Infer or validate sequential targets for layer-wise processing.

When sequential_targets is None, automatically infers targets using get_no_split_params(). When provided as a string, wraps it in a list. Otherwise, returns the provided list as-is.

Parameters:

model (Module) –

The model to infer targets from
sequential_targets (str | list[str] | None, default: None ) –

Optional sequential targets to use. If None, targets are inferred from the model. If a string, it's wrapped in a list.

Returns:

str | list[str] –

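Sequential target class names or patterns. This is a list when sequential_targets is provided; when it is None, the value is whatever get_no_split_params() returns, which may be a list or the ALL_TARGET string.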

Source code in src/llmcompressor/utils/pytorch/module.py

def infer_sequential_targets(
    model: Module, sequential_targets: str | list[str] | None = None
) -> str | list[str]:
    """
    Infer or validate sequential targets for layer-wise processing.

    When sequential_targets is None, automatically infers targets using
    get_no_split_params(). When provided as a string, wraps it in a list.
    Otherwise, returns the provided list as-is.

    :param model: The model to infer targets from
    :param sequential_targets: Optional sequential targets to use. If None,
        targets are inferred from the model. If a string, it's wrapped in a list.
    :return: List of sequential target class names or patterns
    """
    match sequential_targets:
        case None:
            return get_no_split_params(model)
        case str():
            return [sequential_targets]
        case _:
            return sequential_targets

qat_active

qat_active(module: Module) -> bool

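Determines if any layers in the model have quantization enabled. Each submodule is checked: the function returns True as soon as it finds a torch.quantization.FakeQuantize instance or a module for which is_module_quantized() is true.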

Parameters:

module (Module) –

PyTorch model to check for quantization

Returns:

bool –

True if quantization is active anywhere in the model, False otherwise

Source code in src/llmcompressor/utils/pytorch/module.py

def qat_active(module: Module) -> bool:
    """
    Determines if any layers in the model have quantization enabled by checking for
    weight_fake_quant attributes

    :param module: PyTorch model to check for quantization
    :return: True if quantization is active anywhere in the model, False otherwise
    """
    for _, layer in module.named_modules():
        if isinstance(layer, torch.quantization.FakeQuantize):
            return True
        if is_module_quantized(layer):
            return True

    return False