llmcompressor.utils.pytorch
Modules:
-
module–Utility / helper functions
Functions:
-
build_parameterized_layers–Build ModelParameterizedLayer objects for modules matching the given targets.
-
expand_special_targets–Expand special target constants to explicit class names with backward compatibility.
-
get_no_split_params–Get list of module classes that shouldn't be split when sharding.
-
infer_sequential_targets–Infer or validate sequential targets for layer-wise processing.
-
qat_active–Determines if any layers in the model have quantization enabled by checking for weight_fake_quant attributes.
build_parameterized_layers
build_parameterized_layers(
model: Module,
targets: Union[str, List[str]],
param_name: str = "weight",
) -> Dict[str, ModelParameterizedLayer]
Build ModelParameterizedLayer objects for modules matching the given targets.
This function replaces get_layers_params() by using compressed-tensors' match_named_modules() to find matching modules and their parameters, then constructing ModelParameterizedLayer objects.
Parameters:
-
model(Module) –The model to search for matching modules
-
targets(Union[str, List[str]]) –Target patterns to match (supports class names, regex with "re:", and special constants for backward compatibility)
-
param_name(str, default:'weight') –Name of the parameter to extract from each layer (default: "weight")
Returns:
-
Dict[str, ModelParameterizedLayer]–Dictionary mapping layer names to ModelParameterizedLayer objects
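The target matching described above can be illustrated with a small sketch that uses only the standard library. It is a simplified approximation, not the library's implementation: the Linear and LayerNorm classes, the matches() helper, and build_layers_sketch() are hypothetical stand-ins, and the sketch returns the raw parameter rather than a ModelParameterizedLayer object. It assumes that plain targets are compared against the module name or class name and that "re:" targets are regular expressions applied to the module name.

```python
import re
from typing import Dict, List, Tuple, Union


class Linear:
    """Hypothetical stand-in for a layer class such as torch.nn.Linear."""

    def __init__(self) -> None:
        self.weight = [0.0]


class LayerNorm:
    """Hypothetical stand-in for a layer with a different class name."""

    def __init__(self) -> None:
        self.weight = [1.0]


def matches(name: str, module: object, target: str) -> bool:
    """Return True if a target selects this module (simplified rule)."""
    if target.startswith("re:"):
        # Assumption: regex targets are tested against the dotted module name.
        return re.match(target[3:], name) is not None
    # Assumption: plain targets match the module name or its class name.
    return target in (name, type(module).__name__)


def build_layers_sketch(
    named_modules: List[Tuple[str, object]],
    targets: Union[str, List[str]],
    param_name: str = "weight",
) -> Dict[str, object]:
    """Map each matching layer name to the requested parameter."""
    if isinstance(targets, str):
        targets = [targets]
    selected: Dict[str, object] = {}
    for name, module in named_modules:
        if any(matches(name, module, t) for t in targets):
            selected[name] = getattr(module, param_name)
    return selected


modules = [
    ("layers.0.q_proj", Linear()),
    ("layers.0.norm", LayerNorm()),
    ("lm_head", Linear()),
]
by_class = build_layers_sketch(modules, "Linear")
by_regex = build_layers_sketch(modules, ["re:.*norm$"])
print(sorted(by_class))
print(sorted(by_regex))
```

Here the class-name target selects both Linear layers, while the regex target selects only the layer whose name ends in "norm".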
Source code in src/llmcompressor/utils/pytorch/module.py
expand_special_targets
Expand special target constants to explicit class names with backward compatibility.
Special constants like ALL_PRUNABLE and ALL_QUANTIZABLE are deprecated in favor of explicit class name lists. This function provides backward compatibility by expanding these constants while issuing deprecation warnings.
Parameters:
-
targets(Union[str, List[str]]) –Target strings which may include special constants
Returns:
-
List[str]–List of expanded target strings
Raises:
-
ValueError–If the ALL constant is used (no longer supported)
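The expansion behavior described above might look roughly like the following sketch. The token spellings ("__ALL__", "__ALL_PRUNABLE__", "__ALL_QUANTIZABLE__") and the class lists they expand to are assumptions chosen for illustration, and expand_targets_sketch() is a hypothetical name. Only the overall shape (expand, warn on deprecated constants, raise ValueError on the ALL constant) comes from the description.

```python
import warnings
from typing import Dict, List, Union

# Assumed token spellings and expansions, for illustration only.
ALL_TOKEN = "__ALL__"
EXPANSIONS: Dict[str, List[str]] = {
    "__ALL_PRUNABLE__": ["Linear", "Conv2d"],
    "__ALL_QUANTIZABLE__": ["Linear", "Conv2d"],
}


def expand_targets_sketch(targets: Union[str, List[str]]) -> List[str]:
    """Expand deprecated constants into explicit class names."""
    if isinstance(targets, str):
        targets = [targets]
    expanded: List[str] = []
    for target in targets:
        if target == ALL_TOKEN:
            # The ALL constant is rejected outright.
            raise ValueError(f"{ALL_TOKEN} is no longer supported")
        if target in EXPANSIONS:
            # Deprecated constants still work but emit a warning.
            warnings.warn(
                f"{target} is deprecated; list class names explicitly",
                DeprecationWarning,
                stacklevel=2,
            )
            expanded.extend(EXPANSIONS[target])
        else:
            # Ordinary targets pass through unchanged.
            expanded.append(target)
    return expanded


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = expand_targets_sketch(["__ALL_PRUNABLE__", "re:.*lm_head"])

print(result)
print(len(caught))
```

Ordinary targets such as class names and "re:" patterns pass through untouched, so callers can mix old constants and explicit targets while migrating.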
Source code in src/llmcompressor/utils/pytorch/module.py
get_no_split_params
Get list of module classes that shouldn't be split when sharding. For Hugging Face Transformer models, this is the decoder layer type. For other types of models, this just returns all module names.
Returns:
-
Union[str, List[str]]–List of class names that shouldn't be split
Source code in src/llmcompressor/utils/pytorch/module.py
infer_sequential_targets
infer_sequential_targets(
model: Module,
sequential_targets: Union[str, List[str], None] = None,
) -> Union[str, List[str]]
Infer or validate sequential targets for layer-wise processing.
When sequential_targets is None, automatically infers targets using get_no_split_params(). When provided as a string, wraps it in a list. Otherwise, returns the provided list as-is.
Parameters:
-
model(Module) –The model to infer targets from
-
sequential_targets(Union[str, List[str], None], default:None) –Optional sequential targets to use. If None, targets are inferred from the model. If a string, it's wrapped in a list.
Returns:
-
Union[str, List[str]]–List of sequential target class names or patterns
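The three cases described above (None, a string, a list) can be sketched as follows. This is a minimal illustration under stated assumptions: infer_targets_sketch() and fake_get_no_split_params() are hypothetical names, the fallback simply returns a fixed decoder layer class name, and the model argument is an arbitrary placeholder object.

```python
from typing import Callable, List, Optional, Union


def fake_get_no_split_params(model: object) -> List[str]:
    """Hypothetical stand-in for get_no_split_params()."""
    return ["LlamaDecoderLayer"]


def infer_targets_sketch(
    model: object,
    sequential_targets: Optional[Union[str, List[str]]] = None,
    fallback: Callable[[object], List[str]] = fake_get_no_split_params,
) -> Union[str, List[str]]:
    """Normalize sequential targets following the three documented cases."""
    if sequential_targets is None:
        # Nothing supplied: infer the targets from the model.
        return fallback(model)
    if isinstance(sequential_targets, str):
        # A single string is wrapped in a list.
        return [sequential_targets]
    # A list is returned unchanged.
    return sequential_targets


model = object()  # placeholder; the sketch never inspects it
explicit = ["DecoderBlock", "re:.*mlp"]
print(infer_targets_sketch(model))
print(infer_targets_sketch(model, "DecoderBlock"))
print(infer_targets_sketch(model, explicit) is explicit)
```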
Source code in src/llmcompressor/utils/pytorch/module.py
qat_active
Determines if any layers in the model have quantization enabled by checking for weight_fake_quant attributes.
Parameters:
-
module(Module) –PyTorch model to check for quantization
Returns:
-
bool–True if quantization is active anywhere in the model, False otherwise
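The attribute check described above can be approximated without PyTorch. In this sketch FakeModule is a hypothetical stand-in that mimics only the modules() traversal of a PyTorch module, and qat_active_sketch() is an illustrative name; the real function may apply additional checks.

```python
from typing import Iterator


class FakeModule:
    """Hypothetical stand-in that mimics a module tree with modules()."""

    def __init__(self, *children: "FakeModule") -> None:
        self.children = list(children)

    def modules(self) -> Iterator["FakeModule"]:
        # Yield this module first, then every descendant, depth first.
        yield self
        for child in self.children:
            yield from child.modules()


def qat_active_sketch(module: FakeModule) -> bool:
    """Return True if any submodule carries a weight_fake_quant attribute."""
    return any(hasattr(m, "weight_fake_quant") for m in module.modules())


plain = FakeModule(FakeModule(), FakeModule())

quantized_leaf = FakeModule()
quantized_leaf.weight_fake_quant = object()  # marks the layer as quantized
quantized = FakeModule(FakeModule(), FakeModule(quantized_leaf))

print(qat_active_sketch(plain))
print(qat_active_sketch(quantized))
```

A single marked layer anywhere in the tree is enough for the check to return True, which matches the "anywhere in the model" wording of the return value.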