llmcompressor.entrypoints.model_free
Functions:
- model_free_ptq – Quantize a model without the need for a model definition. This function operates on a model stub or folder containing weights saved in safetensors files.
model_free_ptq
model_free_ptq(
    model_stub: str | PathLike,
    save_directory: str | PathLike,
    scheme: QuantizationScheme | str,
    ignore: Optional[list[str]] = None,
    max_workers: int = 1,
    device: Optional[device | str] = None,
)
Quantize a model without the need for a model definition. This function operates on a model stub or a folder containing weights saved in safetensors files.
Parameters:
- model_stub (str | PathLike) – Hugging Face Hub model stub or path to local weights files
- scheme (QuantizationScheme | str) – weight quantization scheme or preset scheme name
- ignore (Optional[list[str]], default: None) – modules to ignore. Modules ending with "norm" are automatically ignored
- max_workers (int, default: 1) – number of worker threads to process files with
- device (Optional[device | str], default: None) – GPU device to accelerate quantization with
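
The signature above might be exercised roughly as follows. This is a minimal sketch, not an official example: the model stub, output directory, `"lm_head"` ignore entry, worker count, and device string are placeholder values, and the preset name `"FP8_DYNAMIC"` is an assumption that should be checked against the presets available in the installed version. The import path is taken from the module name at the top of this page.

```python
from pathlib import Path

# Keyword arguments mirroring the documented signature.
# Every value here is an illustrative placeholder, not a recommendation.
PTQ_KWARGS = {
    "model_stub": "org/model-name",                  # hypothetical Hub stub or local folder
    "save_directory": Path("model-name-quantized"),  # hypothetical output folder
    "scheme": "FP8_DYNAMIC",                         # assumed preset name; verify before use
    "ignore": ["lm_head"],                           # example module to leave unquantized
    "max_workers": 4,                                # worker threads used to process files
    "device": "cuda:0",                              # GPU used to accelerate quantization
}


def main() -> None:
    # Imported inside the function so the settings above can be inspected
    # even on a machine where llmcompressor is not installed.
    from llmcompressor.entrypoints.model_free import model_free_ptq

    model_free_ptq(**PTQ_KWARGS)
```

Calling `main()` runs the quantization with these settings. Keeping the arguments in a dictionary makes it easy to log or adjust them before the call.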