llmcompressor.utils.dev
Functions:
- dispatch_for_generation – Dispatch a model for autoregressive generation.
- patch_transformers_logger_level – Context manager under which the transformers logger's level is modified.
- skip_weights_download – Context manager under which models are initialized without having to download the model weight files.
dispatch_for_generation
Dispatch a model for autoregressive generation. Modules are dispatched evenly across available devices and kept onloaded if possible.
Parameters:
- model – model to dispatch
- hint_batch_size – reserve memory for the batch size of inputs
- hint_batch_seq_len – reserve memory for the sequence length of inputs
- hint_model_dtype – reserve memory for the model's dtype. Inferred from the model if none is provided
- hint_extra_memory – extra memory reserved for model serving
- no_split_modules – names of module classes which should not be split across multiple devices
Returns:
- PreTrainedModel – dispatched model
Source code in src/llmcompressor/utils/dev.py
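To make "dispatched evenly across available devices" concrete, here is a toy sketch in plain Python. It is not the library's algorithm: the function `balanced_device_map`, the module names, and the byte sizes are all hypothetical. Each entry is treated as a unit that cannot be split, in the spirit of no_split_modules.

```python
from typing import Dict, List


def balanced_device_map(module_sizes: Dict[str, int], devices: List[str]) -> Dict[str, str]:
    """Toy illustration: assign whole modules to devices in order so that
    each device holds roughly an equal share of the total size."""
    if not devices:
        raise ValueError("at least one device is required")

    share = sum(module_sizes.values()) / len(devices)
    device_map: Dict[str, str] = {}
    index, load = 0, 0

    for name, size in module_sizes.items():
        # Advance to the next device once the current one has reached its share.
        # The last device absorbs whatever remains.
        if load >= share and index < len(devices) - 1:
            index += 1
            load = 0
        device_map[name] = devices[index]
        load += size

    return device_map


# Hypothetical layer sizes (in arbitrary units) and device names.
sizes = {"layers.0": 4, "layers.1": 4, "layers.2": 4, "layers.3": 4}
print(balanced_device_map(sizes, ["cuda:0", "cuda:1"]))
```

A real dispatcher also has to account for the memory hinted by the parameters above (batch size, sequence length, dtype, extra memory), which this sketch leaves out.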
patch_transformers_logger_level
Context manager under which the transformers logger's level is modified.
This can be used with skip_weights_download to squelch warnings related to
missing parameters in the checkpoint.
Parameters:
- level (int, default: ERROR) – new logging level for the transformers logger. Logs whose level is below this level will not be logged
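The pattern of temporarily changing a logger's level can be sketched with the standard library alone. The helper below, `patch_logger_level`, is a hypothetical stand-in and not the library's implementation. It assumes the logger of interest is the one returned by `logging.getLogger("transformers")`.

```python
import contextlib
import logging


@contextlib.contextmanager
def patch_logger_level(name: str = "transformers", level: int = logging.ERROR):
    """Temporarily set the named logger's level, restoring it on exit."""
    logger = logging.getLogger(name)
    original_level = logger.level
    logger.setLevel(level)
    try:
        yield
    finally:
        # Restore the previous level even if the body raised.
        logger.setLevel(original_level)


with patch_logger_level():
    # Warnings from the "transformers" logger are suppressed in this block.
    logging.getLogger("transformers").warning("this message is not emitted")
```

Restoring the level in a `finally` clause ensures that an exception inside the block does not leave the logger silenced.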
skip_weights_download
Context manager under which models are initialized without having to download
the model weight files. This differs from init_empty_weights in that weights are
allocated on their assigned devices with random values, as opposed to being placed
on the meta device.
Parameters:
- model_class (Type[PreTrainedModel], default: AutoModelForCausalLM) – class to patch
skip_weights_initialize
Very similar to transformers.modeling_utils.no_init_weights, except that torch.Tensor
initialization functions are also patched to account for tensors which are
not initialized on the meta device.
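The idea of patching initialization functions so that they do nothing can be illustrated without torch. In the sketch below, `patch_as_noop` and the `fake_init` namespace are hypothetical stand-ins for the real initialization functions. It shows only the patch-and-restore pattern, not the library's code.

```python
import contextlib
from types import SimpleNamespace
from typing import Iterable


@contextlib.contextmanager
def patch_as_noop(namespace, names: Iterable[str]):
    """Temporarily replace the named functions on `namespace` with a
    function that returns its first argument unchanged."""
    names = list(names)
    originals = {name: getattr(namespace, name) for name in names}

    def skip(tensor, *args, **kwargs):
        return tensor

    for name in names:
        setattr(namespace, name, skip)
    try:
        yield
    finally:
        # Put the original functions back, even on error.
        for name, fn in originals.items():
            setattr(namespace, name, fn)


# Hypothetical stand-in for a module of in-place initialization functions.
calls = []
fake_init = SimpleNamespace(normal_=lambda t: calls.append(t) or t)

with patch_as_noop(fake_init, ["normal_"]):
    fake_init.normal_("weight")  # skipped: the original function is not called

fake_init.normal_("weight")  # original behavior is restored here
```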