llmcompressor.modifiers.utils.pytorch_helpers
apply_pad_mask_to_batch(batch)
Apply the attention mask to the input IDs of a batch. This zeroes out padding tokens so that they do not contribute to the Hessian calculation in the GPTQ and SparseGPT algorithms.
Assumes that attention_mask contains only zeros and ones.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch | Dict[str, Tensor] | batch to apply the padding mask to, if a mask is present | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, Tensor] | batch with padding zeroed out in the input_ids |
Source code in llmcompressor/modifiers/utils/pytorch_helpers.py
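
The masking behaviour described for apply_pad_mask_to_batch can be illustrated with a minimal sketch. The function name mask_padding_sketch and the token values are hypothetical, and the body is an assumption based on the description above (multiply input_ids elementwise by a 0/1 attention_mask), not the library's source:

```python
from typing import Dict

import torch


def mask_padding_sketch(batch: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """Hypothetical stand-in for apply_pad_mask_to_batch, not the library source."""
    # If no attention mask is present, leave the batch untouched.
    if "attention_mask" not in batch:
        return batch
    # Shallow copy so the caller's dict is not modified (a choice made for this sketch).
    masked = dict(batch)
    # With a mask of zeros and ones, elementwise multiplication keeps real
    # tokens and sets padding positions to 0.
    masked["input_ids"] = batch["input_ids"] * batch["attention_mask"]
    return masked


batch = {
    "input_ids": torch.tensor([[101, 7592, 102, 9, 9]]),   # last two are padding
    "attention_mask": torch.tensor([[1, 1, 1, 0, 0]]),
}
masked = mask_padding_sketch(batch)
print(masked["input_ids"].tolist())  # [[101, 7592, 102, 0, 0]]
```

The multiplication only works as a mask because attention_mask is assumed to contain nothing but zeros and ones; any other value would rescale the token IDs instead of masking them.
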
is_moe_model(model)
Check whether the model is a mixture-of-experts (MoE) model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | the model to check | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the model is a mixture of experts model |
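
This page does not say how is_moe_model decides that a model is a mixture of experts model. As a rough illustration only, one possible heuristic is to look for "MoE" in the class names of the model's submodules. The names looks_like_moe and ToyMoEBlock are hypothetical, and the criterion itself is an assumption, not the library's documented behaviour:

```python
import torch.nn as nn


class ToyMoEBlock(nn.Module):
    """Hypothetical toy block whose class name marks it as mixture of experts."""

    def __init__(self) -> None:
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(4, 4) for _ in range(2)])


def looks_like_moe(model: nn.Module) -> bool:
    """Assumed heuristic: any submodule class name contains 'moe' (case-insensitive)."""
    return any("moe" in type(module).__name__.lower() for module in model.modules())


dense = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
sparse = nn.Sequential(nn.Linear(4, 4), ToyMoEBlock())
print(looks_like_moe(dense), looks_like_moe(sparse))  # False True
```

A name-based check like this is cheap but depends entirely on naming conventions, so it can miss architectures that do not put "MoE" in their class names.
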