llmcompressor.core
Provides the core compression framework for LLM Compressor.
The core API manages compression sessions, tracks state changes, handles events during compression, and provides lifecycle hooks for the compression process.
Modules:
-
events–LLM Compressor Core Events Package
-
lifecycle–Module for managing the compression lifecycle in the LLM Compressor.
-
model_layer–Model layer utility classes for LLM compression workflows.
-
session–Compression session management for LLM compression workflows.
-
session_functions–Session management functions for LLM compression workflows.
-
state–Module for managing LLM Compressor state.
Classes:
-
CompressionLifecycle–A class for managing the lifecycle of compression events in the LLM Compressor.
-
CompressionSession–A session for compression that holds the lifecycle
-
Data–A dataclass to hold different data sets for training, validation,
-
Event–A class for defining an event that can be triggered during sparsification.
-
EventType–An Enum for defining the different types of events that can be triggered
-
Hardware–A dataclass to hold information about the hardware being used.
-
LifecycleCallbacks–A class for invoking lifecycle events for the active session
-
ModelParameterizedLayer–A dataclass for holding a parameter and its layer
-
ModifiedState–A dataclass to represent a modified model, optimizer, and loss.
-
State–State class holds information about the current compression state.
Functions:
-
active_session–Return the active session for sparsification.
-
create_session–Context manager to create and yield a new session.
-
reset_session–Reset the currently active session to its initial state
CompressionLifecycle
dataclass
CompressionLifecycle(
state: State = State(),
recipe: Recipe = Recipe(),
initialized_: bool = False,
finalized: bool = False,
_last_event_type: EventType
| None = EventType.BATCH_END,
_event_order: list[EventType] = (
lambda: [
EventType.BATCH_START,
EventType.LOSS_CALCULATED,
EventType.OPTIM_PRE_STEP,
EventType.OPTIM_POST_STEP,
EventType.BATCH_END,
]
)(),
global_step: int = 0,
)
A class for managing the lifecycle of compression events in the LLM Compressor.
Parameters:
-
state(State, default:State()) –The current state of the compression process
-
recipe(Recipe, default:Recipe()) –The compression recipe
-
modifiers(list[StageModifiers]) –The list of stage modifiers
Methods:
-
event–Handle a compression event.
-
finalize–Finalize the compression lifecycle.
-
initialize–Initialize the compression lifecycle.
-
reset–Reset the compression lifecycle, finalizing any active modifiers
event
Handle a compression event.
Parameters:
-
event_type(EventType) –The type of event to handle
-
kwargs–Additional arguments to pass to the event handlers
Returns:
-
List[Any]–List of data returned from handling the event by modifiers
Raises:
-
ValueError–If called before initialization, after finalization, or for an invalid event type
Source code in src/llmcompressor/core/lifecycle.py
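The default `_event_order` in the signature above suggests that batch-loop events are expected in a fixed sequence. The following is a minimal sketch of one plausible ordering check, not the library's implementation: `EventType` here is a local stand-in enum, and `OrderChecker` and both of its rules are assumptions made for illustration.

```python
from enum import Enum, auto


class EventType(Enum):
    # Local stand-in for llmcompressor.core.EventType (batch-loop members only).
    BATCH_START = auto()
    LOSS_CALCULATED = auto()
    OPTIM_PRE_STEP = auto()
    OPTIM_POST_STEP = auto()
    BATCH_END = auto()


# Same order as the documented default of _event_order.
EVENT_ORDER = [
    EventType.BATCH_START,
    EventType.LOSS_CALCULATED,
    EventType.OPTIM_PRE_STEP,
    EventType.OPTIM_POST_STEP,
    EventType.BATCH_END,
]


class OrderChecker:
    """Hypothetical helper that tracks the last batch-loop event seen."""

    def __init__(self):
        # Mirrors the documented default of _last_event_type.
        self.last = EventType.BATCH_END

    def check(self, event_type):
        if event_type is EventType.BATCH_START:
            # Assumed rule: a new batch may start unless one has just started.
            valid = self.last is not EventType.BATCH_START
        else:
            # Assumed rule: within a batch, events may not move backwards.
            valid = EVENT_ORDER.index(self.last) <= EVENT_ORDER.index(event_type)
        if not valid:
            raise ValueError(f"{event_type.name} is not valid after {self.last.name}")
        self.last = event_type


checker = OrderChecker()
for ev in EVENT_ORDER:
    checker.check(ev)  # one full batch in the documented order passes
```

Under these assumed rules, skipping an optional event such as `LOSS_CALCULATED` is allowed, while stepping back to an earlier event within the same batch raises `ValueError`.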
finalize
Finalize the compression lifecycle.
Parameters:
-
kwargs–Additional arguments to update the state with
Returns:
-
List[Any]–List of data returned from finalizing modifiers
Raises:
-
ValueError–If called before initialization or more than once
Source code in src/llmcompressor/core/lifecycle.py
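The `Raises` entries for `event` and `finalize` describe a small state machine: events are only valid between initialization and finalization, and finalization may happen once. A minimal sketch of those guards, using a hypothetical `LifecycleSketch` class that only tracks the two documented flags (`initialized_` and `finalized`) and does not run any modifiers:

```python
class LifecycleSketch:
    """Hypothetical stand-in for the documented CompressionLifecycle guards."""

    def __init__(self):
        self.initialized_ = False
        self.finalized = False

    def initialize(self):
        self.initialized_ = True
        return []  # stand-in for data returned by modifier initialization

    def event(self, event_type):
        if not self.initialized_:
            raise ValueError("event() called before initialize()")
        if self.finalized:
            raise ValueError("event() called after finalize()")
        return [event_type]  # stand-in for data returned by modifiers

    def finalize(self):
        if not self.initialized_:
            raise ValueError("finalize() called before initialize()")
        if self.finalized:
            raise ValueError("finalize() called more than once")
        self.finalized = True
        return []


def raises_value_error(fn, *args):
    """Return True if calling fn(*args) raises ValueError."""
    try:
        fn(*args)
    except ValueError:
        return True
    return False


lifecycle = LifecycleSketch()
early = raises_value_error(lifecycle.event, "batch_start")  # before initialize
lifecycle.initialize()
handled = lifecycle.event("batch_start")                    # valid window
lifecycle.finalize()
late = raises_value_error(lifecycle.event, "batch_start")   # after finalize
twice = raises_value_error(lifecycle.finalize)              # second finalize
```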
initialize
initialize(
recipe: RecipeInput | None = None,
recipe_stage: RecipeStageInput | None = None,
recipe_args: RecipeArgsInput | None = None,
**kwargs,
) -> list[Any]
Initialize the compression lifecycle.
Parameters:
-
kwargs–Additional arguments to update the state with
Returns:
-
List[Any]–List of data returned from initialization of modifiers
Source code in src/llmcompressor/core/lifecycle.py
reset
Reset the compression lifecycle, finalizing any active modifiers and resetting all attributes.
Source code in src/llmcompressor/core/lifecycle.py
CompressionSession
A session for compression that holds the lifecycle and state for the current compression session
Methods:
-
event–Invoke an event for current CompressionSession.
-
finalize–Finalize the session for compression. This will run the finalize method
-
get_serialized_recipe–Return the serialized string of the current compiled recipe.
-
initialize–Initialize the session for compression. This will run the initialize method
-
reset–Reset the session to its initial state
-
reset_stage–Reset the session to start a new stage; the recipe and model stay intact.
Attributes:
-
lifecycle(CompressionLifecycle) –Lifecycle is used to keep track of where we are in the compression
-
state(State) –State of the current compression session. State instance
Source code in src/llmcompressor/core/session.py
lifecycle
property
Lifecycle is used to keep track of where we are in the compression process and what modifiers are active. It also provides the ability to invoke events on the lifecycle.
Returns:
-
CompressionLifecycle–the lifecycle for the session
state
property
State of the current compression session. The State instance stores all information needed for compression, such as the recipe, model, optimizer, and data.
Returns:
-
State–the current state of the session
event
event(
event_type: EventType,
batch_data: Any | None = None,
loss: Any | None = None,
**kwargs,
) -> ModifiedState
Invoke an event for current CompressionSession.
Parameters:
-
event_type(EventType) –the event type to invoke
-
batch_data(Any | None, default:None) –the batch data to use for the event
-
loss(Any | None, default:None) –the loss to use for the event if any
-
kwargs–additional kwargs to pass to the lifecycle's event method
Returns:
-
ModifiedState–the modified state of the session after invoking the event
Source code in src/llmcompressor/core/session.py
finalize
Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle. This will also set the session's state to the finalized state.
Parameters:
-
kwargs–additional kwargs to pass to the lifecycle's finalize method
Returns:
-
ModifiedState–the modified state of the session after finalizing
Source code in src/llmcompressor/core/session.py
get_serialized_recipe
Returns:
-
str | None–serialized string of the current compiled recipe
Source code in src/llmcompressor/core/session.py
initialize
initialize(
recipe: str
| list[str]
| Recipe
| list[Recipe]
| None = None,
recipe_stage: str | list[str] | None = None,
recipe_args: dict[str, Any] | None = None,
model: Any | None = None,
teacher_model: Any | None = None,
optimizer: Any | None = None,
attach_optim_callbacks: bool = True,
train_data: Any | None = None,
val_data: Any | None = None,
test_data: Any | None = None,
calib_data: Any | None = None,
copy_data: bool = True,
start: float | None = None,
steps_per_epoch: int | None = None,
batches_per_step: int | None = None,
**kwargs,
) -> ModifiedState
Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle. This will also set the session's state to the initialized state.
Parameters:
-
recipe(str | list[str] | Recipe | list[Recipe] | None, default:None) –the recipe to use for the compression, can be a path to a recipe file, a raw recipe string, a recipe object, or a list of recipe objects.
-
recipe_stage(str | list[str] | None, default:None) –the stage to target for the compression
-
recipe_args(dict[str, Any] | None, default:None) –the args to use for overriding the recipe defaults
-
model(Any | None, default:None) –the model to compress
-
teacher_model(Any | None, default:None) –the teacher model to use for knowledge distillation
-
optimizer(Any | None, default:None) –the optimizer to use for the compression
-
attach_optim_callbacks(bool, default:True) –True to attach the optimizer callbacks to the compression lifecycle, False otherwise
-
train_data(Any | None, default:None) –the training data to use for the compression
-
val_data(Any | None, default:None) –the validation data to use for the compression
-
test_data(Any | None, default:None) –the testing data to use for the compression
-
calib_data(Any | None, default:None) –the calibration data to use for the compression
-
copy_data(bool, default:True) –True to copy the data, False otherwise
-
start(float | None, default:None) –the start epoch to use for the compression
-
steps_per_epoch(int | None, default:None) –the number of steps per epoch to use for the compression
-
batches_per_step(int | None, default:None) –the number of batches per step to use for compression
-
kwargs–additional kwargs to pass to the lifecycle's initialize method
Returns:
-
ModifiedState–the modified state of the session after initializing
Source code in src/llmcompressor/core/session.py
reset
reset_stage
Reset the session to start a new stage; the recipe and model stay intact.
Data
dataclass
Data(
train: Any | None = None,
val: Any | None = None,
test: Any | None = None,
calib: Any | None = None,
)
A dataclass to hold different data sets for training, validation, testing, and/or calibration. Each data set is a ModifiableData instance.
Parameters:
-
train(Any | None, default:None) –The training data set
-
val(Any | None, default:None) –The validation data set
-
test(Any | None, default:None) –The testing data set
-
calib(Any | None, default:None) –The calibration data set
Event
dataclass
Event(
type_: Optional[EventType] = None,
steps_per_epoch: Optional[int] = None,
batches_per_step: Optional[int] = None,
invocations_per_step: int = 1,
global_step: int = 0,
global_batch: int = 0,
)
A class for defining an event that can be triggered during sparsification.
Parameters:
-
type_(Optional[EventType], default:None) –The type of event.
-
steps_per_epoch(Optional[int], default:None) –The number of steps per epoch.
-
batches_per_step(Optional[int], default:None) –The number of batches per step where step is an optimizer step invocation. For most pathways, these are the same. See the invocations_per_step parameter for more details when they are not.
-
invocations_per_step(int, default:1) –The number of invocations of the step wrapper before optimizer.step was called. Generally can be left as 1 (default). For older amp pathways, this is the number of times the scaler wrapper was invoked before the wrapped optimizer step function was called to handle accumulation in fp16.
-
global_step(int, default:0) –The current global step.
-
global_batch(int, default:0) –The current global batch.
Methods:
-
new_instance–Creates a new instance of the event with the provided keyword arguments.
-
should_update–Determines if the event should trigger an update.
Attributes:
-
current_index(float) –Calculates the current index of the event.
-
epoch(int) –Calculates the current epoch.
-
epoch_based(bool) –Determines if the event is based on epochs.
-
epoch_batch(int) –Calculates the current batch within the current epoch.
-
epoch_full(float) –Calculates the current epoch with the fraction of the current step.
-
epoch_step(int) –Calculates the current step within the current epoch.
current_index
property
writable
Calculates the current index of the event.
Returns:
-
float–The current index of the event, which is either the global step or the epoch with the fraction of the current step.
Raises:
-
ValueError–if the event is not epoch based or if the steps per epoch are too many.
epoch
property
Calculates the current epoch.
Returns:
-
int–The current epoch.
Raises:
-
ValueError–if the event is not epoch based.
epoch_based
property
Determines if the event is based on epochs.
Returns:
-
bool–True if the event is based on epochs, False otherwise.
epoch_batch
property
Calculates the current batch within the current epoch.
Returns:
-
int–The current batch within the current epoch.
Raises:
-
ValueError–if the event is not epoch based.
epoch_full
property
Calculates the current epoch with the fraction of the current step.
Returns:
-
float–The current epoch with the fraction of the current step.
Raises:
-
ValueError–if the event is not epoch based.
epoch_step
property
Calculates the current step within the current epoch.
Returns:
-
int–The current step within the current epoch.
Raises:
-
ValueError–if the event is not epoch based.
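The epoch properties above are all derived from `global_step`, `global_batch`, `steps_per_epoch`, and `batches_per_step`. The sketch below shows one consistent way to compute them. `EpochEventSketch` is a hypothetical stand-in, and the formulas are assumptions inferred from the property descriptions rather than copied from the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EpochEventSketch:
    """Hypothetical stand-in for Event; the formulas below are assumptions."""

    steps_per_epoch: Optional[int] = None
    batches_per_step: Optional[int] = None
    global_step: int = 0
    global_batch: int = 0

    @property
    def epoch_based(self) -> bool:
        # Assumed: an event is epoch based once steps_per_epoch is known.
        return self.steps_per_epoch is not None

    def _require_epochs(self) -> None:
        if not self.epoch_based:
            raise ValueError("Event is not epoch based")

    @property
    def epoch(self) -> int:
        self._require_epochs()
        return self.global_step // self.steps_per_epoch

    @property
    def epoch_step(self) -> int:
        self._require_epochs()
        return self.global_step % self.steps_per_epoch

    @property
    def epoch_full(self) -> float:
        self._require_epochs()
        return self.global_step / self.steps_per_epoch

    @property
    def epoch_batch(self) -> int:
        self._require_epochs()
        # Treat a missing batches_per_step as one batch per step.
        batches_per_epoch = self.steps_per_epoch * (self.batches_per_step or 1)
        return self.global_batch % batches_per_epoch


ev = EpochEventSketch(
    steps_per_epoch=100, batches_per_step=2, global_step=250, global_batch=500
)
print(ev.epoch, ev.epoch_step, ev.epoch_full, ev.epoch_batch)  # prints: 2 50 2.5 100
```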
new_instance
Creates a new instance of the event with the provided keyword arguments.
Parameters:
-
kwargs–Keyword arguments to set in the new instance.
Returns:
-
Event–A new instance of the event with the provided kwargs.
Source code in src/llmcompressor/core/events/event.py
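As an illustration of the copy-with-overrides behaviour described for `new_instance`, the sketch below uses `dataclasses.replace` on a hypothetical `CopyableEventSketch` dataclass that carries only a few of the documented fields. Whether the library uses `replace` internally is not stated in this page; this is just one way to get the same effect.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class CopyableEventSketch:
    """Hypothetical stand-in carrying a few of Event's documented fields."""

    type_: Optional[str] = None
    steps_per_epoch: Optional[int] = None
    global_step: int = 0

    def new_instance(self, **kwargs) -> "CopyableEventSketch":
        # Copy every field, overriding only those passed as keyword arguments.
        return replace(self, **kwargs)


base = CopyableEventSketch(steps_per_epoch=100, global_step=7)
start = base.new_instance(type_="batch_start")
```

The original `base` object is left unchanged, so one template event can be reused to stamp out an instance per event type.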
should_update
Determines if the event should trigger an update.
Parameters:
-
start(Optional[float]) –The start index to check against, set to None to ignore start.
-
end(Optional[float]) –The end index to check against, set to None to ignore end.
-
update(Optional[float]) –The update interval, set to None or 0.0 to always update, otherwise must be greater than 0.0, defaults to None.
Returns:
-
bool–True if the event should trigger an update, False otherwise.
Source code in src/llmcompressor/core/events/event.py
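The `start`, `end`, and `update` parameters together define a window and an interval. A minimal sketch of that decision as a standalone function follows; `should_update_sketch`, its `current` argument (standing in for the event's `current_index`), the tolerance `tol`, and the multiple-of-interval rule are all assumptions made for illustration.

```python
from typing import Optional


def should_update_sketch(
    current: float,
    start: Optional[float],
    end: Optional[float],
    update: Optional[float] = None,
    tol: float = 1e-10,
) -> bool:
    """Hypothetical reading of Event.should_update for a given current index."""
    if start is not None and current < start:
        return False  # before the window opens
    if end is not None and current > end:
        return False  # after the window closes
    if update is None or update == 0.0:
        return True  # documented: None or 0.0 means always update
    if update < 0.0:
        raise ValueError("update must be greater than 0.0")
    # Assumption: an update fires when current is a multiple of the interval.
    return current % update < tol


on_interval = should_update_sketch(4.0, start=2.0, end=10.0, update=2.0)
off_interval = should_update_sketch(3.0, start=2.0, end=10.0, update=2.0)
```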
EventType
Bases: Enum
An Enum for defining the different types of events that can be triggered during model compression lifecycles. The purpose of each EventType is to trigger the corresponding modifier callback during training or post training pipelines.
Parameters:
-
INITIALIZE–Event type for initialization.
-
FINALIZE–Event type for finalization.
-
BATCH_START–Event type for the start of a batch.
-
LOSS_CALCULATED–Event type for when loss is calculated.
-
BATCH_END–Event type for the end of a batch.
-
CALIBRATION_EPOCH_START–Event type for the start of a calibration epoch.
-
SEQUENTIAL_EPOCH_END–Event type for the end of a layer calibration epoch, specifically used by src/llmcompressor/pipelines/sequential/pipeline.py
-
CALIBRATION_EPOCH_END–Event type for the end of a calibration epoch.
-
OPTIM_PRE_STEP–Event type for pre-optimization step.
-
OPTIM_POST_STEP–Event type for post-optimization step.
Hardware
dataclass
Hardware(
device: str | None = None,
devices: list[str] | None = None,
rank: int | None = None,
world_size: int | None = None,
local_rank: int | None = None,
local_world_size: int | None = None,
distributed: bool | None = None,
distributed_strategy: str | None = None,
)
A dataclass to hold information about the hardware being used.
Parameters:
-
device(str | None, default:None) –The current device being used for training
-
devices(list[str] | None, default:None) –List of all devices to be used for training
-
rank(int | None, default:None) –The rank of the current device
-
world_size(int | None, default:None) –The total number of devices being used
-
local_rank(int | None, default:None) –The local rank of the current device
-
local_world_size(int | None, default:None) –The total number of devices being used on the local machine
-
distributed(bool | None, default:None) –Whether or not distributed training is being used
-
distributed_strategy(str | None, default:None) –The distributed strategy being used
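As a rough sketch of how these fields might be filled in a distributed launch, the example below reads the `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` environment variables that launchers such as `torchrun` set. `HardwareSketch` is a local stand-in mirroring a subset of the documented fields, and `hardware_from_env` and the `cuda:<local_rank>` device naming are hypothetical choices, not part of the library.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class HardwareSketch:
    """Stand-in mirroring a subset of the documented Hardware fields."""

    device: Optional[str] = None
    rank: Optional[int] = None
    world_size: Optional[int] = None
    local_rank: Optional[int] = None
    distributed: Optional[bool] = None


def _env_int(env, name):
    """Return env[name] as an int, or None when the variable is unset."""
    value = env.get(name)
    return int(value) if value is not None else None


def hardware_from_env(env=None) -> HardwareSketch:
    """Hypothetical helper: read launcher variables, leaving gaps as None."""
    env = os.environ if env is None else env
    world_size = _env_int(env, "WORLD_SIZE")
    local_rank = _env_int(env, "LOCAL_RANK")
    return HardwareSketch(
        # Assumed device naming: one CUDA device per local rank.
        device=f"cuda:{local_rank}" if local_rank is not None else None,
        rank=_env_int(env, "RANK"),
        world_size=world_size,
        local_rank=local_rank,
        distributed=world_size is not None and world_size > 1,
    )


hw = hardware_from_env({"RANK": "3", "WORLD_SIZE": "8", "LOCAL_RANK": "1"})
```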
LifecycleCallbacks
A class for invoking lifecycle events for the active session
Methods:
-
batch_end–Invoke a batch end event for the active session
-
batch_start–Invoke a batch start event for the active session
-
calibration_epoch_end–Invoke an epoch end event for the active session during calibration. This event
-
calibration_epoch_start–Invoke an epoch start event for the active session during calibration. This event
-
event–Invoke an event for the active session
-
loss_calculated–Invoke a loss calculated event for the active session
-
optim_post_step–Invoke an optimizer post-step event for the active session
-
optim_pre_step–Invoke an optimizer pre-step event for the active session
-
sequential_epoch_end–Invoke a sequential epoch end event for the active session. This event should be
batch_end
classmethod
Invoke a batch end event for the active session
Parameters:
-
kwargs–additional kwargs to pass to the current session's event method
Returns:
-
ModifiedState–the modified state of the active session after invoking the event
Source code in src/llmcompressor/core/session_functions.py
batch_start
classmethod
Invoke a batch start event for the active session
Parameters:
-
batch_data(Optional[Any], default:None) –the batch data to use for the event
-
kwargs–additional kwargs to pass to the current session's event method
Returns:
-
ModifiedState–the modified state of the active session after invoking the event
Source code in src/llmcompressor/core/session_functions.py
calibration_epoch_end
classmethod
Invoke an epoch end event for the active session during calibration. This event should be called after the model has been calibrated for one epoch.
See src/llmcompressor/pipelines/basic/pipeline.py for a usage example.
Source code in src/llmcompressor/core/session_functions.py
calibration_epoch_start
classmethod
Invoke an epoch start event for the active session during calibration. This event should be called before calibration starts for one epoch.
See src/llmcompressor/pipelines/basic/pipeline.py for a usage example.
Source code in src/llmcompressor/core/session_functions.py
event
classmethod
Invoke an event for the active session
Parameters:
-
event_type(EventType) –the event type to invoke
-
kwargs–additional kwargs to pass to the current session's event method
Returns:
-
ModifiedState–the modified state of the active session after invoking the event
Source code in src/llmcompressor/core/session_functions.py
loss_calculated
classmethod
Invoke a loss calculated event for the active session
Parameters:
-
loss(Optional[Any], default:None) –the loss to use for the event
-
kwargs–additional kwargs to pass to the current session's event method
Returns:
-
ModifiedState–the modified state of the active session after invoking the event
Source code in src/llmcompressor/core/session_functions.py
optim_post_step
classmethod
Invoke an optimizer post-step event for the active session
Parameters:
-
kwargs–additional kwargs to pass to the current session's event method
Returns:
-
ModifiedState–the modified state of the active session after invoking the event
Source code in src/llmcompressor/core/session_functions.py
optim_pre_step
classmethod
Invoke an optimizer pre-step event for the active session
Parameters:
-
kwargs–additional kwargs to pass to the current session's event method
Returns:
-
ModifiedState–the modified state of the active session after invoking the event
Source code in src/llmcompressor/core/session_functions.py
sequential_epoch_end
classmethod
Invoke a sequential epoch end event for the active session. This event should be called after one sequential layer has been calibrated/trained for one epoch.
This is called after a sequential layer has been calibrated with one batch; see src/llmcompressor/pipelines/sequential/pipeline.py for a usage example.
Source code in src/llmcompressor/core/session_functions.py
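The batch-level callbacks above are meant to be called at fixed points in a training loop. The sketch below shows where each call would sit. So that the snippet runs without the library installed, it uses `RecordingCallbacks`, a hypothetical stand-in that exposes the same method names as `LifecycleCallbacks` but only records which callback fired. The `train` function and its placeholder loss are also illustrative.

```python
class RecordingCallbacks:
    """Hypothetical stand-in for LifecycleCallbacks that only records calls."""

    calls = []

    @classmethod
    def batch_start(cls, batch_data=None, **kwargs):
        cls.calls.append("batch_start")

    @classmethod
    def loss_calculated(cls, loss=None, **kwargs):
        cls.calls.append("loss_calculated")

    @classmethod
    def optim_pre_step(cls, **kwargs):
        cls.calls.append("optim_pre_step")

    @classmethod
    def optim_post_step(cls, **kwargs):
        cls.calls.append("optim_post_step")

    @classmethod
    def batch_end(cls, **kwargs):
        cls.calls.append("batch_end")


def train(batches, callbacks=RecordingCallbacks):
    """Toy loop showing where each lifecycle callback would be invoked."""
    for batch in batches:
        callbacks.batch_start(batch_data=batch)
        loss = sum(batch) / len(batch)  # placeholder for forward pass + loss
        callbacks.loss_calculated(loss=loss)
        callbacks.optim_pre_step()
        # optimizer.step() would run here in a real loop
        callbacks.optim_post_step()
        callbacks.batch_end()


train([[1.0, 2.0], [3.0, 4.0]])
```

The call order inside the loop matches the default `_event_order` documented for `CompressionLifecycle`.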
ModelParameterizedLayer
dataclass
A dataclass for holding a parameter and its layer
Parameters:
-
layer_name(str) –the name of the layer
-
layer(Any) –the layer object
-
param_name(str) –the name of the parameter
-
param(Any) –the parameter object
ModifiedState
dataclass
A dataclass to represent a modified model, optimizer, and loss.
Parameters:
-
model(Optional[Any]) –The modified model
-
optimizer(Optional[Any]) –The modified optimizer
-
loss(Optional[Any]) –The modified loss
-
modifier_data(Optional[List[Dict[str, Any]]]) –The modifier data used to modify the model, optimizer, and loss
Initialize the ModifiedState with the given parameters.
Parameters:
-
model(Any) –The modified model
-
optimizer(Any) –The modified optimizer
-
loss(Any) –The modified loss
-
modifier_data(List[Dict[str, Any]]) –The modifier data used to modify the model, optimizer, and loss
Source code in src/llmcompressor/core/state.py
State
dataclass
State(
model: Any = None,
teacher_model: Any = None,
optimizer: Any = None,
optim_wrapped: bool = None,
loss: Any = None,
batch_data: Any = None,
data: Data = Data(),
hardware: Hardware = Hardware(),
loss_masks: list[Tensor] | None = None,
current_batch_idx: int = -1,
sequential_prefetch: bool = False,
)
State class holds information about the current compression state.
Parameters:
-
model(Any, default:None) –The model being used for compression
-
teacher_model(Any, default:None) –The teacher model being used for compression
-
optimizer(Any, default:None) –The optimizer being used for training
-
optim_wrapped(bool, default:None) –Whether or not the optimizer has been wrapped
-
loss(Any, default:None) –The loss function being used for training
-
batch_data(Any, default:None) –The current batch of data being used for compression
-
data(Data, default:Data()) –The data sets being used for training, validation, testing, and/or calibration, wrapped in a Data instance
-
hardware(Hardware, default:Hardware()) –Hardware instance holding info about the target hardware being used
Methods:
-
update–Update the state with the given parameters.
Attributes:
-
compression_ready(bool) –Check if the model and optimizer are set for compression.
compression_ready
property
Check if the model and optimizer are set for compression.
Returns:
-
bool–True if model and optimizer are set, False otherwise
update
update(
model: Any = None,
teacher_model: Any = None,
optimizer: Any = None,
attach_optim_callbacks: bool = True,
train_data: Any = None,
val_data: Any = None,
test_data: Any = None,
calib_data: Any = None,
copy_data: bool = True,
start: float = None,
steps_per_epoch: int = None,
batches_per_step: int = None,
**kwargs,
) -> dict
Update the state with the given parameters.
Parameters:
-
model(Any, default:None) –The model to update the state with
-
teacher_model(Any, default:None) –The teacher model to update the state with
-
optimizer(Any, default:None) –The optimizer to update the state with
-
attach_optim_callbacks(bool, default:True) –Whether or not to attach optimizer callbacks
-
train_data(Any, default:None) –The training data to update the state with
-
val_data(Any, default:None) –The validation data to update the state with
-
test_data(Any, default:None) –The testing data to update the state with
-
calib_data(Any, default:None) –The calibration data to update the state with
-
copy_data(bool, default:True) –Whether or not to copy the data
-
start(float, default:None) –The start index to update the state with
-
steps_per_epoch(int, default:None) –The steps per epoch to update the state with
-
batches_per_step(int, default:None) –The batches per step to update the state with
-
kwargs–Additional keyword arguments to update the state with
Returns:
-
Dict–The updated state as a dictionary
Source code in src/llmcompressor/core/state.py
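To make the relationship between `update`, the nested `Data` container, and `compression_ready` concrete, here is a minimal sketch using hypothetical `DataSketch` and `StateSketch` stand-ins. The rule that arguments left as `None` keep the existing value is an assumption, and the sketch does not reproduce the dictionary that the documented `update` returns.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataSketch:
    """Stand-in mirroring the documented Data fields."""

    train: Any = None
    val: Any = None
    test: Any = None
    calib: Any = None


@dataclass
class StateSketch:
    """Hypothetical stand-in for State covering a few documented fields."""

    model: Any = None
    optimizer: Any = None
    # default_factory gives each state its own DataSketch instance.
    data: DataSketch = field(default_factory=DataSketch)

    @property
    def compression_ready(self) -> bool:
        # Documented: True if model and optimizer are set.
        return self.model is not None and self.optimizer is not None

    def update(self, model=None, optimizer=None, train_data=None, calib_data=None):
        # Assumption: arguments left as None keep the existing value.
        if model is not None:
            self.model = model
        if optimizer is not None:
            self.optimizer = optimizer
        if train_data is not None:
            self.data.train = train_data
        if calib_data is not None:
            self.data.calib = calib_data


state = StateSketch()
state.update(model="model", calib_data=[1, 2, 3])
ready_without_optimizer = state.compression_ready
state.update(optimizer="optimizer")
ready_with_both = state.compression_ready
```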
active_session
Returns:
-
CompressionSession–the active session for sparsification
create_session
Context manager to create and yield a new session. This will set the active session to the new session for the duration of the context.
Returns:
-
Generator[CompressionSession, None, None]–the new session
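The phrase "for the duration of the context" suggests the usual save, swap, and restore pattern for a context manager. The sketch below illustrates that pattern with hypothetical stand-ins (`SessionSketch`, `active_session_sketch`, `create_session_sketch`) and a plain dictionary as the holder. How the library actually stores the active session, and whether it restores the previous one on exit, is assumed here rather than taken from this page.

```python
from contextlib import contextmanager


class SessionSketch:
    """Hypothetical stand-in for CompressionSession."""


# Holder for the "active session" in this sketch.
_holder = {"session": SessionSketch()}


def active_session_sketch():
    return _holder["session"]


@contextmanager
def create_session_sketch():
    previous = _holder["session"]
    _holder["session"] = SessionSketch()
    try:
        yield _holder["session"]
    finally:
        # Assumption: the previous session becomes active again on exit.
        _holder["session"] = previous


outer = active_session_sketch()
with create_session_sketch() as inner:
    inside_is_new = active_session_sketch() is inner and inner is not outer
restored = active_session_sketch() is outer
```

The `try`/`finally` ensures the previous session is restored even if the body of the `with` block raises.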