llmcompressor.core

Provides the core compression framework for LLM Compressor.

The core API manages compression sessions, tracks state changes, handles events during compression, and provides lifecycle hooks for the compression process.

Modules:

events –

LLM Compressor Core Events Package
lifecycle –

Module for managing the compression lifecycle in the LLM Compressor.
model_layer –

Model layer utility classes for LLM compression workflows.
session –

Compression session management for LLM compression workflows.
session_functions –

Session management functions for LLM compression workflows.
state –

Module for managing LLM Compressor state.

Classes:

CompressionLifecycle –

A class for managing the lifecycle of compression events in the LLM Compressor.
CompressionSession –

A session for compression that holds the lifecycle and state for the current compression session.
Data –

A dataclass to hold different data sets for training, validation, testing, and/or calibration.
Event –

A class for defining an event that can be triggered during sparsification.
EventType –

An Enum for defining the different types of events that can be triggered during model compression lifecycles.
Hardware –

A dataclass to hold information about the hardware being used.
LifecycleCallbacks –

A class for invoking lifecycle events for the active session
ModelParameterizedLayer –

A dataclass for holding a parameter and its layer
ModifiedState –

A dataclass to represent a modified model, optimizer, and loss.
State –

State class holds information about the current compression state.

Functions:

active_session –

Return the active session for sparsification.
create_session –

Context manager to create and yield a new session.
reset_session –

Reset the currently active session to its initial state

CompressionLifecycle `dataclass`

CompressionLifecycle(
    state: State = State(),
    recipe: Recipe = Recipe(),
    initialized_: bool = False,
    finalized: bool = False,
    _last_event_type: EventType | None = EventType.BATCH_END,
    _event_order: list[EventType] = (
        lambda: [
            EventType.BATCH_START,
            EventType.LOSS_CALCULATED,
            EventType.OPTIM_PRE_STEP,
            EventType.OPTIM_POST_STEP,
            EventType.BATCH_END,
        ]
    )(),
    global_step: int = 0,
)

A class for managing the lifecycle of compression events in the LLM Compressor.

Parameters:

state (State, default: State() ) –

The current state of the compression process
recipe (Recipe, default: Recipe() ) –

The compression recipe
modifiers (list[StageModifiers]) –

The list of stage modifiers

Methods:

event –

Handle a compression event.
finalize –

Finalize the compression lifecycle.
initialize –

Initialize the compression lifecycle.
reset –

Reset the compression lifecycle, finalizing any active modifiers

event

event(
    event_type: EventType,
    global_step: int | None = 0,
    **kwargs,
) -> list[Any]

Handle a compression event.

Parameters:

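event_type (EventType) –

The type of event to handle
global_step (int | None, default: 0 ) –

The global step to record on the lifecycle; pass None to leave the current global_step unchanged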
kwargs –

Additional arguments to pass to the event handlers

Returns:

List[Any] –

List of data returned from handling the event by modifiers

Raises:

ValueError –

If called before initialization, after finalization, or for an invalid event type

Source code in src/llmcompressor/core/lifecycle.py

def event(
    self, event_type: EventType, global_step: int | None = 0, **kwargs
) -> list[Any]:
    """
    Handle a compression event.

    :param event_type: The type of event to handle
    :type event_type: EventType
    :param kwargs: Additional arguments to pass to the event handlers
    :return: List of data returned from handling the event by modifiers
    :rtype: List[Any]
    :raises ValueError: If called before initialization, after finalization,
        or for an invalid event type
    """
    if not self.initialized_:
        logger.error("Cannot invoke event before initializing")
        raise ValueError("Cannot invoke event before initializing")

    if self.finalized:
        logger.error("Cannot invoke event after finalizing")
        raise ValueError("Cannot invoke event after finalizing")

    if event_type in [EventType.INITIALIZE, EventType.FINALIZE]:
        logger.error(
            "Cannot invoke {} event. Use the corresponding method instead.",
            event_type,
        )
        raise ValueError(
            f"Cannot invoke {event_type} event. "
            f"Use the corresponding method instead."
        )

    if not self._validate_event_order(event_type):
        raise ValueError(
            f"Lifecycle events must appear following order: {self._event_order}. "
            f"Instead, {self._last_event_type} was called before {event_type}"
        )

    if event_type == EventType.LOSS_CALCULATED and (
        "loss" not in kwargs or kwargs["loss"] is None
    ):
        logger.error("Loss must be provided for loss calculated event")
        raise ValueError("Loss must be provided for loss calculated event")

    logger.debug("Handling event: {}", event_type)

    # update global step
    if global_step is not None:
        self.global_step = global_step

    event = Event(type_=event_type)
    mod_data = []
    for mod in self.recipe.modifiers:
        data = mod.update_event(state=self.state, event=event, **kwargs)
        logger.debug("Updated event with modifier: {}", mod)
        if data is not None:
            mod_data.append(data)

    assert (
        event is not None
    ), f"Event lifecycle did not return an event for {event_type}"

    return mod_data
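
`event` relies on `_validate_event_order`, whose body is not shown in this section. The sketch below is one plausible reading, under three assumptions: events outside `_event_order` are always accepted, `BATCH_START` may follow any event except another `BATCH_START`, and every other event must not move backwards within a batch. The local `EventType` enum, `EVENT_ORDER`, and `OrderChecker` are stand-ins written for illustration and are not the library's implementation.

```python
from enum import Enum, auto


class EventType(Enum):
    # Stand-in for llmcompressor's EventType, reduced to what the sketch needs.
    BATCH_START = auto()
    LOSS_CALCULATED = auto()
    OPTIM_PRE_STEP = auto()
    OPTIM_POST_STEP = auto()
    BATCH_END = auto()
    CALIBRATION_EPOCH_START = auto()


# Same order as the documented default of _event_order.
EVENT_ORDER = [
    EventType.BATCH_START,
    EventType.LOSS_CALCULATED,
    EventType.OPTIM_PRE_STEP,
    EventType.OPTIM_POST_STEP,
    EventType.BATCH_END,
]


class OrderChecker:
    """Hypothetical order check; the rules are assumptions, not library code."""

    def __init__(self):
        # Mirrors the documented default of _last_event_type.
        self.last = EventType.BATCH_END

    def validate(self, event_type: EventType) -> bool:
        if event_type not in EVENT_ORDER:
            return True  # assumed: unordered events are never rejected
        if event_type is EventType.BATCH_START:
            valid = self.last is not EventType.BATCH_START
        else:
            valid = EVENT_ORDER.index(self.last) <= EVENT_ORDER.index(event_type)
        if valid:
            self.last = event_type
        return valid


checker = OrderChecker()
results = [checker.validate(e) for e in EVENT_ORDER]  # one well-ordered batch
went_backwards = checker.validate(EventType.LOSS_CALCULATED)  # after BATCH_END
```

Under these assumed rules a full batch passes, while reporting a loss right after `BATCH_END` is rejected until a new `BATCH_START` arrives.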

finalize

finalize(**kwargs) -> list[Any]

Finalize the compression lifecycle.

Parameters:

kwargs –

Additional arguments to update the state with

Returns:

List[Any] –

List of data returned from finalizing modifiers

Raises:

ValueError –

If called before initialization or more than once

Source code in src/llmcompressor/core/lifecycle.py

def finalize(self, **kwargs) -> list[Any]:
    """
    Finalize the compression lifecycle.

    :param kwargs: Additional arguments to update the state with
    :return: List of data returned from finalizing modifiers
    :rtype: List[Any]
    :raises ValueError: If called before initialization or more than once
    """
    if not self.initialized_:
        logger.error("Cannot finalize before initializing")
        raise ValueError("Cannot finalize before initializing")

    if self.finalized:
        logger.error("Cannot finalize more than once")
        raise ValueError("Cannot finalize more than once")

    logger.debug("Finalizing compression lifecycle")
    mod_data = []
    for mod in self.recipe.modifiers:
        data = mod.finalize(state=self.state, **kwargs)
        logger.debug("Finalized modifier: {}", mod)
        if data is not None:
            mod_data.append(data)

    self.finalized = True

    logger.info(
        "Compression lifecycle finalized for {} modifiers",
        len(self.recipe.modifiers),
    )

    return mod_data
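
`initialize`, `event`, and `finalize` share one pattern: call every modifier in order and keep only the results that are not `None`. A minimal stdlib sketch of that pattern follows; `collect_modifier_data` and the lambda handlers are hypothetical stand-ins for the loop over `recipe.modifiers`.

```python
from typing import Any, Callable, Iterable


def collect_modifier_data(handlers: Iterable[Callable[[], Any]]) -> list[Any]:
    """Call each handler in order and keep only results that are not None."""
    mod_data = []
    for handler in handlers:
        data = handler()
        if data is not None:
            mod_data.append(data)
    return mod_data


# Three stand-in "modifiers": one reports a dict, one nothing, one a falsy 0.
handlers = [lambda: {"sparsity": 0.5}, lambda: None, lambda: 0]
collected = collect_modifier_data(handlers)
```

Because the check is `is not None` rather than truthiness, falsy results such as `0` or an empty list are still collected.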

initialize

initialize(
    recipe: RecipeInput | None = None,
    recipe_stage: RecipeStageInput | None = None,
    recipe_args: RecipeArgsInput | None = None,
    **kwargs,
) -> list[Any]

Initialize the compression lifecycle.

Parameters:

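recipe (RecipeInput | None, default: None ) –

The recipe to load; when omitted, an empty Recipe is used
recipe_stage (RecipeStageInput | None, default: None ) –

The recipe stage to target
recipe_args (RecipeArgsInput | None, default: None ) –

Arguments assigned to the recipe's args when a recipe is provided
kwargs –

Additional arguments to update the state with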

Returns:

List[Any] –

List of data returned from initialization of modifiers, or None if the lifecycle was already initialized

Source code in src/llmcompressor/core/lifecycle.py

def initialize(
    self,
    recipe: RecipeInput | None = None,
    recipe_stage: RecipeStageInput | None = None,
    recipe_args: RecipeArgsInput | None = None,
    **kwargs,
) -> list[Any]:
    """
    Initialize the compression lifecycle.

    :param kwargs: Additional arguments to update the state with
    :return: List of data returned from initialization of modifiers
    :rtype: List[Any]
    """

    self.state.update(**kwargs)
    if self.initialized_:  # TODO: do not initialize twice
        return

    logger.debug("Initializing compression lifecycle")
    if not recipe:
        self.recipe = Recipe()
    else:
        self.recipe = Recipe.create_instance(
            path_or_modifiers=recipe, target_stage=recipe_stage
        )
        if recipe_args:
            self.recipe.args = {**recipe_args}

    mod_data = []
    for mod in self.recipe.modifiers:
        data = mod.initialize(state=self.state, **kwargs)
        logger.debug("Initialized modifier: {}", mod)
        if data is not None:
            mod_data.append(data)

    self.initialized_ = True
    logger.info(
        "Compression lifecycle initialized for {} modifiers",
        len(self.recipe.modifiers),
    )

    return mod_data

reset

reset()

Reset the compression lifecycle, finalizing any active modifiers and resetting all attributes.

Source code in src/llmcompressor/core/lifecycle.py

def reset(self):
    """
    Reset the compression lifecycle, finalizing any active modifiers
    and resetting all attributes.
    """
    logger.debug("Resetting compression lifecycle")

    for mod in self.recipe.modifiers:
        if not mod.initialized or mod.finalized:
            continue
        try:
            mod.finalize(self.state)
            logger.debug("Finalized modifier: {}", mod)
        except Exception as e:
            logger.warning(f"Exception during finalizing modifier: {e}")

    self.__init__()
    logger.info("Compression lifecycle reset")

CompressionSession

CompressionSession()

A session for compression that holds the lifecycle and state for the current compression session

Methods:

event –

Invoke an event for current CompressionSession.
finalize –

Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle.
get_serialized_recipe –

Return the serialized string of the current compiled recipe.
initialize –

Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle.
reset –

Reset the session to its initial state
reset_stage –

Reset the session for starting a new stage; the recipe and model stay intact

Attributes:

lifecycle (CompressionLifecycle) –

Lifecycle is used to keep track of where we are in the compression process and what modifiers are active.
state (State) –

State of the current compression session.

Source code in src/llmcompressor/core/session.py

def __init__(self):
    self._lifecycle = CompressionLifecycle()

lifecycle `property`

lifecycle: CompressionLifecycle

Lifecycle is used to keep track of where we are in the compression process and what modifiers are active. It also provides the ability to invoke events on the lifecycle.

Returns:

CompressionLifecycle –

the lifecycle for the session

state `property`

state: State

State of the current compression session. The State instance stores all information needed for compression, such as the recipe, model, optimizer, and data.

Returns:

State –

the current state of the session

event

event(
    event_type: EventType,
    batch_data: Any | None = None,
    loss: Any | None = None,
    **kwargs,
) -> ModifiedState

Invoke an event for current CompressionSession.

Parameters:

event_type (EventType) –

the event type to invoke
batch_data (Any | None, default: None ) –

the batch data to use for the event
loss (Any | None, default: None ) –

the loss to use for the event if any
kwargs –

additional kwargs to pass to the lifecycle's event method

Returns:

ModifiedState –

the modified state of the session after invoking the event

Source code in src/llmcompressor/core/session.py

def event(
    self,
    event_type: EventType,
    batch_data: Any | None = None,
    loss: Any | None = None,
    **kwargs,
) -> ModifiedState:
    """
    Invoke an event for current CompressionSession.

    :param event_type: the event type to invoke
    :param batch_data: the batch data to use for the event
    :param loss: the loss to use for the event if any
    :param kwargs: additional kwargs to pass to the lifecycle's event method
    :return: the modified state of the session after invoking the event
    """
    mod_data = self._lifecycle.event(
        event_type=event_type, batch_data=batch_data, loss=loss, **kwargs
    )
    return ModifiedState(
        model=self.state.model,
        optimizer=self.state.optimizer,
        loss=self.state.loss,  # TODO: is this supposed to be a different type?
        modifier_data=mod_data,
    )

finalize

finalize(**kwargs) -> ModifiedState

Finalize the session for compression. This will run the finalize method for each modifier in the session's lifecycle. This will also set the session's state to the finalized state.

Parameters:

kwargs –

additional kwargs to pass to the lifecycle's finalize method

Returns:

ModifiedState –

the modified state of the session after finalizing

Source code in src/llmcompressor/core/session.py

def finalize(self, **kwargs) -> ModifiedState:
    """
    Finalize the session for compression. This will run the finalize method
    for each modifier in the session's lifecycle. This will also set the session's
    state to the finalized state.

    :param kwargs: additional kwargs to pass to the lifecycle's finalize method
    :return: the modified state of the session after finalizing
    """
    mod_data = self._lifecycle.finalize(**kwargs)

    return ModifiedState(
        model=self.state.model,
        optimizer=self.state.optimizer,
        loss=self.state.loss,
        modifier_data=mod_data,
    )

get_serialized_recipe

get_serialized_recipe() -> str | None

Returns:

str | None –

serialized string of the current compiled recipe, or None if no recipe is found

Source code in src/llmcompressor/core/session.py

def get_serialized_recipe(self) -> str | None:
    """
    :return: serialized string of the current compiled recipe
    """
    recipe = self.lifecycle.recipe

    if recipe is not None and hasattr(recipe, "yaml"):
        return recipe.yaml()

    logger.warning("Recipe not found in session - it may have been reset")
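
Because the method falls through to an implicit `None` after logging the warning, callers need to handle both outcomes. The sketch below repeats the same `hasattr` check on stand-in objects; `FakeRecipe` and `serialize` are hypothetical names, and the YAML string is a placeholder rather than a real recipe.

```python
from typing import Optional


class FakeRecipe:
    # Stand-in for a recipe object that can serialize itself.
    def yaml(self) -> str:
        return "placeholder: recipe\n"


def serialize(recipe: object) -> Optional[str]:
    """Return recipe.yaml() when available, otherwise None."""
    if recipe is not None and hasattr(recipe, "yaml"):
        return recipe.yaml()
    return None  # explicit here; the method above returns None implicitly


with_recipe = serialize(FakeRecipe())
without_recipe = serialize(None)
```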

initialize

initialize(
    recipe: str | list[str] | Recipe | list[Recipe] | None = None,
    recipe_stage: str | list[str] | None = None,
    recipe_args: dict[str, Any] | None = None,
    model: Any | None = None,
    teacher_model: Any | None = None,
    optimizer: Any | None = None,
    attach_optim_callbacks: bool = True,
    train_data: Any | None = None,
    val_data: Any | None = None,
    test_data: Any | None = None,
    calib_data: Any | None = None,
    copy_data: bool = True,
    start: float | None = None,
    steps_per_epoch: int | None = None,
    batches_per_step: int | None = None,
    **kwargs,
) -> ModifiedState

Initialize the session for compression. This will run the initialize method for each modifier in the session's lifecycle. This will also set the session's state to the initialized state.

Parameters:

recipe (str | list[str] | Recipe | list[Recipe] | None, default: None ) –

the recipe to use for the compression, can be a path to a recipe file, a raw recipe string, a recipe object, or a list of recipe objects.
recipe_stage (str | list[str] | None, default: None ) –

the stage to target for the compression
recipe_args (dict[str, Any] | None, default: None ) –

the args to use for overriding the recipe defaults
model (Any | None, default: None ) –

the model to compress
teacher_model (Any | None, default: None ) –

the teacher model to use for knowledge distillation
optimizer (Any | None, default: None ) –

the optimizer to use for the compression
attach_optim_callbacks (bool, default: True ) –

True to attach the optimizer callbacks to the compression lifecycle, False otherwise
train_data (Any | None, default: None ) –

the training data to use for the compression
val_data (Any | None, default: None ) –

the validation data to use for the compression
test_data (Any | None, default: None ) –

the testing data to use for the compression
calib_data (Any | None, default: None ) –

the calibration data to use for the compression
copy_data (bool, default: True ) –

True to copy the data, False otherwise
start (float | None, default: None ) –

the start epoch to use for the compression
steps_per_epoch (int | None, default: None ) –

the number of steps per epoch to use for the compression
batches_per_step (int | None, default: None ) –

the number of batches per step to use for compression
kwargs –

additional kwargs to pass to the lifecycle's initialize method

Returns:

ModifiedState –

the modified state of the session after initializing

Source code in src/llmcompressor/core/session.py

def initialize(
    self,
    recipe: str | list[str] | Recipe | list[Recipe] | None = None,
    recipe_stage: str | list[str] | None = None,
    recipe_args: dict[str, Any] | None = None,
    model: Any | None = None,
    teacher_model: Any | None = None,
    optimizer: Any | None = None,
    attach_optim_callbacks: bool = True,
    train_data: Any | None = None,
    val_data: Any | None = None,
    test_data: Any | None = None,
    calib_data: Any | None = None,
    copy_data: bool = True,
    start: float | None = None,
    steps_per_epoch: int | None = None,
    batches_per_step: int | None = None,
    **kwargs,
) -> ModifiedState:
    """
    Initialize the session for compression. This will run the initialize method
    for each modifier in the session's lifecycle. This will also set the session's
    state to the initialized state.

    :param recipe: the recipe to use for the compression, can be a path to a
        recipe file, a raw recipe string, a recipe object, or a list
        of recipe objects.
    :param recipe_stage: the stage to target for the compression
    :param recipe_args: the args to use for overriding the recipe defaults
    :param model: the model to compress
    :param teacher_model: the teacher model to use for knowledge distillation
    :param optimizer: the optimizer to use for the compression
    :param attach_optim_callbacks: True to attach the optimizer callbacks to the
        compression lifecycle, False otherwise
    :param train_data: the training data to use for the compression
    :param val_data: the validation data to use for the compression
    :param test_data: the testing data to use for the compression
    :param calib_data: the calibration data to use for the compression
    :param copy_data: True to copy the data, False otherwise
    :param start: the start epoch to use for the compression
    :param steps_per_epoch: the number of steps per epoch to use for the
        compression
    :param batches_per_step: the number of batches per step to use for
        compression
    :param kwargs: additional kwargs to pass to the lifecycle's initialize method
    :return: the modified state of the session after initializing
    """
    mod_data = self._lifecycle.initialize(
        recipe=recipe,
        recipe_stage=recipe_stage,
        recipe_args=recipe_args,
        model=model,
        teacher_model=teacher_model,
        optimizer=optimizer,
        attach_optim_callbacks=attach_optim_callbacks,
        train_data=train_data,
        val_data=val_data,
        test_data=test_data,
        calib_data=calib_data,
        copy_data=copy_data,
        start=start,
        steps_per_epoch=steps_per_epoch,
        batches_per_step=batches_per_step,
        **kwargs,
    )

    return ModifiedState(
        model=self.state.model,
        optimizer=self.state.optimizer,
        loss=self.state.loss,
        modifier_data=mod_data,
    )

reset

reset()

Reset the session to its initial state

Source code in src/llmcompressor/core/session.py

def reset(self):
    """
    Reset the session to its initial state
    """
    self._lifecycle.reset()

reset_stage

reset_stage()

Reset the session for starting a new stage; the recipe and model stay intact.

Source code in src/llmcompressor/core/session.py

def reset_stage(self):
    """
    Reset the session for starting a new stage, recipe and model stays intact
    """
    self.lifecycle.initialized_ = False
    self.lifecycle.finalized = False
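
The difference between `reset` and `reset_stage` can be seen in a small stand-in. `MiniLifecycle` is a hypothetical, simplified dataclass (the recipe is a plain list here), used only to contrast clearing the two flags with re-running `__init__`, the idiom `CompressionLifecycle.reset` uses.

```python
from dataclasses import dataclass, field


@dataclass
class MiniLifecycle:
    # Stand-in for CompressionLifecycle with only the fields the sketch needs.
    recipe: list = field(default_factory=list)
    initialized_: bool = False
    finalized: bool = False

    def reset(self):
        # Same idiom as the source: re-run __init__ to restore every default.
        self.__init__()


def reset_stage(lifecycle: MiniLifecycle) -> None:
    # Only the two flags are cleared; the recipe is left untouched.
    lifecycle.initialized_ = False
    lifecycle.finalized = False


lifecycle = MiniLifecycle(recipe=["modifier_a"], initialized_=True, finalized=True)

reset_stage(lifecycle)
after_stage_reset = (list(lifecycle.recipe), lifecycle.initialized_, lifecycle.finalized)

lifecycle.reset()
after_full_reset = (list(lifecycle.recipe), lifecycle.initialized_, lifecycle.finalized)
```

After `reset_stage` the stand-in can be initialized and finalized again with its recipe still in place, whereas the full reset discards the recipe as well.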

Data `dataclass`

Data(
    train: Any | None = None,
    val: Any | None = None,
    test: Any | None = None,
    calib: Any | None = None,
)

A dataclass to hold different data sets for training, validation, testing, and/or calibration. Each data set is a ModifiableData instance.

Parameters:

train (Any | None, default: None ) –

The training data set
val (Any | None, default: None ) –

The validation data set
test (Any | None, default: None ) –

The testing data set
calib (Any | None, default: None ) –

The calibration data set

Event `dataclass`

Event(
    type_: Optional[EventType] = None,
    steps_per_epoch: Optional[int] = None,
    batches_per_step: Optional[int] = None,
    invocations_per_step: int = 1,
    global_step: int = 0,
    global_batch: int = 0,
)

A class for defining an event that can be triggered during sparsification.

Parameters:

type_ (Optional[EventType], default: None ) –

The type of event.
steps_per_epoch (Optional[int], default: None ) –

The number of steps per epoch.
batches_per_step (Optional[int], default: None ) –

The number of batches per step where step is an optimizer step invocation. For most pathways, these are the same. See the invocations_per_step parameter for more details when they are not.
invocations_per_step (int, default: 1 ) –

The number of invocations of the step wrapper before optimizer.step was called. Generally can be left as 1 (default). For older amp pathways, this is the number of times the scaler wrapper was invoked before the wrapped optimizer step function was called to handle accumulation in fp16.
global_step (int, default: 0 ) –

The current global step.
global_batch (int, default: 0 ) –

The current global batch.

Methods:

new_instance –

Creates a new instance of the event with the provided keyword arguments.
should_update –

Determines if the event should trigger an update.

Attributes:

current_index (float) –

Calculates the current index of the event.
epoch (int) –

Calculates the current epoch.
epoch_based (bool) –

Determines if the event is based on epochs.
epoch_batch (int) –

Calculates the current batch within the current epoch.
epoch_full (float) –

Calculates the current epoch with the fraction of the current step.
epoch_step (int) –

Calculates the current step within the current epoch.

current_index `property` `writable`

current_index: float

Calculates the current index of the event.

Returns:

float –

The current index of the event, which is either the global step or the epoch with the fraction of the current step.

Raises:

ValueError –

if the event is not epoch based or if the steps per epoch are too many.

epoch `property`

epoch: int

Calculates the current epoch.

Returns:

int –

The current epoch.

Raises:

ValueError –

if the event is not epoch based.

epoch_based `property`

epoch_based: bool

Determines if the event is based on epochs.

Returns:

bool –

True if the event is based on epochs, False otherwise.

epoch_batch `property`

epoch_batch: int

Calculates the current batch within the current epoch.

Returns:

int –

The current batch within the current epoch.

Raises:

ValueError –

if the event is not epoch based.

epoch_full `property`

epoch_full: float

Calculates the current epoch with the fraction of the current step.

Returns:

float –

The current epoch with the fraction of the current step.

Raises:

ValueError –

if the event is not epoch based.

epoch_step `property`

epoch_step: int

Calculates the current step within the current epoch.

Returns:

int –

The current step within the current epoch.

Raises:

ValueError –

if the event is not epoch based.
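
The property bodies are not shown in this section. Assuming the natural reading of the descriptions above (integer division of `global_step` by `steps_per_epoch` for `epoch`, the remainder for `epoch_step`, true division for `epoch_full`, and "epoch based" meaning that `steps_per_epoch` is set), the arithmetic can be sketched as follows. `epoch_position` is a hypothetical helper, and the library's exact formulas may differ.

```python
from typing import Optional


def epoch_position(
    global_step: int, steps_per_epoch: Optional[int]
) -> tuple[int, int, float]:
    """Return (epoch, epoch_step, epoch_full) under the assumed formulas."""
    if not steps_per_epoch:
        # Assumed to correspond to the documented ValueError for events
        # that are not epoch based.
        raise ValueError("event is not epoch based")
    epoch = global_step // steps_per_epoch
    epoch_step = global_step % steps_per_epoch
    epoch_full = global_step / steps_per_epoch
    return epoch, epoch_step, epoch_full


# Step 250 with 100 steps per epoch: epoch index 2, step 50 within it.
position = epoch_position(global_step=250, steps_per_epoch=100)
```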

new_instance

new_instance(**kwargs) -> Event

Creates a new instance of the event with the provided keyword arguments.

Parameters:

kwargs –

Keyword arguments to set in the new instance.

Returns:

Event –

A new instance of the event with the provided kwargs.

Source code in src/llmcompressor/core/events/event.py

def new_instance(self, **kwargs) -> "Event":
    """
    Creates a new instance of the event with the provided keyword arguments.

    :param kwargs: Keyword arguments to set in the new instance.
    :return: A new instance of the event with the provided kwargs.
    :rtype: Event
    """
    logger.debug("Creating new instance of event with kwargs: {}", kwargs)
    instance = deepcopy(self)
    for key, value in kwargs.items():
        setattr(instance, key, value)
    return instance
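
The copy-then-override behaviour of `new_instance` can be reproduced on a small stand-in dataclass. `MiniEvent` is a hypothetical class that keeps two of the documented `Event` fields; the method body follows the source above, minus logging.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniEvent:
    # Stand-in for Event, keeping two of its documented fields.
    steps_per_epoch: Optional[int] = None
    global_step: int = 0

    def new_instance(self, **kwargs) -> "MiniEvent":
        # Deep-copy first so the original event is never mutated.
        instance = deepcopy(self)
        for key, value in kwargs.items():
            setattr(instance, key, value)
        return instance


base = MiniEvent(steps_per_epoch=100)
advanced = base.new_instance(global_step=5)
```

In this sketch `setattr` does not validate keys, so a misspelled keyword would silently create a new attribute on the copy instead of raising an error.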

should_update

should_update(
    start: Optional[float],
    end: Optional[float],
    update: Optional[float],
) -> bool

Determines if the event should trigger an update.

Parameters:

start (Optional[float]) –

The start index to check against, set to None to ignore start.
end (Optional[float]) –

The end index to check against, set to None to ignore end.
update (Optional[float]) –

The update interval, set to None or 0.0 to always update, otherwise must be greater than 0.0.

Returns:

bool –

True if the event should trigger an update, False otherwise.

Source code in src/llmcompressor/core/events/event.py

def should_update(
    self, start: Optional[float], end: Optional[float], update: Optional[float]
) -> bool:
    """
    Determines if the event should trigger an update.

    :param start: The start index to check against, set to None to ignore start.
    :type start: Optional[float]
    :param end: The end index to check against, set to None to ignore end.
    :type end: Optional[float]
    :param update: The update interval, set to None or 0.0 to always update,
        otherwise must be greater than 0.0, defaults to None.
    :type update: Optional[float]
    :return: True if the event should trigger an update, False otherwise.
    :rtype: bool
    """
    current = self.current_index
    logger.debug(
        "Checking if event should update: "
        "current_index={}, start={}, end={}, update={}",
        current,
        start,
        end,
        update,
    )
    if start is not None and current < start:
        return False
    if end is not None and current > end:
        return False
    return update is None or update <= 0.0 or current % update < 1e-10
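
To make the window and interval logic concrete, here is a standalone version of the same checks with `current_index` passed in explicitly. The free function and its `current` parameter are assumptions for illustration; the three conditions are copied from the method above.

```python
from typing import Optional


def should_update(
    current: float,
    start: Optional[float],
    end: Optional[float],
    update: Optional[float],
) -> bool:
    """Same checks as the method above, with current_index passed explicitly."""
    if start is not None and current < start:
        return False
    if end is not None and current > end:
        return False
    return update is None or update <= 0.0 or current % update < 1e-10


# Window [2, 10] with an update every 2 steps.
decisions = {
    step: should_update(float(step), 2.0, 10.0, 2.0)
    for step in (1, 2, 3, 4, 10, 12)
}
```

Both window bounds are inclusive. With fractional intervals the floating-point remainder can land just below `update` instead of near zero (for example `0.3 % 0.1`), in which case the check returns False for that index.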

EventType

Bases: Enum

An Enum for defining the different types of events that can be triggered during model compression lifecycles. The purpose of each EventType is to trigger the corresponding modifier callback during training or post training pipelines.

Parameters:

INITIALIZE –

Event type for initialization.
FINALIZE –

Event type for finalization.
BATCH_START –

Event type for the start of a batch.
LOSS_CALCULATED –

Event type for when loss is calculated.
BATCH_END –

Event type for the end of a batch.
CALIBRATION_EPOCH_START –

Event type for the start of a calibration epoch.
SEQUENTIAL_EPOCH_END –

Event type for the end of a layer calibration epoch, specifically used by src/llmcompressor/pipelines/sequential/pipeline.py
CALIBRATION_EPOCH_END –

Event type for the end of a calibration epoch.
OPTIM_PRE_STEP –

Event type for pre-optimization step.
OPTIM_POST_STEP –

Event type for post-optimization step.

Hardware `dataclass`

Hardware(
    device: str | None = None,
    devices: list[str] | None = None,
    rank: int | None = None,
    world_size: int | None = None,
    local_rank: int | None = None,
    local_world_size: int | None = None,
    distributed: bool | None = None,
    distributed_strategy: str | None = None,
)

A dataclass to hold information about the hardware being used.

Parameters:

device (str | None, default: None ) –

The current device being used for training
devices (list[str] | None, default: None ) –

List of all devices to be used for training
rank (int | None, default: None ) –

The rank of the current device
world_size (int | None, default: None ) –

The total number of devices being used
local_rank (int | None, default: None ) –

The local rank of the current device
local_world_size (int | None, default: None ) –

The total number of devices being used on the local machine
distributed (bool | None, default: None ) –

Whether or not distributed training is being used
distributed_strategy (str | None, default: None ) –

The distributed strategy being used

LifecycleCallbacks

A class for invoking lifecycle events for the active session

Methods:

batch_end –

Invoke a batch end event for the active session
batch_start –

Invoke a batch start event for the active session
calibration_epoch_end –

Invoke an epoch end event for the active session during calibration.
calibration_epoch_start –

Invoke an epoch start event for the active session during calibration.
event –

Invoke an event for the active session
loss_calculated –

Invoke a loss calculated event for the active session
optim_post_step –

Invoke an optimizer post-step event for the active session
optim_pre_step –

Invoke an optimizer pre-step event for the active session
sequential_epoch_end –

Invoke a sequential epoch end event for the active session.

batch_end `classmethod`

batch_end(**kwargs) -> ModifiedState

Invoke a batch end event for the active session

Parameters:

kwargs –

additional kwargs to pass to the current session's event method

Returns:

ModifiedState –

the modified state of the active session after invoking the event

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def batch_end(cls, **kwargs) -> ModifiedState:
    """
    Invoke a batch end event for the active session

    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.BATCH_END, **kwargs)

batch_start `classmethod`

batch_start(
    batch_data: Optional[Any] = None, **kwargs
) -> ModifiedState

Invoke a batch start event for the active session

Parameters:

batch_data (Optional[Any], default: None ) –

the batch data to use for the event
kwargs –

additional kwargs to pass to the current session's event method

Returns:

ModifiedState –

the modified state of the active session after invoking the event

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def batch_start(cls, batch_data: Optional[Any] = None, **kwargs) -> ModifiedState:
    """
    Invoke a batch start event for the active session

    :param batch_data: the batch data to use for the event
    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.BATCH_START, batch_data=batch_data, **kwargs)

calibration_epoch_end `classmethod`

calibration_epoch_end(**kwargs) -> ModifiedState

Invoke an epoch end event for the active session during calibration. This event should be called after the model has been calibrated for one epoch.

See src/llmcompressor/pipelines/basic/pipeline.py for a usage example.

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def calibration_epoch_end(cls, **kwargs) -> ModifiedState:
    """
    Invoke a epoch end event for the active session during calibration. This event
    should be called after the model has been calibrated for one epoch

    see `src/llmcompressor/pipelines/basic/pipeline.py` for usage example
    """
    return cls.event(EventType.CALIBRATION_EPOCH_END, **kwargs)

calibration_epoch_start `classmethod`

calibration_epoch_start(**kwargs) -> ModifiedState

Invoke an epoch start event for the active session during calibration. This event should be called before calibration starts for one epoch.

See src/llmcompressor/pipelines/basic/pipeline.py for a usage example.

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def calibration_epoch_start(cls, **kwargs) -> ModifiedState:
    """
    Invoke a epoch start event for the active session during calibration. This event
    should be called before calibration starts for one epoch

    see `src/llmcompressor/pipelines/basic/pipeline.py` for usage example
    """
    return cls.event(EventType.CALIBRATION_EPOCH_START, **kwargs)

event `classmethod`

event(event_type: EventType, **kwargs) -> ModifiedState

Invoke an event for the active session

Parameters:

event_type (EventType) –

the event type to invoke
kwargs –

additional kwargs to pass to the current session's event method

Returns:

ModifiedState –

the modified state of the active session after invoking the event

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def event(cls, event_type: EventType, **kwargs) -> ModifiedState:
    """
    Invoke an event for the active session

    :param event_type: the event type to invoke
    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    if event_type in [EventType.INITIALIZE, EventType.FINALIZE]:
        raise ValueError(
            f"Cannot invoke {event_type} event. "
            f"Use the corresponding method instead."
        )

    return active_session().event(event_type, **kwargs)
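The guard in the listing above can be tried in isolation. The following is a minimal sketch, not the library's code: `EventType` here is a cut-down stand-in enum and `dispatch` is a hypothetical function that only reproduces the check, so it runs without LLM Compressor installed.

```python
from enum import Enum, auto


class EventType(Enum):
    # Stand-in for the library's EventType; only the members used in this sketch.
    INITIALIZE = auto()
    FINALIZE = auto()
    BATCH_START = auto()


def dispatch(event_type: EventType, **kwargs) -> str:
    """Hypothetical dispatcher that mirrors the guard in LifecycleCallbacks.event."""
    if event_type in [EventType.INITIALIZE, EventType.FINALIZE]:
        raise ValueError(
            f"Cannot invoke {event_type} event. "
            f"Use the corresponding method instead."
        )
    # A real session would forward the event to its lifecycle here.
    return f"handled {event_type.name}"


# An ordinary event passes through the guard.
handled = dispatch(EventType.BATCH_START)

# INITIALIZE (and FINALIZE) are rejected before anything else happens.
try:
    dispatch(EventType.INITIALIZE)
    rejected = False
except ValueError:
    rejected = True
```

The point of the guard is that initialization and finalization are expected to go through their own dedicated methods rather than through the generic event path.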

loss_calculated `classmethod`

loss_calculated(
    loss: Optional[Any] = None, **kwargs
) -> ModifiedState

Invoke a loss calculated event for the active session

Parameters:

loss (Optional[Any], default: None ) –

the loss to use for the event
kwargs –

additional kwargs to pass to the current session's event method

Returns:

ModifiedState –

the modified state of the active session after invoking the event

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def loss_calculated(cls, loss: Optional[Any] = None, **kwargs) -> ModifiedState:
    """
    Invoke a loss calculated event for the active session

    :param loss: the loss to use for the event
    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    logger.debug(f"Calculated loss: {loss}")
    return cls.event(EventType.LOSS_CALCULATED, loss=loss, **kwargs)

optim_post_step `classmethod`

optim_post_step(**kwargs) -> ModifiedState

Invoke an optimizer post-step event for the active session

Parameters:

kwargs –

additional kwargs to pass to the current session's event method

Returns:

ModifiedState –

the modified state of the active session after invoking the event

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def optim_post_step(cls, **kwargs) -> ModifiedState:
    """
    Invoke an optimizer post-step event for the active session

    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.OPTIM_POST_STEP, **kwargs)

optim_pre_step `classmethod`

optim_pre_step(**kwargs) -> ModifiedState

Invoke an optimizer pre-step event for the active session

Parameters:

kwargs –

additional kwargs to pass to the current session's event method

Returns:

ModifiedState –

the modified state of the active session after invoking the event

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def optim_pre_step(cls, **kwargs) -> ModifiedState:
    """
    Invoke an optimizer pre-step event for the active session

    :param kwargs: additional kwargs to pass to the current session's event method
    :return: the modified state of the active session after invoking the event
    """
    return cls.event(EventType.OPTIM_PRE_STEP, **kwargs)

sequential_epoch_end `classmethod`

sequential_epoch_end(
    modules: list[Module], **kwargs
) -> ModifiedState

Invoke a sequential epoch end event for the active session. This event should be called after one sequential layer has been calibrated/trained for one epoch.

This is called after a sequential layer has been calibrated with one batch. See src/llmcompressor/pipelines/sequential/pipeline.py for a usage example.

Source code in src/llmcompressor/core/session_functions.py

@classmethod
def sequential_epoch_end(cls, modules: list["Module"], **kwargs) -> ModifiedState:
    """
    Invoke a sequential epoch end event for the active session. This event should be
    called after one sequential layer has been calibrated/trained for one epoch

    This is called after a sequential layer has been calibrated with one batch, see
    `src/llmcompressor/pipelines/sequential/pipeline.py` for usage example
    """
    return cls.event(EventType.SEQUENTIAL_EPOCH_END, modules=modules, **kwargs)
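The training-oriented callbacks above are named after the steps of a training loop. The sketch below shows one plausible call order, inferred from the method names only. `RecordingCallbacks` is a hypothetical stand-in that just records which callback fired; it is not `LifecycleCallbacks`, and the toy "model" is a single float.

```python
class RecordingCallbacks:
    """Hypothetical stand-in: records callback names instead of running modifiers."""

    calls: list = []

    @classmethod
    def event(cls, name: str, **kwargs) -> dict:
        cls.calls.append(name)
        return kwargs

    @classmethod
    def batch_start(cls, batch_data=None, **kwargs):
        return cls.event("batch_start", batch_data=batch_data, **kwargs)

    @classmethod
    def loss_calculated(cls, loss=None, **kwargs):
        return cls.event("loss_calculated", loss=loss, **kwargs)

    @classmethod
    def optim_pre_step(cls, **kwargs):
        return cls.event("optim_pre_step", **kwargs)

    @classmethod
    def optim_post_step(cls, **kwargs):
        return cls.event("optim_post_step", **kwargs)

    @classmethod
    def batch_end(cls, **kwargs):
        return cls.event("batch_end", **kwargs)


def train(batches, callbacks=RecordingCallbacks, lr: float = 0.1) -> float:
    """Fit a single weight to the batch values, firing callbacks around each step."""
    weight = 0.0
    for batch in batches:
        callbacks.batch_start(batch_data=batch)
        loss = (weight - batch) ** 2
        callbacks.loss_calculated(loss=loss)
        callbacks.optim_pre_step()
        weight -= lr * 2 * (weight - batch)  # one gradient-descent step
        callbacks.optim_post_step()
        callbacks.batch_end()
    return weight


final_weight = train([1.0, 2.0])
```

With the real class, each call would return a `ModifiedState` from the active session instead of the plain dict used here.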

ModelParameterizedLayer `dataclass`

ModelParameterizedLayer(
    layer_name: str, layer: Any, param_name: str, param: Any
)

A dataclass for holding a parameter and its layer

Parameters:

layer_name (str) –

the name of the layer
layer (Any) –

the layer object
param_name (str) –

the name of the parameter
param (Any) –

the parameter object

ModifiedState `dataclass`

ModifiedState(model, optimizer, loss, modifier_data)

A dataclass to represent a modified model, optimizer, and loss.

Parameters:

model (Optional[Any]) –

The modified model
optimizer (Optional[Any]) –

The modified optimizer
loss (Optional[Any]) –

The modified loss
modifier_data (Optional[List[Dict[str, Any]]]) –

The modifier data used to modify the model, optimizer, and loss

Initialize the ModifiedState with the given parameters.

Parameters:

model (Any) –

The modified model
optimizer (Any) –

The modified optimizer
loss (Any) –

The modified loss
modifier_data (List[Dict[str, Any]]) –

The modifier data used to modify the model, optimizer, and loss

Source code in src/llmcompressor/core/state.py

def __init__(self, model, optimizer, loss, modifier_data):
    """
    Initialize the ModifiedState with the given parameters.

    :param model: The modified model
    :type model: Any
    :param optimizer: The modified optimizer
    :type optimizer: Any
    :param loss: The modified loss
    :type loss: Any
    :param modifier_data: The modifier data used to modify the model, optimizer,
        and loss
    :type modifier_data: List[Dict[str, Any]]
    """
    self.model = model
    self.optimizer = optimizer
    self.loss = loss
    self.modifier_data = modifier_data

State `dataclass`

State(
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    optim_wrapped: bool = None,
    loss: Any = None,
    batch_data: Any = None,
    data: Data = Data(),
    hardware: Hardware = Hardware(),
    loss_masks: list[Tensor] | None = None,
    current_batch_idx: int = -1,
    sequential_prefetch: bool = False,
)

State class holds information about the current compression state.

Parameters:

model (Any, default: None ) –

The model being used for compression
teacher_model (Any, default: None ) –

The teacher model being used for compression
optimizer (Any, default: None ) –

The optimizer being used for training
optim_wrapped (bool, default: None ) –

Whether or not the optimizer has been wrapped
loss (Any, default: None ) –

The loss function being used for training
batch_data (Any, default: None ) –

The current batch of data being used for compression
data (Data, default: Data() ) –

The data sets being used for training, validation, testing, and/or calibration, wrapped in a Data instance
hardware (Hardware, default: Hardware() ) –

Hardware instance holding info about the target hardware being used

Methods:

update –

Update the state with the given parameters.

Attributes:

compression_ready (bool) –

Check if the model and optimizer are set for compression.

compression_ready `property`

compression_ready: bool

Check if the model and optimizer are set for compression.

Returns:

bool –

True if model and optimizer are set, False otherwise

update

update(
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    attach_optim_callbacks: bool = True,
    train_data: Any = None,
    val_data: Any = None,
    test_data: Any = None,
    calib_data: Any = None,
    copy_data: bool = True,
    start: float = None,
    steps_per_epoch: int = None,
    batches_per_step: int = None,
    **kwargs,
) -> dict

Update the state with the given parameters.

Parameters:

model (Any, default: None ) –

The model to update the state with
teacher_model (Any, default: None ) –

The teacher model to update the state with
optimizer (Any, default: None ) –

The optimizer to update the state with
attach_optim_callbacks (bool, default: True ) –

Whether or not to attach optimizer callbacks
train_data (Any, default: None ) –

The training data to update the state with
val_data (Any, default: None ) –

The validation data to update the state with
test_data (Any, default: None ) –

The testing data to update the state with
calib_data (Any, default: None ) –

The calibration data to update the state with
copy_data (bool, default: True ) –

Whether or not to copy the data
start (float, default: None ) –

The start index to update the state with
steps_per_epoch (int, default: None ) –

The steps per epoch to update the state with
batches_per_step (int, default: None ) –

The batches per step to update the state with
kwargs –

Additional keyword arguments to update the state with

Returns:

Dict –

The updated state as a dictionary (note that the implementation shown below returns the `kwargs` dict it was given)

Source code in src/llmcompressor/core/state.py

def update(
    self,
    model: Any = None,
    teacher_model: Any = None,
    optimizer: Any = None,
    attach_optim_callbacks: bool = True,
    train_data: Any = None,
    val_data: Any = None,
    test_data: Any = None,
    calib_data: Any = None,
    copy_data: bool = True,
    start: float = None,
    steps_per_epoch: int = None,
    batches_per_step: int = None,
    **kwargs,
) -> dict:
    """
    Update the state with the given parameters.

    :param model: The model to update the state with
    :type model: Any
    :param teacher_model: The teacher model to update the state with
    :type teacher_model: Any
    :param optimizer: The optimizer to update the state with
    :type optimizer: Any
    :param attach_optim_callbacks: Whether or not to attach optimizer callbacks
    :type attach_optim_callbacks: bool
    :param train_data: The training data to update the state with
    :type train_data: Any
    :param val_data: The validation data to update the state with
    :type val_data: Any
    :param test_data: The testing data to update the state with
    :type test_data: Any
    :param calib_data: The calibration data to update the state with
    :type calib_data: Any
    :param copy_data: Whether or not to copy the data
    :type copy_data: bool
    :param start: The start index to update the state with
    :type start: float
    :param steps_per_epoch: The steps per epoch to update the state with
    :type steps_per_epoch: int
    :param batches_per_step: The batches per step to update the state with
    :type batches_per_step: int
    :param kwargs: Additional keyword arguments to update the state with
    :return: The updated state as a dictionary
    :rtype: Dict
    """
    logger.debug(
        "Updating state with provided parameters: {}",
        {
            "model": model,
            "teacher_model": teacher_model,
            "optimizer": optimizer,
            "attach_optim_callbacks": attach_optim_callbacks,
            "train_data": train_data,
            "val_data": val_data,
            "test_data": test_data,
            "calib_data": calib_data,
            "copy_data": copy_data,
            "start": start,
            "steps_per_epoch": steps_per_epoch,
            "batches_per_step": batches_per_step,
            "kwargs": kwargs,
        },
    )

    if model is not None:
        self.model = model
    if teacher_model is not None:
        self.teacher_model = teacher_model
    if optimizer is not None:
        self.optim_wrapped = attach_optim_callbacks
        self.optimizer = optimizer
    if train_data is not None:
        self.data.train = train_data if not copy_data else deepcopy(train_data)
    if val_data is not None:
        self.data.val = val_data if not copy_data else deepcopy(val_data)
    if test_data is not None:
        self.data.test = test_data if not copy_data else deepcopy(test_data)
    if calib_data is not None:
        self.data.calib = calib_data if not copy_data else deepcopy(calib_data)

    if "device" in kwargs:
        self.hardware.device = kwargs["device"]

    return kwargs
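Two details of the listing above are easy to miss: data sets are deep-copied unless `copy_data=False`, and the return value is the leftover `kwargs`. The sketch below reproduces only those branches. `MiniState`, `MiniData`, and `MiniHardware` are hypothetical cut-down stand-ins, not the library classes.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiniData:
    calib: Any = None  # stand-in for the calibration slot of Data


@dataclass
class MiniHardware:
    device: Any = None  # stand-in for the device slot of Hardware


@dataclass
class MiniState:
    model: Any = None
    data: MiniData = field(default_factory=MiniData)
    hardware: MiniHardware = field(default_factory=MiniHardware)

    def update(self, model=None, calib_data=None, copy_data=True, **kwargs) -> dict:
        # Only non-None arguments overwrite existing fields.
        if model is not None:
            self.model = model
        if calib_data is not None:
            self.data.calib = calib_data if not copy_data else deepcopy(calib_data)
        if "device" in kwargs:
            self.hardware.device = kwargs["device"]
        # As in the listing, the extra keyword arguments are handed back.
        return kwargs


samples = [[1, 2], [3, 4]]

copied = MiniState()
extra = copied.update(calib_data=samples, device="cpu")

shared = MiniState()
shared.update(calib_data=samples, copy_data=False)
```

With the default `copy_data=True`, later mutation of `samples` does not affect `copied.data.calib`. With `copy_data=False` the state holds the very same object, which saves memory but shares mutations.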

active_session

active_session() -> CompressionSession

Returns:

CompressionSession –

the active session for sparsification

Source code in src/llmcompressor/core/session_functions.py

def active_session() -> CompressionSession:
    """
    :return: the active session for sparsification
    """
    global _local_storage
    return getattr(_local_storage, "session", _global_session)

create_session

create_session() -> (
    Generator[CompressionSession, None, None]
)

Context manager to create and yield a new session. This will set the active session to the new session for the duration of the context.

Returns:

Generator[CompressionSession, None, None] –

the new session

Source code in src/llmcompressor/core/session_functions.py

@contextmanager
def create_session() -> Generator[CompressionSession, None, None]:
    """
    Context manager to create and yield a new session.
    This will set the active session to the new session for the duration
    of the context.

    :return: the new session
    """
    global _local_storage
    orig_session = getattr(_local_storage, "session", None)
    new_session = CompressionSession()
    _local_storage.session = new_session
    try:
        yield new_session
    finally:
        _local_storage.session = orig_session
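Because the previous session is saved on entry and restored in `finally`, the context manager can be nested. The sketch below copies the pattern from the listing onto a placeholder `Session` class (an assumption standing in for `CompressionSession`) so that the nesting behaviour can be checked without the library.

```python
import threading
from contextlib import contextmanager


class Session:
    """Hypothetical placeholder for CompressionSession."""


_global_session = Session()
_local_storage = threading.local()


def active_session() -> Session:
    return getattr(_local_storage, "session", _global_session)


@contextmanager
def create_session():
    # Remember whatever was active for this thread, then swap in a fresh session.
    orig_session = getattr(_local_storage, "session", None)
    new_session = Session()
    _local_storage.session = new_session
    try:
        yield new_session
    finally:
        # Runs even if the body raises, so the previous session always comes back.
        _local_storage.session = orig_session


with create_session() as outer:
    outer_is_active = active_session() is outer
    with create_session() as inner:
        inner_is_active = active_session() is inner
    # Leaving the inner block restores the outer session.
    restored_after_inner = active_session() is outer
```

What `active_session()` returns after the outermost block exits depends on how `_local_storage` is initialised for the thread, which the listing does not show, so the sketch makes no claim about it.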

reset_session

reset_session()

Reset the currently active session to its initial state

Source code in src/llmcompressor/core/session_functions.py

def reset_session():
    """
    Reset the currently active session to its initial state
    """
    session = active_session()
    session._lifecycle.reset()
