
vllm_omni.diffusion.cache.base

Base cache backend interface for diffusion models.

This module defines the abstract base class that all cache backends must implement. Cache backends provide a unified interface for applying different caching strategies to transformer models.

Main cache backend implementations:

1. CacheDiTBackend: Implements cache-dit acceleration (DBCache, SCM, TaylorSeer) using the cache-dit library. Inherits from CacheBackend. Used via cache_backend="cache_dit".
2. TeaCacheBackend: Hook-based backend for TeaCache acceleration. Inherits from CacheBackend. Used via cache_backend="tea_cache".

All backends implement the same interface:

- enable(pipeline): Enable cache on the pipeline
- refresh(pipeline, num_inference_steps, verbose): Refresh cache state
- is_enabled(): Check if cache is enabled

CacheBackend

Bases: ABC

Abstract base class for cache backends.

All cache backend implementations (CacheDiTBackend, TeaCacheBackend, etc.) inherit from this base class and implement the enable() and refresh() methods to manage cache lifecycle.

Cache backends apply caching strategies to transformer models to accelerate inference. Different backends use different underlying mechanisms (e.g., cache-dit library for CacheDiTBackend, hooks for TeaCacheBackend), but all share the same unified interface.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| config | | DiffusionCacheConfig instance containing cache-specific configuration parameters |
| enabled | | Boolean flag indicating whether cache is enabled (set to True after enable() is called) |

config instance-attribute

config = config

enabled instance-attribute

enabled = False
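The interface and attribute defaults described here can be sketched as a small abstract class. This is a stand-in reconstructed from the documented signatures, not the library's source: the name CacheBackendSketch is hypothetical, and the config type is left as Any because DiffusionCacheConfig is not defined on this page.

```python
from abc import ABC, abstractmethod
from typing import Any


class CacheBackendSketch(ABC):
    """Hypothetical stand-in mirroring the documented CacheBackend interface."""

    def __init__(self, config: Any) -> None:
        # Documented instance attributes: config is stored, enabled starts False.
        self.config = config
        self.enabled = False

    @abstractmethod
    def enable(self, pipeline: Any) -> None:
        """Apply the caching strategy to the pipeline's transformer(s)."""

    @abstractmethod
    def refresh(
        self,
        pipeline: Any,
        num_inference_steps: int,
        verbose: bool = True,
    ) -> None:
        """Reset cache state before a new generation."""

    def is_enabled(self) -> bool:
        # Concrete helper: reports the flag that enable() is expected to set.
        return self.enabled
```

Because enable() and refresh() are abstract, instantiating the base class directly raises TypeError; only subclasses that implement both methods can be created.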

enable abstractmethod

enable(pipeline: Any) -> None

Enable cache on the pipeline.

This method applies the caching strategy to the transformer(s) in the pipeline. The specific implementation depends on the backend (e.g., hooks for TeaCacheBackend, cache-dit library for CacheDiTBackend). Called once during pipeline initialization.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pipeline | Any | Diffusion pipeline instance. The backend can extract the transformer via `pipeline.transformer` and the model type via `pipeline.__class__.__name__`. | required |
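As a rough illustration of what enable() can read from the pipeline, the toy backend below records the transformer and the pipeline's class name. DummyPipeline and RecordingBackend are hypothetical names invented for this sketch; a real backend would subclass CacheBackend and install hooks or call into cache-dit rather than just storing references.

```python
from typing import Any, Optional


class DummyPipeline:
    """Hypothetical pipeline exposing only the attribute the docs mention."""

    def __init__(self) -> None:
        self.transformer = object()  # placeholder for a real transformer module


class RecordingBackend:
    """Toy, duck-typed backend that records what enable() sees."""

    def __init__(self, config: Any) -> None:
        self.config = config
        self.enabled = False
        self.transformer: Optional[Any] = None
        self.model_type: Optional[str] = None

    def enable(self, pipeline: Any) -> None:
        # The two pieces of information the parameter description lists.
        self.transformer = pipeline.transformer
        self.model_type = pipeline.__class__.__name__
        self.enabled = True

    def is_enabled(self) -> bool:
        return self.enabled


pipeline = DummyPipeline()
backend = RecordingBackend(config=None)
backend.enable(pipeline)
print(backend.model_type)  # DummyPipeline
```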

is_enabled

is_enabled() -> bool

Check if cache is enabled on this backend.

Returns:

| Type | Description |
| --- | --- |
| bool | True if cache is enabled, False otherwise. |

refresh abstractmethod

refresh(
    pipeline: Any,
    num_inference_steps: int,
    verbose: bool = True,
) -> None

Refresh cache state for new generation.

This method should clear any cached values and reset counters/accumulators. Called at the start of each generation to ensure clean state.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pipeline | Any | Diffusion pipeline instance. The backend can extract the transformer via pipeline.transformer. | required |
| num_inference_steps | int | Number of inference steps for the current generation. May be used for cache context updates. | required |
| verbose | bool | Whether to log refresh operations. | True |
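The lifecycle described for refresh() (enable once, then reset state at the start of every generation) might look like the following. Everything here is a hypothetical sketch: CountingBackend, its cached_values dict, its step_counter, and the generate() driver are illustrative stand-ins, not part of vllm_omni.

```python
from types import SimpleNamespace
from typing import Any, Dict


class CountingBackend:
    """Toy, duck-typed backend with state that refresh() must reset."""

    def __init__(self, config: Any) -> None:
        self.config = config
        self.enabled = False
        self.cached_values: Dict[int, str] = {}
        self.step_counter = 0
        self.num_inference_steps = 0

    def enable(self, pipeline: Any) -> None:
        self.enabled = True

    def is_enabled(self) -> bool:
        return self.enabled

    def refresh(
        self, pipeline: Any, num_inference_steps: int, verbose: bool = True
    ) -> None:
        # Clear cached values and reset counters so no state leaks
        # from the previous generation into the next one.
        self.cached_values.clear()
        self.step_counter = 0
        self.num_inference_steps = num_inference_steps
        if verbose:
            print(f"cache refreshed for {num_inference_steps} steps")


def generate(backend: CountingBackend, pipeline: Any, num_inference_steps: int) -> int:
    """Hypothetical driver: refresh first, then run the denoising loop."""
    if backend.is_enabled():
        backend.refresh(pipeline, num_inference_steps, verbose=False)
    for step in range(num_inference_steps):
        backend.cached_values[step] = f"residual-{step}"  # stand-in for cached tensors
        backend.step_counter += 1
    return backend.step_counter


pipeline = SimpleNamespace(transformer=object())
backend = CountingBackend(config=None)
backend.enable(pipeline)       # once, at pipeline initialization
generate(backend, pipeline, 4)  # first generation
generate(backend, pipeline, 2)  # second generation starts from clean state
```

Without the refresh() call, the second run would inherit the four cached entries and the counter from the first run.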

CachedTransformer

Bases: Module

do_true_cfg instance-attribute

do_true_cfg = False