vllm_omni.diffusion.cache.teacache.config

TeaCacheConfig dataclass

Configuration for TeaCache applied to transformer models.

TeaCache (Timestep Embedding Aware Cache) is an adaptive caching technique that speeds up diffusion model inference by reusing transformer block computations when consecutive timestep embeddings are similar.

Parameters:

Name: rel_l1_thresh
Type: float
Default: 0.2
Description: Threshold for accumulated relative L1 distance. When below threshold, cached residual is reused. Values in [0.1, 0.3] work best:
- 0.2: ~1.5x speedup with minimal quality loss
- 0.4: ~1.8x speedup with slight quality loss
- 0.6: ~2.0x speedup with noticeable quality loss

Name: coefficients
Type: list[float] | None
Default: None
Description: Polynomial coefficients for rescaling L1 distance. If None, uses model-specific defaults based on transformer_type.

Name: transformer_type
Type: str
Default: 'QwenImageTransformer2DModel'
Description: Transformer class name (e.g., "QwenImageTransformer2DModel"). Auto-detected from pipeline.transformer.__class__.__name__ in backend. Defaults to "QwenImageTransformer2DModel".
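
To make the interplay of rel_l1_thresh and coefficients concrete, here is a minimal sketch of the kind of reuse decision the description implies. It is not taken from the vllm_omni backend. The helper names (rescale, relative_l1, CacheState, should_reuse), the highest-degree-first coefficient order, the reset of the accumulator after a recompute, and the coefficient values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


def rescale(coefficients: list[float], x: float) -> float:
    """Evaluate a polynomial at x with Horner's rule.

    Assumption: coefficients are ordered highest degree first.
    """
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result


def relative_l1(current: list[float], previous: list[float]) -> float:
    """L1 change between two inputs, relative to the L1 norm of the previous one."""
    diff = sum(abs(c - p) for c, p in zip(current, previous))
    norm = sum(abs(p) for p in previous)
    if norm == 0.0:
        return float("inf")  # no meaningful reference: force a recompute
    return diff / norm


@dataclass
class CacheState:
    """Hypothetical per-request state: last input seen and accumulated distance."""
    previous: Optional[list[float]] = None
    accumulated: float = 0.0


def should_reuse(state: CacheState, current: list[float],
                 rel_l1_thresh: float, coefficients: list[float]) -> bool:
    """Return True if the cached residual may be reused for this step."""
    if state.previous is None:
        reuse = False  # first step: nothing has been cached yet
    else:
        state.accumulated += rescale(coefficients, relative_l1(current, state.previous))
        reuse = state.accumulated < rel_l1_thresh
    if not reuse:
        state.accumulated = 0.0  # a full recompute restarts the accumulation
    state.previous = current
    return reuse


# Toy timestep-embedding inputs; [1.0, 0.0] is a hypothetical identity rescaling.
steps = [[1.0, 1.0], [1.1, 1.0], [1.3, 1.0], [1.6, 1.0], [1.7, 1.0]]
state = CacheState()
decisions = [should_reuse(state, s, rel_l1_thresh=0.2, coefficients=[1.0, 0.0])
             for s in steps]
print(decisions)  # [False, True, True, False, True]
```

In this toy run the third change pushes the accumulated distance past 0.2, so the fourth step recomputes and the accumulator starts over. Raising rel_l1_thresh would let more consecutive steps reuse the cache, which matches the speed versus quality trade-off listed for the parameter.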

coefficients class-attribute instance-attribute

coefficients: list[float] | None = None

rel_l1_thresh class-attribute instance-attribute

rel_l1_thresh: float = 0.2

transformer_type class-attribute instance-attribute

transformer_type: str = 'QwenImageTransformer2DModel'
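
As a usage sketch, the snippet below mirrors the three documented fields and defaults in a local stand-in dataclass so that it runs without vllm_omni installed. TeaCacheConfigSketch is a hypothetical name, and the coefficient values are placeholders, not model defaults. With the package installed, the real class would presumably be imported from vllm_omni.diffusion.cache.teacache.config instead.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class TeaCacheConfigSketch:
    """Local stand-in mirroring the documented fields; not the real class."""
    rel_l1_thresh: float = 0.2
    coefficients: Optional[list[float]] = None
    transformer_type: str = "QwenImageTransformer2DModel"


# Defaults as documented: threshold 0.2, model-specific coefficients (None).
default_config = TeaCacheConfigSketch()

# Trade some quality for speed by raising the threshold; other fields carry over.
faster_config = replace(default_config, rel_l1_thresh=0.4)

# Supply explicit rescaling coefficients (placeholder values for illustration).
custom_config = TeaCacheConfigSketch(coefficients=[1.0, 0.0])

print(default_config)
print(faster_config.rel_l1_thresh, custom_config.coefficients)
```

dataclasses.replace returns a new instance, so default_config stays unchanged and can be shared safely between callers.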