vllm_omni.diffusion.cache.teacache.config

TeaCacheConfig `dataclass`

Configuration for TeaCache applied to transformer models.

TeaCache (Timestep Embedding Aware Cache) is an adaptive caching technique that speeds up diffusion model inference by reusing transformer block computations when consecutive timestep embeddings are similar.
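
The reuse decision described above can be illustrated with a minimal sketch. This is not the vllm_omni implementation: `CacheState`, `rel_l1`, `polyval`, and `should_reuse_cache` are hypothetical names, embeddings are plain lists of floats, and the coefficients are assumed to be ordered highest degree first. It only shows how a rescaled relative L1 distance might be accumulated and compared against `rel_l1_thresh`.

```python
from __future__ import annotations

from dataclasses import dataclass


def rel_l1(prev: list[float], curr: list[float]) -> float:
    """Relative L1 distance: sum|curr - prev| / sum|prev|."""
    num = sum(abs(c - p) for p, c in zip(prev, curr))
    den = sum(abs(p) for p in prev)
    return num / den


def polyval(coefficients: list[float], x: float) -> float:
    """Evaluate a polynomial (highest-degree coefficient first) via Horner's rule."""
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result


@dataclass
class CacheState:
    """Hypothetical per-run state: last embedding and accumulated distance."""
    prev_embedding: list[float] | None = None
    accumulated: float = 0.0


def should_reuse_cache(
    state: CacheState,
    embedding: list[float],
    rel_l1_thresh: float,
    coefficients: list[float],
) -> bool:
    """Return True if the cached residual can be reused for this step."""
    if state.prev_embedding is None:
        # First step: nothing cached yet, so the blocks must be computed.
        state.prev_embedding = embedding
        state.accumulated = 0.0
        return False
    # Accumulate the polynomial-rescaled distance to the previous embedding.
    state.accumulated += polyval(coefficients, rel_l1(state.prev_embedding, embedding))
    state.prev_embedding = embedding
    if state.accumulated < rel_l1_thresh:
        return True
    # Threshold reached: recompute the blocks and start accumulating again.
    state.accumulated = 0.0
    return False


# Illustrative inputs only; [1.0, 0.0] is an identity rescaling, not a real default.
state = CacheState()
steps = [[1.0, 1.0], [1.05, 1.05], [1.1, 1.1], [1.5, 1.5]]
decisions = [should_reuse_cache(state, e, 0.2, [1.0, 0.0]) for e in steps]
```

In this toy run the two small embedding changes fall under the threshold and reuse the cache, while the larger jump at the last step forces a recomputation and resets the accumulator.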

Parameters:

Name	Type	Description	Default
`rel_l1_thresh`	`float`	Threshold for the accumulated relative L1 distance. While the accumulated distance stays below the threshold, the cached residual is reused. Values in [0.1, 0.3] work best. Approximate trade-offs at higher settings: 0.2 gives ~1.5x speedup with minimal quality loss; 0.4 gives ~1.8x speedup with slight quality loss; 0.6 gives ~2.0x speedup with noticeable quality loss.	`0.2`
`coefficients`	`list[float] | None`	Polynomial coefficients for rescaling the L1 distance. If `None`, model-specific defaults are used, selected by `transformer_type`.	`None`
`transformer_type`	`str`	Transformer class name (e.g., `"QwenImageTransformer2DModel"`). Auto-detected from `pipeline.transformer.__class__.__name__` in the backend.	`'QwenImageTransformer2DModel'`
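
As a rough usage sketch of the parameters above, the snippet below mirrors the three documented fields in a stand-in dataclass so that it runs without vllm_omni installed. `TeaCacheConfigSketch`, `PLACEHOLDER_DEFAULTS`, and `resolve_coefficients` are hypothetical names, and the coefficient values are placeholders rather than the library's real model-specific defaults. With the library available, the documented class would presumably be imported from `vllm_omni.diffusion.cache.teacache.config` instead.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TeaCacheConfigSketch:
    """Stand-in mirroring the documented TeaCacheConfig fields and defaults."""
    rel_l1_thresh: float = 0.2
    coefficients: list[float] | None = None
    transformer_type: str = "QwenImageTransformer2DModel"


# Placeholder table: these are NOT the library's real coefficients.
PLACEHOLDER_DEFAULTS: dict[str, list[float]] = {
    "QwenImageTransformer2DModel": [1.0, 0.0],
}


def resolve_coefficients(config: TeaCacheConfigSketch) -> list[float]:
    """Use explicit coefficients if given, else look up by transformer_type."""
    if config.coefficients is not None:
        return config.coefficients
    return PLACEHOLDER_DEFAULTS[config.transformer_type]


# Defaults: threshold 0.2, coefficients resolved from transformer_type.
default_config = TeaCacheConfigSketch()

# More aggressive caching with explicitly supplied (illustrative) coefficients.
custom_config = TeaCacheConfigSketch(rel_l1_thresh=0.4, coefficients=[2.0, 0.5])
```

Passing fields by keyword, as here, avoids depending on the declaration order of the real dataclass.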

coefficients `class-attribute` `instance-attribute`

coefficients: list[float] | None = None

rel_l1_thresh `class-attribute` `instance-attribute`

rel_l1_thresh: float = 0.2

transformer_type `class-attribute` `instance-attribute`

transformer_type: str = 'QwenImageTransformer2DModel'