vllm_omni.diffusion.models.interface ¶
SupportAudioInput ¶
SupportAudioOutput ¶
SupportImageInput ¶
SupportsComponentDiscovery ¶
Bases: Protocol
Declares which submodules serve as pipeline components.
Used by the framework to locate DiT, encoder, and VAE modules for CPU offload, HSDP sharding, and other operations that need to know the pipeline's internal structure.
All attribute names support dotted paths for nested submodules (e.g. "pipe.transformer").
Attributes:
| Name | Type | Description |
|---|---|---|
| `_dit_modules` | `list[str]` | Denoising submodules (on GPU during diffusion). |
| `_encoder_modules` | `list[str]` | Encoder submodules (offloaded during diffusion). |
| `_vae_modules` | `list[str]` | VAE(s) (always on GPU). |
| `_resident_modules` | `list[str]` | Extra modules pinned on GPU during layerwise offloading. Optional, defaults to |
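As an illustration of the attributes above, the following is a minimal sketch of a class that declares its components with dotted paths, plus a small resolver that walks those paths. `ToyPipeline`, `resolve`, and the placeholder string values are hypothetical and exist only for this example; how the framework actually resolves the paths is not stated here. For real `torch.nn.Module` objects, `Module.get_submodule` accepts the same dotted-path form.

```python
from functools import reduce
from types import SimpleNamespace


class ToyPipeline:
    """Hypothetical pipeline; the attribute values are illustrative only."""

    # Dotted paths are relative to the pipeline object itself.
    _dit_modules = ["pipe.transformer"]
    _encoder_modules = ["pipe.text_encoder"]
    _vae_modules = ["pipe.vae"]

    def __init__(self):
        # Plain strings stand in for real submodules in this sketch.
        self.pipe = SimpleNamespace(
            transformer="dit", text_encoder="encoder", vae="vae"
        )


def resolve(root, dotted_path):
    """Follow a dotted path such as "pipe.transformer" one attribute at a time."""
    return reduce(getattr, dotted_path.split("."), root)


pipeline = ToyPipeline()
dit_modules = [resolve(pipeline, path) for path in pipeline._dit_modules]
encoder_modules = [resolve(pipeline, path) for path in pipeline._encoder_modules]
```

Declaring paths as strings rather than module references lets the class describe its structure before any submodule has been instantiated.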
SupportsStepExecution ¶
Bases: Protocol
State-driven step-level execution protocol for diffusion pipelines.
Pipelines should split request-level forward() into: prepare_encode() (one-time request setup), denoise_step() (one denoise forward), step_scheduler() (one scheduler update), and post_decode() (final decode).
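The four-way split can be sketched with a toy pipeline and a driver loop. Everything here is an assumption made for illustration: `ToyState` is a hypothetical stand-in for `DiffusionRequestState` (its fields are invented), floats stand in for tensors, a string stands in for `DiffusionOutput`, and `run_request` is not a framework function. The sketch also assumes that a `None` return from `denoise_step()` means there is nothing to pass to the scheduler, which the protocol description does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class ToyState:
    # Hypothetical stand-in for DiffusionRequestState; all fields are invented.
    prompt: str
    num_steps: int = 4
    step_index: int = 0
    latent: float = 0.0


class ToyPipeline:
    def prepare_encode(self, state: ToyState, **kwargs: Any) -> ToyState:
        # One-time request setup: initialise the latent and the step counter.
        state.latent = 1.0
        state.step_index = 0
        return state

    def denoise_step(self, state: ToyState, **kwargs: Any) -> float | None:
        # One denoise forward: a fake "noise prediction" of half the latent.
        return state.latent * 0.5

    def step_scheduler(
        self, state: ToyState, noise_pred: float, **kwargs: Any
    ) -> None:
        # One scheduler update: subtract the prediction and advance the counter.
        state.latent -= noise_pred
        state.step_index += 1

    def post_decode(self, state: ToyState, **kwargs: Any) -> str:
        # Final decode: a string stands in for DiffusionOutput.
        return f"{state.prompt}: {state.latent}"


def run_request(pipeline: ToyPipeline, state: ToyState) -> str:
    """Hypothetical driver that calls the four methods in order."""
    state = pipeline.prepare_encode(state)
    while state.step_index < state.num_steps:
        noise_pred = pipeline.denoise_step(state)
        if noise_pred is None:  # assumption: nothing to schedule, stop early
            break
        pipeline.step_scheduler(state, noise_pred)
    return pipeline.post_decode(state)


result = run_request(ToyPipeline(), ToyState(prompt="cat"))
```

Because all per-request data lives in the state object, a caller could in principle interleave steps from several requests between iterations instead of running each request to completion.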
denoise_step ¶
denoise_step(state: DiffusionRequestState, **kwargs: Any) -> Tensor | None
Run one denoise step.
post_decode ¶
post_decode(state: DiffusionRequestState, **kwargs: Any) -> DiffusionOutput
Decode output after denoise loop.
prepare_encode ¶
prepare_encode(state: DiffusionRequestState, **kwargs: Any) -> DiffusionRequestState
Prepare request-level inputs and return initialized state.
step_scheduler ¶
step_scheduler(state: DiffusionRequestState, noise_pred: Tensor, **kwargs: Any) -> None
Run one scheduler step.