vllm_omni.diffusion.models.diffusers_adapter.pipeline_diffusers_adapter
Diffusers backend adapter for vLLM-Omni.
Provides a black-box wrapper around any 🤗 Diffusers pipeline, enabling vLLM-Omni to directly serve Diffusers models with near-zero per-model code.
The adapter delegates full pipeline execution to diffusers' __call__(). It does NOT support:
- CFG parallel (diffusers handles CFG via guidance_scale internally)
- Sequence parallel (requires model-specific attention surgery)
- TeaCache / Cache-DiT (requires hooking into transformer blocks)
- Step-wise execution (continuous batching)
DiffusersAdapterPipeline
Bases: Module, DiffusionPipelineProfilerMixin
Black-box adapter that delegates full pipeline execution to a diffusers pipeline.
Usage::

    adapter = DiffusersAdapterPipeline(od_config=od_config)
    adapter.load_weights()  # calls DiffusionPipeline.from_pretrained()
    output = adapter.forward(req)

Step-wise execution is explicitly rejected because diffusers encapsulates the full denoising loop internally. For continuous batching mode, use a native pipeline instead.
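The rejection of step-wise execution can be illustrated with a minimal sketch. This is not the adapter's actual code: `BlackBoxAdapterSketch`, `denoise_step`, and `fake_pipeline` are hypothetical names introduced only for illustration, and the stand-in pipeline simply counts its internal loop iterations instead of denoising anything.

```python
class BlackBoxAdapterSketch:
    """Hypothetical stand-in for an adapter that only supports whole-run execution."""

    def __init__(self, pipeline):
        # `pipeline` is any callable that runs its entire denoising loop internally.
        self.pipeline = pipeline

    def forward(self, **call_kwargs):
        # Whole-request execution: one call in, one finished result out.
        return self.pipeline(**call_kwargs)

    def denoise_step(self, *args, **kwargs):
        # Hypothetical step-wise entry point. It is rejected because the wrapped
        # pipeline does not expose its loop one step at a time.
        raise NotImplementedError(
            "Step-wise execution is not supported by the black-box adapter; "
            "use a native pipeline for continuous batching."
        )


def fake_pipeline(prompt, num_inference_steps=4):
    # Stand-in for a diffusers pipeline: loops internally and returns only the end result.
    steps_run = 0
    for _ in range(num_inference_steps):
        steps_run += 1
    return {"prompt": prompt, "steps_run": steps_run}


adapter = BlackBoxAdapterSketch(fake_pipeline)
result = adapter.forward(prompt="a red bicycle", num_inference_steps=3)
```

The caller never sees intermediate state, which is why a scheduler that interleaves denoising steps across requests has nothing to hook into here.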
forward
forward(req: OmniDiffusionRequest) -> DiffusionOutput
Full delegation to diffusers pipeline.__call__().
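A rough sketch of what full delegation might look like: the request is translated into keyword arguments and handed to the pipeline in a single call. The fields of `SketchRequest` are assumptions (the real `OmniDiffusionRequest` may be shaped differently), and `build_call_kwargs`, `forward_sketch`, and `recording_pipeline` are hypothetical helpers. The keyword names `prompt`, `num_inference_steps`, and `guidance_scale` are ones commonly accepted by diffusers text-to-image pipelines.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class SketchRequest:
    """Hypothetical request; the real OmniDiffusionRequest fields may differ."""
    prompt: str
    num_inference_steps: int = 30
    guidance_scale: Optional[float] = None


def build_call_kwargs(req: SketchRequest) -> Dict[str, Any]:
    kwargs: Dict[str, Any] = {
        "prompt": req.prompt,
        "num_inference_steps": req.num_inference_steps,
    }
    if req.guidance_scale is not None:
        # CFG stays inside the pipeline; the adapter only passes the scale through.
        kwargs["guidance_scale"] = req.guidance_scale
    return kwargs


def forward_sketch(pipeline: Callable[..., Any], req: SketchRequest) -> Any:
    # A single call covers prompt encoding, the denoising loop, and decoding.
    return pipeline(**build_call_kwargs(req))


def recording_pipeline(**kwargs: Any) -> Dict[str, Any]:
    # Stand-in pipeline that echoes the keyword arguments it received.
    return kwargs


out = forward_sketch(
    recording_pipeline,
    SketchRequest(prompt="a lighthouse at dusk", guidance_scale=7.5),
)
```

Leaving `guidance_scale` out when it is unset lets the pipeline's own default apply, which keeps the wrapper free of per-model knowledge.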
load_weights
Load the diffusers pipeline via DiffusionPipeline.from_pretrained().
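As an illustration only, loading through `DiffusionPipeline.from_pretrained()` could be wrapped roughly as follows. `load_pipeline_sketch`, its injectable `loader` parameter, `fake_loader`, and the model identifier `some-org/some-model` are all hypothetical; the injection exists so the sketch can be exercised without installing diffusers or downloading weights. How the real adapter derives the model path and load options from `od_config` is not shown in this page.

```python
from typing import Any, Callable, Optional


def load_pipeline_sketch(
    model_id: str,
    loader: Optional[Callable[..., Any]] = None,
    **load_kwargs: Any,
) -> Any:
    if loader is None:
        # Imported lazily so the sketch can run without diffusers installed
        # whenever a custom loader is supplied.
        from diffusers import DiffusionPipeline

        loader = DiffusionPipeline.from_pretrained
    # from_pretrained resolves the concrete pipeline class from the checkpoint itself,
    # so no per-model code is needed at this point.
    return loader(model_id, **load_kwargs)


def fake_loader(model_id: str, **load_kwargs: Any) -> dict:
    # Stand-in loader that records what it was asked to load.
    return {"model_id": model_id, "load_kwargs": load_kwargs}


loaded = load_pipeline_sketch(
    "some-org/some-model",  # placeholder identifier, not a real checkpoint
    loader=fake_loader,
    use_safetensors=True,
)
```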