vllm_omni.diffusion.forward_context ¶
ForwardContext dataclass ¶
Forward context for diffusion models.
attn_metadata class-attribute instance-attribute ¶
attn_metadata: (
dict[str, AttentionMetadata]
| list[dict[str, AttentionMetadata]]
| None
) = None
omni_diffusion_config class-attribute instance-attribute ¶
omni_diffusion_config: OmniDiffusionConfig | None = None
sp_active property ¶
sp_active: bool
Returns True when sequence-parallel (SP) attention should be enabled.
- If _sp_plan hooks are applied: the result follows _sp_shard_depth (0 = outside the sharded region).
- If _sp_plan hooks are not applied: defaults to True when sequence_parallel_size > 1, since _sp_shard_depth is only meaningful within the _sp_plan hook mechanism.
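The two-branch rule for sp_active can be sketched as a small stand-alone dataclass. This is an illustration only, not the library's implementation: the class name `SpState` and the flag `sp_plan_hooks_applied` are hypothetical, and treating any depth greater than 0 as "inside the sharded region" is an assumption drawn from the note that 0 means outside.

```python
from dataclasses import dataclass


@dataclass
class SpState:
    """Hypothetical stand-in for the SP-related state of a forward context."""

    sequence_parallel_size: int = 1
    sp_plan_hooks_applied: bool = False  # assumed flag; not a documented name
    sp_shard_depth: int = 0  # 0 means "outside the sharded region"

    @property
    def sp_active(self) -> bool:
        if self.sp_plan_hooks_applied:
            # With hooks, the depth counter decides (assumed: depth > 0 is "inside").
            return self.sp_shard_depth > 0
        # Without hooks the depth counter carries no meaning, so use the configured size.
        return self.sequence_parallel_size > 1


# Hooks applied, currently outside the sharded region.
outside = SpState(sequence_parallel_size=4, sp_plan_hooks_applied=True, sp_shard_depth=0)
# Hooks applied, inside a sharded region.
inside = SpState(sequence_parallel_size=4, sp_plan_hooks_applied=True, sp_shard_depth=1)
# No hooks: decided purely by the parallel size.
no_hooks = SpState(sequence_parallel_size=4)
```

Under these assumptions, `outside.sp_active` is False even though the parallel size is 4, while `no_hooks.sp_active` is True.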
create_forward_context ¶
create_forward_context(
vllm_config: VllmConfig | None = None,
omni_diffusion_config: OmniDiffusionConfig
| None = None,
attn_metadata: dict[str, AttentionMetadata]
| list[dict[str, AttentionMetadata]]
| None = None,
split_text_embed_in_sp: bool = False,
denoise_step_idx: int | None = None,
)
get_ulysses_mode ¶
Resolve the Ulysses-SP mode from the current ForwardContext.
Returns default when ForwardContext is unavailable or the diffusion config is not set.
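The fallback behaviour described above might look roughly like the sketch below. Everything in it is a stand-in: `ToyConfig`, `ToyContext`, the `ulysses_mode` field, and the mode strings are hypothetical, and the real function reads the current ForwardContext itself rather than taking it as an argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical stand-in for the diffusion config."""

    ulysses_mode: str = "mode_a"  # hypothetical field name and value


@dataclass
class ToyContext:
    """Hypothetical stand-in for ForwardContext."""

    omni_diffusion_config: Optional[ToyConfig] = None


def resolve_ulysses_mode(ctx: Optional[ToyContext], default: str) -> str:
    # No context available, or the context has no diffusion config: use the default.
    if ctx is None or ctx.omni_diffusion_config is None:
        return default
    return ctx.omni_diffusion_config.ulysses_mode


no_context = resolve_ulysses_mode(None, default="fallback")
no_config = resolve_ulysses_mode(ToyContext(), default="fallback")
configured = resolve_ulysses_mode(ToyContext(ToyConfig("mode_b")), default="fallback")
```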
override_forward_context ¶
override_forward_context(
forward_context: ForwardContext | None,
)
A context manager that overrides the current forward context for the duration of a specific forward pass.
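The override-and-restore pattern can be sketched with the standard library alone. This is a minimal sketch, assuming the context is kept in a single slot; how the real module stores it is not stated here, and `override_context`, `current_context`, and the use of `ContextVar` are illustrative choices.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Any, Iterator, Optional

# Single slot holding the active context (assumed storage mechanism).
_ctx_var = ContextVar("forward_context", default=None)


def current_context() -> Optional[Any]:
    return _ctx_var.get()


@contextmanager
def override_context(ctx: Optional[Any]) -> Iterator[None]:
    # Install the new context and remember how to undo it.
    token = _ctx_var.set(ctx)
    try:
        yield
    finally:
        # Restore the previous context even if the forward pass raised.
        _ctx_var.reset(token)


seen = []
with override_context("outer"):
    seen.append(current_context())
    with override_context(None):  # passing None clears the context temporarily
        seen.append(current_context())
    seen.append(current_context())
seen.append(current_context())
```

The `try`/`finally` is the important part: the previous context comes back on exit regardless of how the block ends, so nested overrides unwind in order.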
set_forward_context ¶
set_forward_context(
vllm_config: VllmConfig | None = None,
omni_diffusion_config: OmniDiffusionConfig
| None = None,
attn_metadata: dict[str, AttentionMetadata]
| list[dict[str, AttentionMetadata]]
| None = None,
split_text_embed_in_sp: bool = False,
denoise_step_idx: int | None = None,
)
A context manager that stores the current forward context, which can carry attention metadata, split_text_embed_in_sp, and similar per-pass state. It is also the place to inject logic common to every model forward pass.
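A typical use of such a context manager is to wrap the model call so that inner layers can read per-pass state without it being threaded through every signature. The sketch below imitates that flow with stand-ins only: `ToyForwardContext`, `with_forward_context`, `attention_layer`, and the `_state` dict are hypothetical and do not reflect the library's internals.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ToyForwardContext:
    """Hypothetical stand-in carrying a few of the documented fields."""

    attn_metadata: Optional[dict] = None
    split_text_embed_in_sp: bool = False
    denoise_step_idx: Optional[int] = None


_state = {"ctx": None}  # assumed storage for the active context


@contextmanager
def with_forward_context(**fields) -> Iterator[ToyForwardContext]:
    previous = _state["ctx"]
    _state["ctx"] = ToyForwardContext(**fields)
    try:
        yield _state["ctx"]
    finally:
        _state["ctx"] = previous  # restore whatever was active before


def attention_layer(x: list) -> dict:
    # A layer deep in the model reads the context instead of taking extra arguments.
    ctx = _state["ctx"]
    if ctx is None:
        raise RuntimeError("forward context is not set")
    return {"n": len(x), "step": ctx.denoise_step_idx, "split": ctx.split_text_embed_in_sp}


with with_forward_context(split_text_embed_in_sp=True, denoise_step_idx=3):
    result = attention_layer([0.1, 0.2])
after = _state["ctx"]
```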
set_forward_context_denoise_step_idx ¶
set_forward_context_denoise_step_idx(
step_idx: int | None,
) -> None
Set the current diffusion denoise step on the active ForwardContext.
set_forward_context_ref_latent ¶
Set the per-request reference latent on the active ForwardContext.
Used by img2img-capable DiT models (e.g. Ming-flash-omni-2.0) so the transformer can read the reference latent from request scope instead of module instance state.
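The motivation, reading the reference latent from request scope rather than from module instance state, can be illustrated with a toy model. All names here (`RequestContext`, `ToyTransformer`, the additive "conditioning") are hypothetical, and the context is passed explicitly to keep the sketch short, whereas the real model would look it up from the active ForwardContext.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestContext:
    """Hypothetical per-request context holding an optional reference latent."""

    ref_latent: Optional[List[float]] = None


class ToyTransformer:
    """Keeps no per-request state on the instance, so requests cannot leak into each other."""

    def forward(self, latent: List[float], ctx: RequestContext) -> List[float]:
        if ctx.ref_latent is None:
            return list(latent)  # no reference image for this request
        # Toy conditioning: element-wise add of the reference latent.
        return [a + b for a, b in zip(latent, ctx.ref_latent)]


model = ToyTransformer()
with_ref = model.forward([1.0, 2.0], RequestContext(ref_latent=[0.5, 0.5]))
without_ref = model.forward([1.0, 2.0], RequestContext())
```

Because the same `model` instance serves both calls, the second request is unaffected by the first one's reference latent.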