
vllm_omni.diffusion.attention.parallel.ring

logger module-attribute

logger = init_logger(__name__)

RingParallelAttention

Ring sequence-parallel strategy.

This strategy prepares inputs for Ring Attention. Key responsibilities:

- Concatenate joint_query (Text) to query (Image) if present.
- Keep joint_key/value separate in metadata for the Ring kernel to handle as static prefix.
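The two responsibilities above can be illustrated with a minimal sketch. It is not the library's implementation: `SketchMetadata` and `prepare_ring_inputs` are hypothetical names, plain Python lists of token vectors stand in for tensors, and appending the text tokens after the image tokens is an assumption, since the docstring does not state the order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchMetadata:
    """Hypothetical stand-in for AttentionMetadata; field names are assumptions."""
    joint_query: Optional[list] = None
    joint_key: Optional[list] = None
    joint_value: Optional[list] = None


def prepare_ring_inputs(query, key, value, meta):
    """Fold the text query into the image query; leave joint K/V untouched.

    Each of query/key/value is a list of token vectors (the sequence axis).
    """
    if meta is not None and meta.joint_query is not None:
        # Assumption: text tokens go after the image tokens.
        query = query + meta.joint_query
    # joint_key / joint_value are deliberately NOT concatenated here: they
    # stay in the metadata so the attention kernel can treat them as a
    # static prefix that every rank sees in full.
    return query, key, value, meta
```

In this sketch only the query grows; the key and value shards keep their per-rank length, which is what lets the joint key/value be handled once per rank rather than being passed around the ring.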

attn_backend_pref instance-attribute

attn_backend_pref = attn_backend_pref

enabled property

enabled: bool

name property

name: str

post_attention

post_attention(
    attn_output: Tensor,
    ctx: ParallelAttentionContext | None,
) -> Tensor

pre_attention

pre_attention(
    query: Tensor,
    key: Tensor,
    value: Tensor,
    attn_metadata: AttentionMetadata | None,
)

run_attention

run_attention(
    query: Tensor,
    key: Tensor,
    value: Tensor,
    attn_metadata: AttentionMetadata | None,
    softmax_scale: float | None = None,
    causal: bool = False,
) -> Tensor

Run the actual Ring Attention kernel.
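For readers unfamiliar with the technique, the following is a single-process sketch of the general ring attention idea, not this module's kernel. Each rank keeps its own query shard, visits every key/value shard once (simulated here by indexing rather than by real communication), and merges the partial results through their log-sum-exp so the outcome matches attention over the full sequence. The function names, the `prefix_k`/`prefix_v` parameters, the explicit `scale` argument, and the use of plain lists are all assumptions made for illustration; causal masking is omitted.

```python
import math


def block_attention(q, keys, values, scale):
    """Attention of one query vector over one K/V block.

    Returns (out, lse): the block-local softmax output and the
    log-sum-exp of the scaled scores, which is needed to merge blocks.
    """
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total
           for d in range(dim)]
    return out, m + math.log(total)


def merge(out_a, lse_a, out_b, lse_b):
    """Combine two partial results as if softmax had seen both blocks."""
    m = max(lse_a, lse_b)
    wa, wb = math.exp(lse_a - m), math.exp(lse_b - m)
    out = [(wa * a + wb * b) / (wa + wb) for a, b in zip(out_a, out_b)]
    return out, m + math.log(wa + wb)


def ring_attention(q_shards, k_shards, v_shards,
                   prefix_k=None, prefix_v=None, scale=1.0):
    """Simulate non-causal ring attention across len(q_shards) ranks."""
    world = len(q_shards)
    results = []
    for rank in range(world):
        rank_out = []
        for q in q_shards[rank]:
            acc = None
            if prefix_k:
                # Static prefix: attended once, never rotated.
                acc = block_attention(q, prefix_k, prefix_v, scale)
            for step in range(world):
                # At each step the rank holds the shard that started
                # `step` positions behind it on the ring.
                src = (rank - step) % world
                part = block_attention(q, k_shards[src], v_shards[src], scale)
                acc = part if acc is None else merge(*acc, *part)
            rank_out.append(acc[0])
        results.append(rank_out)
    return results
```

A real implementation would overlap the send/receive of the next key/value shard with the computation on the current one; the merge step is what makes that block-by-block order valid.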