vllm_omni.diffusion.attention.backends.sage_attn
SageAttentionBackend
Bases: AttentionBackend
SageAttentionImpl
Bases: AttentionImpl
forward_cuda
forward_cuda(
    query: Tensor,
    key: Tensor,
    value: Tensor,
    attn_metadata: AttentionMetadata = None,
) -> Tensor
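
The page gives only the signature, so the following is a minimal sketch of the computation that an attention forward pass of this shape conventionally approximates: softmax(QKᵀ / √d) · V for a single head. It is not the SageAttention kernel and not this module's implementation. The function name `reference_attention`, the nested-list `Matrix` type, and the choice to ignore `attn_metadata` are all assumptions made for illustration; the real method takes `Tensor` arguments whose layout is not documented here.

```python
import math
from typing import List, Optional

# Hypothetical stand-in for a 2-D tensor of shape (sequence_length, head_dim).
Matrix = List[List[float]]


def reference_attention(
    query: Matrix,
    key: Matrix,
    value: Matrix,
    attn_metadata: Optional[dict] = None,  # accepted for signature parity, ignored here
) -> Matrix:
    """Plain scaled dot-product attention for one head (illustrative only)."""
    head_dim = len(query[0])
    scale = 1.0 / math.sqrt(head_dim)
    output: Matrix = []
    for q in query:
        # Scaled similarity of this query row against every key row.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in key]
        # Subtract the max before exponentiating for numerical stability.
        max_score = max(scores)
        weights = [math.exp(s - max_score) for s in scores]
        total = sum(weights)
        # Weighted average of the value rows.
        row = [
            sum(w * v[j] for w, v in zip(weights, value)) / total
            for j in range(len(value[0]))
        ]
        output.append(row)
    return output
```

A reference like this can be useful as a correctness baseline when comparing the output of an optimized backend, within a tolerance suited to that backend's numerical precision.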