`vllm.v1.attention.backends.mla`

Modules:

flashinfer_mla_sparse – FlashInfer sparse MLA attention backend.
flashinfer_mla_sparse_sm120 – SM120 implementation variant for FLASHINFER_MLA_SPARSE_SM120.
flashmla_sparse
indexer
prefill
rocm_aiter_mla
rocm_aiter_mla_sparse
sparse_swa
sparse_utils – Utility functions for sparse MLA backends.
tokenspeed_mla – TokenSpeed CuTe DSL MLA decode backend (Blackwell, FP8 KV cache only).
triton_mla
