vllm.v1.attention.backends.mla
Modules:
- flashinfer_mla_sparse – FlashInfer sparse MLA attention backend.
- flashinfer_mla_sparse_sm120 – SM120 implementation variant for FLASHINFER_MLA_SPARSE_SM120.
- flashmla_sparse
- indexer
- prefill
- rocm_aiter_mla
- rocm_aiter_mla_sparse
- sparse_swa
- sparse_utils – Utility functions for sparse MLA backends.
- tokenspeed_mla – TokenSpeed CuTe DSL MLA decode backend (Blackwell, FP8 KV cache only).