
vllm_omni.distributed.kv_transfer.monkey_patch

Monkey-patch vLLM's MooncakeConnector to fix a request-ID mismatch in prefill/decode (PD) disaggregation.

vLLM's InputProcessor appends a random suffix to each request ID. The prefill engine stores KV under its own suffixed ID, but the decode engine generates a different suffix for the same request, so the ID it holds does not match the one the KV was stored under. This patch threads remote_request_id through kv_transfer_params so the decode side references the correct KV entry.
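The mismatch described above can be reproduced in a few lines. This is a minimal sketch, not vLLM code: `add_random_suffix` and the in-memory `kv_store` dict are hypothetical stand-ins for InputProcessor's suffixing and the KV store, and only the `remote_request_id` key is taken from the description above.

```python
import uuid


def add_random_suffix(request_id: str) -> str:
    """Hypothetical stand-in for the random suffix that InputProcessor appends."""
    return f"{request_id}-{uuid.uuid4().hex}"


# Hypothetical in-memory KV store, keyed by the suffixed request ID.
kv_store: dict[str, str] = {}

# Prefill side: KV is stored under the prefill engine's suffixed ID.
prefill_id = add_random_suffix("req-1")
kv_store[prefill_id] = "kv-blocks-for-req-1"

# The prefill ID travels to the decode side inside kv_transfer_params.
kv_transfer_params = {"remote_request_id": prefill_id}

# Decode side: a fresh suffix produces a different ID for the same request.
decode_id = add_random_suffix("req-1")

# Looking up by the decode engine's own ID misses the stored entry ...
lookup_by_local_id = kv_store.get(decode_id)
# ... while the threaded remote_request_id finds it.
lookup_by_remote_id = kv_store.get(kv_transfer_params["remote_request_id"])
```

In this sketch the first lookup returns None and the second returns the stored entry, which is the behavior the patch is meant to restore on the decode side.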

logger module-attribute

logger = getLogger(__name__)

PatchedRecvReqMeta dataclass

Receive-request metadata carrying the prefill engine's request ID.

kv_transfer_params instance-attribute

kv_transfer_params: dict[str, Any]

local_block_ids instance-attribute

local_block_ids: list[int]

remote_request_id instance-attribute

remote_request_id: str

request_id instance-attribute

request_id: str
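For orientation, a dataclass with the four attributes listed above might look like the following. This is an illustrative sketch only: the class name `RecvReqMetaSketch`, the field order, and the `build_meta` helper (including its fallback to the local ID when no remote ID is present) are assumptions, not the module's actual definition.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class RecvReqMetaSketch:
    """Illustrative stand-in mirroring the documented attributes."""

    request_id: str                     # decode-side (local) request ID
    remote_request_id: str              # prefill-side request ID
    local_block_ids: list[int]          # block IDs on the receiving side
    kv_transfer_params: dict[str, Any]  # parameters carried with the request


def build_meta(
    request_id: str,
    local_block_ids: list[int],
    kv_transfer_params: dict[str, Any],
) -> RecvReqMetaSketch:
    """Hypothetical helper: read the prefill ID out of kv_transfer_params.

    Falling back to the local ID when the key is absent is an assumption
    made for this sketch.
    """
    remote_id = kv_transfer_params.get("remote_request_id", request_id)
    return RecvReqMetaSketch(
        request_id=request_id,
        remote_request_id=remote_id,
        local_block_ids=local_block_ids,
        kv_transfer_params=kv_transfer_params,
    )


meta = build_meta(
    request_id="req-1-decode",
    local_block_ids=[0, 1, 2],
    kv_transfer_params={"remote_request_id": "req-1-prefill"},
)
```

Keeping both IDs on the metadata object lets the receiving side track the request under its local ID while addressing the remote KV entry by the prefill engine's ID.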

apply_mooncake_connector_patch

apply_mooncake_connector_patch() -> bool

Replace vLLM's MooncakeConnector with the patched version.
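The general pattern of swapping a class on a module and reporting the outcome as a bool can be sketched as below. Everything here is hypothetical: `OriginalConnector`, `PatchedConnector`, `apply_patch_sketch`, and the fake module are illustration-only names, and the meaning given to the return value (True when the patched class is in place, False when the target is missing) is an assumption, since the page above does not document it.

```python
from types import SimpleNamespace
from typing import Any


class OriginalConnector:
    """Hypothetical stand-in for an unpatched connector."""

    def lookup_key(self, request_id: str, kv_transfer_params: dict[str, Any]) -> str:
        # Uses the local request ID, which may not match the stored entry.
        return request_id


class PatchedConnector(OriginalConnector):
    """Hypothetical patched variant that prefers the remote request ID."""

    def lookup_key(self, request_id: str, kv_transfer_params: dict[str, Any]) -> str:
        return kv_transfer_params.get("remote_request_id", request_id)


def apply_patch_sketch(module: Any) -> bool:
    """Replace module.Connector with PatchedConnector.

    Assumed contract for this sketch: return False if the target attribute
    does not exist, True if the patched class is (or already was) installed.
    """
    current = getattr(module, "Connector", None)
    if current is None:
        return False
    if current is not PatchedConnector:
        module.Connector = PatchedConnector
    return True


# A SimpleNamespace plays the role of the module being patched.
fake_module = SimpleNamespace(Connector=OriginalConnector)
patched = apply_patch_sketch(fake_module)
key = fake_module.Connector().lookup_key("local-id", {"remote_request_id": "remote-id"})
```

Making the function idempotent, as in this sketch, means calling it more than once is harmless; whether the real apply_mooncake_connector_patch behaves this way is not stated in the source.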