Router Logic

class vllm_router.routers.routing_logic.SessionRouter(*args, **kwargs)

Bases: RoutingInterface

Routes each request to an engine URL based on the session key in the request headers.

route_request(endpoints: List[EndpointInfo], engine_stats: Dict[str, EngineStats], request_stats: Dict[str, RequestStats], request: fastapi.Request) → str

Route the request to the appropriate engine URL using the session ID in the request headers. If the request headers contain no session ID, the router picks the engine with the lowest QPS (queries per second).

Parameters:
  • endpoints (List[EndpointInfo]) – The list of engine URLs

  • engine_stats (Dict[str, EngineStats]) – The engine stats indicating the ‘physical’ load of each engine

  • request_stats (Dict[str, RequestStats]) – The request stats indicating the request-level performance of each engine

  • request (Request) – The incoming request
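The behavior described above (sticky routing by session ID, with a lowest-QPS fallback) can be illustrated with a minimal sketch. This is not the SessionRouter implementation: Endpoint, Stats, route_by_session, and the "x-session-id" header name are hypothetical stand-ins, and the hash-modulo mapping is an assumed way to keep a session pinned to one engine.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Mapping


@dataclass
class Endpoint:
    """Hypothetical stand-in for EndpointInfo; only the URL is modeled."""
    url: str


@dataclass
class Stats:
    """Hypothetical stand-in for RequestStats; only QPS is modeled."""
    qps: float


def route_by_session(
    endpoints: List[Endpoint],
    request_stats: Dict[str, Stats],
    headers: Mapping[str, str],
    session_key: str = "x-session-id",  # assumed header name
) -> str:
    """Return the engine URL for a request, keyed by its session ID."""
    if not endpoints:
        raise ValueError("no endpoints available")

    # Sort so the mapping does not depend on the order of the input list.
    urls = sorted(e.url for e in endpoints)

    session_id = headers.get(session_key)
    if session_id is None:
        # No session ID: fall back to the engine with the lowest QPS.
        # Engines with no recorded stats are treated as idle (an assumption).
        return min(
            urls,
            key=lambda u: request_stats[u].qps if u in request_stats else 0.0,
        )

    # A stable hash keeps the same session on the same engine
    # for as long as the endpoint list is unchanged.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return urls[int.from_bytes(digest[:8], "big") % len(urls)]
```

With plain modulo hashing, adding or removing an engine remaps most sessions; a consistent-hashing scheme would reduce that churn at the cost of extra bookkeeping.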

class vllm_router.routers.routing_logic.RoundRobinRouter(*args, **kwargs)

Bases: RoutingInterface

route_request(endpoints: List[EndpointInfo], engine_stats: Dict[str, EngineStats], request_stats: Dict[str, RequestStats], request: fastapi.Request) → str

Route the request to the appropriate engine URL using a simple round-robin algorithm.

Parameters:
  • endpoints (List[EndpointInfo]) – The list of engine URLs

  • engine_stats (Dict[str, EngineStats]) – The engine stats indicating the ‘physical’ load of each engine

  • request_stats (Dict[str, RequestStats]) – The request stats indicating the request-level performance of each engine

  • request (Request) – The incoming request
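A round-robin policy like the one described above can be sketched as a counter that advances on every request. The class below is illustrative only: RoundRobinSketch and its route method are hypothetical names, it works on plain URL strings rather than EndpointInfo objects, and it ignores the stats arguments that the real interface accepts.

```python
from typing import List


class RoundRobinSketch:
    """Hypothetical round-robin selector over a list of engine URLs."""

    def __init__(self) -> None:
        # Index of the next engine to use; grows without bound and is
        # reduced modulo the current number of engines on each call.
        self._next_index = 0

    def route(self, urls: List[str]) -> str:
        if not urls:
            raise ValueError("no endpoints available")
        # Sort so the rotation order is stable even if the caller's
        # list arrives in a different order between requests.
        ordered = sorted(urls)
        chosen = ordered[self._next_index % len(ordered)]
        self._next_index += 1
        return chosen
```

This sketch is not thread-safe; a router serving concurrent requests would need to guard the counter or accept occasional uneven distribution.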