Router Logic
- class vllm_router.routers.routing_logic.SessionRouter(*args, **kwargs)
Bases: RoutingInterface
Route the request to the appropriate engine URL based on the session key in the request headers.
- route_request(endpoints: List[EndpointInfo], engine_stats: Dict[str, EngineStats], request_stats: Dict[str, RequestStats], request: fastapi.Request) → str
Route the request to the appropriate engine URL based on the session ID in the request headers. If the headers contain no session ID, the router picks the engine with the lowest QPS (queries per second).
- Parameters:
endpoints (List[EndpointInfo]) – The list of engine URLs
engine_stats (Dict[str, EngineStats]) – The engine stats indicating the ‘physical’ load of each engine
request_stats (Dict[str, RequestStats]) – The request stats indicating the request-level performance of each engine
request (Request) – The incoming request
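The behavior described above (sticky routing by session ID, with a lowest-QPS fallback) can be sketched in plain Python. This is a minimal illustration and not the vllm_router implementation: the function `pick_engine_url`, the header name `x-session-id`, and the use of plain dictionaries in place of `EndpointInfo`, `RequestStats`, and `fastapi.Request` are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Optional


def pick_engine_url(
    urls: List[str],
    qps_by_url: Dict[str, float],
    headers: Dict[str, str],
    session_key: str = "x-session-id",  # hypothetical header name
) -> str:
    """Return a sticky URL for a session, or the lowest-QPS URL otherwise."""
    session_id: Optional[str] = headers.get(session_key)
    if session_id is None:
        # No session ID: fall back to the endpoint with the lowest QPS.
        # URLs with no recorded QPS are treated as idle (assumption).
        return min(urls, key=lambda url: qps_by_url.get(url, 0.0))

    # Use a stable hash so the same session ID maps to the same URL across
    # calls and processes (the built-in hash() is salted per process for str).
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    ordered = sorted(urls)  # make the mapping independent of input order
    return ordered[int.from_bytes(digest[:8], "big") % len(ordered)]
```

Note that simple modulo hashing, as used here, remaps many sessions whenever the set of endpoints changes; a consistent-hashing ring reduces that churn at the cost of extra bookkeeping.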
- class vllm_router.routers.routing_logic.RoundRobinRouter(*args, **kwargs)
Bases: RoutingInterface
- route_request(endpoints: List[EndpointInfo], engine_stats: Dict[str, EngineStats], request_stats: Dict[str, RequestStats], request: fastapi.Request) → str
Route the request to the appropriate engine URL using a simple round-robin algorithm.
- Parameters:
endpoints (List[EndpointInfo]) – The list of engine URLs
engine_stats (Dict[str, EngineStats]) – The engine stats indicating the ‘physical’ load of each engine
request_stats (Dict[str, RequestStats]) – The request stats indicating the request-level performance of each engine
request (Request) – The incoming request
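A simple round-robin selection, as described for `RoundRobinRouter`, can be sketched as follows. The class `RoundRobinPicker` and its `pick` method are hypothetical names used only for illustration, and plain URL strings stand in for `EndpointInfo`; the actual router may track its position differently.

```python
from typing import List


class RoundRobinPicker:
    """Hypothetical helper that cycles through engine URLs in sorted order."""

    def __init__(self) -> None:
        self._next_index = 0  # count of requests routed so far

    def pick(self, urls: List[str]) -> str:
        # Sort so the rotation order does not depend on how the caller
        # happens to order the endpoint list on each request.
        ordered = sorted(urls)
        chosen = ordered[self._next_index % len(ordered)]
        self._next_index += 1
        return chosen
```

Because the counter is taken modulo the current list length, the picker keeps working when endpoints are added or removed between calls, although the rotation may skip or repeat an endpoint at that moment.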