CRD-based configuration (recommended)#
This is the controller for the Router CRD. It is responsible for creating and updating the ConfigMap for the vllm_router.
Description#
The Router Controller is a Kubernetes controller that manages the Router custom resource. It watches for Router resources and creates a ConfigMap for the vllm_router.
Getting Started#
Prerequisites#
Go version v1.22.0+
Docker version 17.03+
kubectl version v1.11.3+
Access to a Kubernetes v1.11.3+ cluster
To Deploy on the cluster#
Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/router-controller:tag
Note
The image must be published to the registry you specified, and your working environment must be able to pull it. If the commands above fail, check that you have the required permissions on the registry.
Install the CRDs into the cluster:
make install
Deploy the Manager to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/router-controller:tag
Note
If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.
Create instances of your solution. You can apply the samples (examples) from config/samples:
kubectl apply -k config/samples/
Note
Ensure that the samples have default values so that you can test them as-is.
To Uninstall#
Delete the instances (CRs) from the cluster:
kubectl delete -k config/samples/
Delete the APIs (CRDs) from the cluster:
make uninstall
Delete the Manager from the cluster:
make undeploy
Project Distribution#
Build the installer#
Build the installer for the image built and published in the registry:
make build-installer IMG=<some-registry>/router-controller:tag
Note
The makefile target mentioned above generates an install.yaml file in the dist directory. This file contains all the resources built with Kustomize, which are necessary to install this project without its dependencies.
Using the installer
Users can install the project by running kubectl apply -f against the generated install.yaml, for example:
kubectl apply -f https://raw.githubusercontent.com/<org>/router-controller/<tag or branch>/dist/install.yaml
StaticRoute CRD#
The StaticRoute CRD allows you to configure the vllm_router with static backends and models. The controller reads the CRD and creates a ConfigMap that can be used by the vllm_router with the --dynamic-config-json option.
One example is shown below:
apiVersion: production-stack.vllm.ai/v1alpha1
kind: StaticRoute
metadata:
  name: staticroute-sample
spec:
  # Service discovery method
  serviceDiscovery: static
  # Routing logic
  routingLogic: roundrobin
  # Comma-separated list of backend URLs
  staticBackends: "http://localhost:9001,http://localhost:9002,http://localhost:9003"
  # Comma-separated list of model names
  staticModels: "facebook/opt-125m,meta-llama/Llama-3.1-8B-Instruct,facebook/opt-125m"
  # Name of the vllm_router to configure
  routerRef:
    kind: Service
    apiVersion: v1
    name: vllm-router
    namespace: default
  # Optional: Name of the ConfigMap to create
  configMapName: vllm-router-config
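The two comma-separated fields are easier to reason about once they are split. The Python sketch below assumes that the n-th entry of staticModels names the model served by the n-th entry of staticBackends. That pairing is inferred from the example (three URLs, three model names) rather than stated by the CRD, and pair_backends is a hypothetical helper, not part of the controller.

```python
def pair_backends(static_backends: str, static_models: str) -> list[tuple[str, str]]:
    """Split the two comma-separated spec fields and pair them by position.

    Assumption: entry i of staticModels is the model served by entry i of
    staticBackends. Raises ValueError if the two lists differ in length.
    """
    backends = [b.strip() for b in static_backends.split(",") if b.strip()]
    models = [m.strip() for m in static_models.split(",") if m.strip()]
    if len(backends) != len(models):
        raise ValueError(
            f"{len(backends)} backends but {len(models)} models; counts must match"
        )
    return list(zip(backends, models))


# Values copied from the StaticRoute example above.
pairs = pair_backends(
    "http://localhost:9001,http://localhost:9002,http://localhost:9003",
    "facebook/opt-125m,meta-llama/Llama-3.1-8B-Instruct,facebook/opt-125m",
)
for url, model in pairs:
    print(f"{url} -> {model}")
```

Running a check like this before applying a StaticRoute can catch a list that is one entry short, which is easy to miss in a long comma-separated string.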
How it works#
The controller watches for StaticRoute resources.
When a StaticRoute is created or updated, the controller creates or updates a ConfigMap with the dynamic configuration.
The ConfigMap contains a dynamic_config.json file with the following structure:
{
  "service_discovery": "static",
  "routing_logic": "roundrobin",
  "static_backends": "http://localhost:9001,http://localhost:9002,http://localhost:9003",
  "static_models": "facebook/opt-125m,meta-llama/Llama-3.1-8B-Instruct,facebook/opt-125m"
}
The controller checks the health endpoint of the vllm_router service referenced by routerRef to verify that the configuration is valid.
The vllm_router should be configured to use the ConfigMap with the --dynamic-config-json option:
containers:
- name: vllm-router
  image: vllm-router:latest
  args:
  # The flag and its value are separate list items so that each one
  # reaches the router as its own command-line argument.
  - "--dynamic-config-json"
  - "/etc/vllm-router/dynamic_config.json"
  volumeMounts:
  - name: config-volume
    mountPath: /etc/vllm-router
volumes:
- name: config-volume
  configMap:
    name: vllm-router-config
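To make the relationship between the StaticRoute spec and the mounted file concrete, here is a minimal Python sketch of the translation. It is not the controller's code. The camelCase to snake_case key mapping is read off the two examples in this section, the function names are hypothetical, and the namespace argument is an assumption, since this document does not say which namespace the ConfigMap is created in.

```python
import json

# Assumed mapping from StaticRoute spec fields to dynamic_config.json keys,
# taken from the YAML and JSON examples in this section.
SPEC_TO_CONFIG_KEY = {
    "serviceDiscovery": "service_discovery",
    "routingLogic": "routing_logic",
    "staticBackends": "static_backends",
    "staticModels": "static_models",
}


def render_dynamic_config(spec: dict) -> str:
    """Return the dynamic_config.json text for a StaticRoute spec."""
    config = {
        config_key: spec[spec_key]
        for spec_key, config_key in SPEC_TO_CONFIG_KEY.items()
        if spec_key in spec
    }
    return json.dumps(config, indent=2)


def render_configmap(spec: dict, namespace: str) -> dict:
    """Build a ConfigMap manifest (as a dict) that holds dynamic_config.json.

    This sketch requires configMapName; the name used when the field is
    omitted is not documented here, so no default is guessed.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": spec["configMapName"], "namespace": namespace},
        "data": {"dynamic_config.json": render_dynamic_config(spec)},
    }


# Spec values copied from the StaticRoute example; routerRef is omitted
# because it does not appear in dynamic_config.json.
spec = {
    "serviceDiscovery": "static",
    "routingLogic": "roundrobin",
    "staticBackends": "http://localhost:9001,http://localhost:9002,http://localhost:9003",
    "staticModels": "facebook/opt-125m,meta-llama/Llama-3.1-8B-Instruct,facebook/opt-125m",
    "configMapName": "vllm-router-config",
}
configmap = render_configmap(spec, namespace="default")
print(configmap["data"]["dynamic_config.json"])
```

The data key is named dynamic_config.json so that, once the ConfigMap is mounted at /etc/vllm-router, the file appears at the path passed to --dynamic-config-json.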
Check Status#
The StaticRoute resource has the following status fields:
configMapRef: The name of the ConfigMap that was created.
lastAppliedTime: The time when the configuration was last applied.
conditions: A list of conditions that represent the latest available observations of the StaticRoute's state.
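The status fields can also be inspected programmatically. The Python sketch below works on a StaticRoute object that has already been loaded as a dict, for example from the JSON output of kubectl get with -o json. It assumes that each entry in conditions follows the usual Kubernetes shape with type and status keys. The condition type "Ready" and the timestamp in the sample object are placeholders for illustration, not values documented for this controller.

```python
from typing import Optional


def get_condition(static_route: dict, condition_type: str) -> Optional[dict]:
    """Return the status condition with the given type, or None if absent."""
    conditions = static_route.get("status", {}).get("conditions", [])
    for condition in conditions:
        if condition.get("type") == condition_type:
            return condition
    return None


def is_condition_true(static_route: dict, condition_type: str) -> bool:
    """True only if the condition exists and its status is the string "True"."""
    condition = get_condition(static_route, condition_type)
    return condition is not None and condition.get("status") == "True"


# Illustrative object: the condition type and timestamp are placeholders.
example = {
    "status": {
        "configMapRef": "vllm-router-config",
        "lastAppliedTime": "2025-01-01T00:00:00Z",
        "conditions": [{"type": "Ready", "status": "True"}],
    }
}
print(is_condition_true(example, "Ready"))
```

A resource whose status has not been populated yet is treated as not ready rather than raising an error, which is convenient when polling right after kubectl apply.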