Azure Kubernetes Service#
Introduction#
This tutorial uses a deployment script (entry_point.sh) to automatically configure an AKS cluster for LLM inference.
Before you begin, make sure the Azure CLI is installed and logged in, and that your region is properly configured.
You must have the following dependencies installed:
- az (Azure Command-Line Interface)
- kubectl (Kubernetes command-line tool)
- helm (Kubernetes package manager)
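A quick way to confirm the three tools are on your PATH is sketched below. The function name check_dependencies is hypothetical and not part of the deployment script; it only checks that each command can be found, not that its version is suitable.

```shell
# Report any required CLI tool that is not on PATH.
# Prints "missing: <tool>" for each absent tool and returns non-zero if any are missing.
# check_dependencies is a hypothetical helper name, not part of entry_point.sh.
check_dependencies() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (commented out so that sourcing this snippet has no side effects):
# check_dependencies az kubectl helm || echo "Install the tools listed above first."
```

Running the commented call prints nothing when all three tools are installed.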
Additionally, ensure that the following Azure services are set up:
- Resource Groups for managing resources
- AKS cluster with proper networking configuration
- Azure Files or Azure Managed Disks for persistent storage
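The login and resource group prerequisites can be verified from the command line. The sketch below assumes the resource group name is passed as an argument; check_azure_prereqs is a hypothetical helper, and it relies only on `az account show` (which fails when no account is logged in) and `az group exists` (which prints true or false).

```shell
# Verify that the Azure CLI is logged in and that a resource group exists.
# check_azure_prereqs is a hypothetical helper; the argument is your resource group name.
check_azure_prereqs() {
  local rg="$1"
  # `az account show` exits non-zero when no account is logged in.
  if ! az account show >/dev/null 2>&1; then
    echo "not logged in: run 'az login'"
    return 1
  fi
  # `az group exists` prints "true" or "false".
  if [ "$(az group exists --name "$rg")" != "true" ]; then
    echo "resource group not found: $rg"
    return 1
  fi
  echo "ok: $rg"
}

# Example call:
# check_azure_prereqs my-resource-group
```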
Steps to Follow#
1. Deploy AKS vLLM Stack#
1.1 Modify the Configuration#
Before running the deployment, ensure that the configuration file production_stack_specification.yaml is properly set up.
You need to configure:
- servingEngineSpec: Define the model repository, resource requests, and storage settings.
- routerSpec: Set up routing resource limits and requests.
- Persistent Storage: If using Azure Files, ensure that the persistent volume configuration matches your storage needs.
Modify these fields as needed to match your cluster requirements.
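To make the fields above concrete, the sketch below writes an illustrative specification to a file of your choice. Everything in it is an assumption for demonstration: the field names under servingEngineSpec and routerSpec follow the layout commonly used by the vLLM production stack Helm chart, and the model, image, and resource values are placeholders. Compare it against the production_stack_specification.yaml shipped with your version before relying on any field.

```shell
# Write an illustrative specification file to the path given as $1.
# write_example_spec is a hypothetical helper. All field names and values below
# are assumptions; verify them against the chart version you actually deploy.
write_example_spec() {
  cat > "$1" <<'EOF'
servingEngineSpec:
  modelSpec:
  - name: "opt125m"                 # placeholder model name
    repository: "vllm/vllm-openai"  # container image repository
    tag: "latest"
    modelURL: "facebook/opt-125m"   # model to serve
    replicaCount: 1
    requestCPU: 6                   # resource requests per replica
    requestMemory: "16Gi"
    requestGPU: 1
    pvcStorage: "10Gi"              # size of the persistent volume claim
routerSpec:
  resources:
    requests:
      cpu: "2"
      memory: "8G"
    limits:
      cpu: "2"
      memory: "8G"
EOF
}

# Example call:
# write_example_spec ./example_specification.yaml
```

Writing to a separate file keeps the example from overwriting your real configuration.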
1.2 Execute the Deployment Script#
Run the deployment script, replacing RESOURCE_GROUP and YAML_FILE_PATH with your actual values:
bash entry_point.sh setup RESOURCE_GROUP YAML_FILE_PATH
After the script runs, Kubernetes starts deploying the vLLM inference stack.
You can monitor the deployment status with the commands in the next section.
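If you want to catch a mistyped path before the deployment starts, a small wrapper around the command above can help. This is a sketch only: run_setup is a hypothetical name, and it assumes entry_point.sh is in the current directory and takes exactly the arguments shown in this section.

```shell
# Validate the inputs, then hand off to the deployment script.
# run_setup is a hypothetical wrapper; both arguments are your own values.
run_setup() {
  local rg="${1:-}"
  local yaml="${2:-}"
  if [ -z "$rg" ] || [ -z "$yaml" ]; then
    echo "usage: run_setup RESOURCE_GROUP YAML_FILE_PATH"
    return 2
  fi
  # Fail early if the specification file does not exist.
  if [ ! -f "$yaml" ]; then
    echo "config file not found: $yaml"
    return 1
  fi
  bash entry_point.sh setup "$rg" "$yaml"
}

# Example call:
# run_setup my-resource-group ./production_stack_specification.yaml
```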
2. Validate Installation#
2.1 Monitor Deployment Status#
To check whether the pods for the vLLM deployment are up and running, use:
kubectl get pods
Expected output:
NAME                                            READY   STATUS    RESTARTS   AGE
vllm-deployment-router-69b7f9748d-xrkvn         1/1     Running   0          75s
vllm-opt125m-deployment-vllm-696c998c6f-mvhg4   1/1     Running   0          75s
Note
It may take some time for the pods to reach the Running state, depending on cluster setup and image download speed.
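Instead of polling by hand, you can block until the pods report Ready. The sketch below wraps `kubectl wait`; wait_for_pods is a hypothetical helper, the 600s default is an arbitrary assumption, and the command only considers pods in the current namespace.

```shell
# Block until every pod in the current namespace is Ready, or the timeout expires.
# wait_for_pods is a hypothetical helper; the default timeout of 600s is an assumption.
wait_for_pods() {
  local timeout="${1:-600s}"
  kubectl wait --for=condition=Ready pod --all --timeout="$timeout"
}

# Example call (wait up to 15 minutes for large image pulls):
# wait_for_pods 15m
```

The function returns a non-zero status if any pod is still not Ready when the timeout expires.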
2.2 Inspect Pod Logs#
If a pod is not transitioning to Running, use the following command to inspect logs:
kubectl logs -f <POD_NAME>
To get more detailed information about the pod, run:
kubectl describe pod <POD_NAME>
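The two commands above can be combined with the pod's events and the logs of a previously crashed container into one report. This is a sketch under the assumption that the pod lives in the current namespace; diagnose_pod is a hypothetical helper name.

```shell
# Collect the most useful diagnostics for a single pod in one pass.
# diagnose_pod is a hypothetical helper; pass the pod name from `kubectl get pods`.
diagnose_pod() {
  local pod="$1"
  echo "== describe =="
  kubectl describe pod "$pod"
  echo "== events =="
  # Events for this pod only, oldest first.
  kubectl get events --field-selector "involvedObject.name=$pod" --sort-by=.lastTimestamp
  echo "== logs of the previous container (only present after a restart) =="
  kubectl logs "$pod" --previous || echo "no previous container logs"
}

# Example call:
# diagnose_pod <POD_NAME>
```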
3. Persistent Storage Considerations#
If using Azure Files or Azure Managed Disks for storage, keep in mind:
- Azure Files must be mounted to the AKS cluster as a persistent volume.
- The storage account must be in the same region as the AKS cluster.
- The AKS node pool should have the appropriate permissions to access Azure Files.
- Ensure that the RBAC policies are correctly set up for Azure CSI driver operation.
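The region requirement in the list above can be checked by comparing the location reported for the cluster with the one reported for the storage account. The sketch assumes both live in the same resource group, which may not hold if the storage account was created in the cluster's node resource group; check_same_region and its three arguments are hypothetical.

```shell
# Compare the Azure region of an AKS cluster with that of a storage account.
# check_same_region is a hypothetical helper. Assumption: both resources are in
# the resource group passed as $1.
check_same_region() {
  local rg="$1" cluster="$2" account="$3"
  local cluster_loc storage_loc
  cluster_loc=$(az aks show --resource-group "$rg" --name "$cluster" --query location -o tsv)
  storage_loc=$(az storage account show --resource-group "$rg" --name "$account" --query location -o tsv)
  if [ "$cluster_loc" = "$storage_loc" ]; then
    echo "same region: $cluster_loc"
  else
    echo "region mismatch: cluster=$cluster_loc storage=$storage_loc"
    return 1
  fi
}

# Example call:
# check_same_region <RESOURCE_GROUP> <CLUSTER_NAME> <STORAGE_ACCOUNT_NAME>
```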
If you need to manually delete storage resources, you can do so via the Azure Portal or using Azure CLI commands.
4. Uninstall#
To remove the deployed vLLM stack and clean up resources, run:
bash entry_point.sh cleanup RESOURCE_GROUP
You may also need to manually delete the resource group and clean up any remaining resources via the Azure Portal.
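If you prefer the Azure CLI to the Portal for this last step, the following sketch lists what is left in the resource group and then deletes the group. It is destructive: deleting a resource group removes everything inside it, so review the listing first. cleanup_resource_group is a hypothetical helper and is not part of entry_point.sh.

```shell
# List the remaining resources in a resource group, then delete the group.
# WARNING: this removes every resource in the group.
# cleanup_resource_group is a hypothetical helper, separate from entry_point.sh.
cleanup_resource_group() {
  local rg="$1"
  # `az group exists` prints "true" or "false".
  if [ "$(az group exists --name "$rg")" != "true" ]; then
    echo "nothing to delete: $rg"
    return 0
  fi
  # Show what is about to be removed.
  az resource list --resource-group "$rg" --output table
  # --yes skips the confirmation prompt; --no-wait returns before deletion finishes.
  az group delete --name "$rg" --yes --no-wait
}

# Example call:
# cleanup_resource_group <RESOURCE_GROUP>
```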
5. Troubleshooting#
If you encounter issues, refer to the following solutions:
- Pods stuck in Pending state: Check available resources and ensure that the cluster has enough nodes:
  kubectl describe nodes
- Pods in CrashLoopBackOff state: Inspect logs to find the issue:
  kubectl logs <POD_NAME>
- Cannot connect to AKS cluster: Ensure that your Azure CLI is properly configured:
  az aks get-credentials --resource-group <RESOURCE_GROUP> --name <CLUSTER_NAME>
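To see at a glance which of the cases above applies, you can count pods by status. The sketch below parses the STATUS column (third column) of `kubectl get pods --no-headers`; summarize_pod_status is a hypothetical helper and it only looks at the current namespace.

```shell
# Count the pods in the current namespace by their STATUS column.
# summarize_pod_status is a hypothetical helper.
summarize_pod_status() {
  kubectl get pods --no-headers \
    | awk '{count[$3]++} END {for (s in count) print s, count[s]}' \
    | sort
}

# Example call:
# summarize_pod_status
```

A result with any Pending or CrashLoopBackOff entries points to the matching item in the list above.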
Following these steps should help ensure a successful deployment.