vime Documentation

vime is an LLM post-training framework for RL scaling, providing two core capabilities:

High-Performance Training: Supports efficient training in various modes by connecting Megatron with vLLM;
Flexible Data Generation: Enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.

vime is built on slime, the RL framework behind GLM-4.7, GLM-4.6 and GLM-4.5. vime keeps slime’s training stack and data-generation design while using vLLM as the default rollout backend, and inherits broad model support from slime, including:

Qwen3 series (Qwen3Next, Qwen3MoE, Qwen3), Qwen2.5 series;
DeepSeek V3 series (DeepSeek V3, V3.1, DeepSeek R1);
Llama 3.

Start by Use Case

New to vime: Quick Start
Configure training and rollout arguments: Usage Guide
Add custom generation, reward, or rollout functions: Customization Guide
Build agentic RL workflows: Agentic RL Training Roadmap
Configure production vLLM rollout topology: vLLM Config: Advanced Engine Deployment
Connect external rollout engines: External Rollout Engines Roadmap
Sync weights as byte-level deltas: Delta Weight Sync
Use PD disaggregation: PD Disaggregation
Use BF16 training with FP8 rollout or FP8 KV cache: Low Precision Training and Rollout
Understand CI and reliability coverage: CI (Continuous Integration)
Debug, trace, and profile long-running jobs: Debugging, Trace Viewer, Profiling
