Tutorials
Step-by-step tutorials to guide you through complete workflows, from data preparation to serving trained models in production.
Serve in vLLM
Deploy your trained speculator models in vLLM for production inference.
Time required: ~5 minutes
Train Eagle-3 Model Online
Learn how to train an Eagle-3 speculator using online training, where hidden states are generated on demand during training.
Time required: ~30 minutes
Train Eagle-3 Model Offline
Learn how to train an Eagle-3 speculator using offline training with pre-generated hidden states.
Time required: ~3 hours
Train DFlash Model Online
COMING SOON
Learn how to train a DFlash speculator model with block-based token generation.
Response Regeneration
Regenerate dataset responses using your target model for improved drafter alignment.
Time required: ~10 minutes
Evaluating Model Performance
COMING SOON
Benchmark and evaluate your trained speculator models.