Helios Video Generation via Online API
Source https://github.com/vllm-project/vllm-omni/tree/main/examples/online_serving/helios.
This example demonstrates how to use the /v1/videos API with Helios models using the generic extra_params field.
Overview
The /v1/videos API now supports model-specific parameters through the extra_params field, which accepts a JSON object containing any model-specific configuration. This allows supporting new models like Helios without modifying the API.
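As a minimal sketch of this idea, the helper below merges the standard form fields with a JSON-encoded extra_params field. The helper name build_form_fields is hypothetical and not part of vllm-omni; the sketch assumes the server expects extra_params as a single JSON string in the form body, as the curl examples in this document suggest.

```python
import json


def build_form_fields(prompt, model, extra_params=None, **standard_fields):
    """Return form fields for a POST to /v1/videos (hypothetical helper)."""
    fields = {"prompt": prompt, "model": model, **standard_fields}
    if extra_params:
        # Model-specific options travel as one JSON-encoded string field.
        fields["extra_params"] = json.dumps(extra_params)
    return fields


fields = build_form_fields(
    "A serene lakeside sunrise with mist over the water.",
    "BestWishYsh/Helios-Mid",
    extra_params={"is_enable_stage2": True, "pyramid_num_stages": 3},
    width=640,
    height=384,
)
print(fields["extra_params"])
```

Because the model-specific options are confined to one field, the set of standard fields stays the same for every model.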
Helios Model Variants
- Helios-Base: Basic T2V/I2V/V2V generation (Stage 1 only)
- Helios-Mid: Advanced generation with CFG-Zero* (Stage 2)
- Helios-Distilled: Fast generation with DMD (Stage 2)
API Usage Examples
1. T2V (Text-to-Video) - Helios-Base
Basic text-to-video generation:
curl -X POST http://localhost:8000/v1/videos \
-F "prompt=A serene lakeside sunrise with mist over the water." \
-F "model=BestWishYsh/Helios-Base" \
-F "width=640" \
-F "height=384" \
-F "num_frames=99" \
-F "num_inference_steps=50" \
-F "guidance_scale=5.0" \
-F "seed=42"
2. I2V (Image-to-Video) - Helios-Base
Generate video from an input image:
curl -X POST http://localhost:8000/v1/videos \
-F "prompt=The lake water gently ripples as morning mist rises." \
-F "model=BestWishYsh/Helios-Base" \
-F "input_reference=@/path/to/image.jpg" \
-F "width=640" \
-F "height=384" \
-F "num_frames=99" \
-F "guidance_scale=5.0"
3. Helios-Mid with Stage 2 + CFG-Zero*
Advanced generation with pyramid multi-stage denoising:
curl -X POST http://localhost:8000/v1/videos \
-F "prompt=A serene lakeside sunrise with mist over the water." \
-F "model=BestWishYsh/Helios-Mid" \
-F "width=640" \
-F "height=384" \
-F "guidance_scale=5.0" \
-F 'extra_params={
"is_enable_stage2": true,
"pyramid_num_stages": 3,
"pyramid_num_inference_steps_list": [20, 20, 20],
"use_cfg_zero_star": true,
"use_zero_init": true,
"zero_steps": 1
}'
4. Helios-Distilled with DMD
Fast generation with Distribution Matching Distillation:
curl -X POST http://localhost:8000/v1/videos \
-F "prompt=A serene lakeside sunrise with mist over the water." \
-F "model=BestWishYsh/Helios-Distilled" \
-F "width=640" \
-F "height=384" \
-F "guidance_scale=1.0" \
-F 'extra_params={
"is_enable_stage2": true,
"pyramid_num_stages": 3,
"pyramid_num_inference_steps_list": [2, 2, 2],
"is_amplify_first_chunk": true
}'
Model-Specific Parameters
The extra_params field accepts a JSON object with model-specific parameters. For Helios models, supported parameters include:
Stage 2 Parameters
- is_enable_stage2 (bool): Enable pyramid multi-stage denoising
- pyramid_num_stages (int): Number of pyramid stages (default: 3)
- pyramid_num_inference_steps_list (array): Steps per stage, e.g., [20, 20, 20]
CFG Zero Star Parameters (Helios-Mid)
- use_cfg_zero_star (bool): Enable CFG Zero Star guidance
- use_zero_init (bool): Use zero initialization for first steps
- zero_steps (int): Number of initial zero prediction steps (default: 1)
DMD Parameters (Helios-Distilled)
- is_amplify_first_chunk (bool): Enable DMD amplification for first chunk
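The parameter names and types listed above can be checked on the client before a request is sent. The following is only an illustrative sketch: the helper name check_helios_params is hypothetical, and the rule that pyramid_num_inference_steps_list holds one entry per pyramid stage is an assumption drawn from the examples in this document, not a documented constraint.

```python
# Parameter names and types come from the lists above; the helper itself and
# the length check are illustrative assumptions.
HELIOS_PARAM_TYPES = {
    "is_enable_stage2": bool,
    "pyramid_num_stages": int,
    "pyramid_num_inference_steps_list": list,
    "use_cfg_zero_star": bool,
    "use_zero_init": bool,
    "zero_steps": int,
    "is_amplify_first_chunk": bool,
}


def check_helios_params(params):
    """Return a list of human-readable problems found in params."""
    problems = []
    for name, value in params.items():
        expected = HELIOS_PARAM_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
        elif expected is int and isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"{name} should be int, got bool")
        elif not isinstance(value, expected):
            problems.append(f"{name} should be {expected.__name__}")
    steps = params.get("pyramid_num_inference_steps_list")
    stages = params.get("pyramid_num_stages", 3)  # documented default
    # Assumption: one step count per pyramid stage, as in the examples.
    if isinstance(steps, list) and isinstance(stages, int) and len(steps) != stages:
        problems.append(f"expected {stages} step counts, got {len(steps)}")
    return problems


print(check_helios_params({"pyramid_num_stages": 3, "pyramid_num_inference_steps_list": [2, 2]}))
```

A check like this only catches typos early; the model implementation remains the authority on which parameters are valid.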
Video Input (V2V mode)
For video-to-video generation, upload a video file via input_reference:
curl -X POST http://localhost:8000/v1/videos \
-F "prompt=Transform the video into a watercolor painting style." \
-F "model=BestWishYsh/Helios-Base" \
-F "input_reference=@/path/to/video.mp4" \
-F "guidance_scale=5.0"
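The same upload can be prepared from Python. The sketch below builds the (filename, content, content_type) tuple that the requests library accepts in its files argument; the helper name reference_upload is hypothetical, and guessing the content type from the file extension is an assumption about what the server finds useful.

```python
import mimetypes
from pathlib import Path


def reference_upload(path):
    """Build a (filename, content, content_type) tuple for a multipart upload."""
    path = Path(path)
    content_type, _ = mimetypes.guess_type(path.name)
    # Fall back to a generic binary type when the extension is unknown.
    return (path.name, path.read_bytes(), content_type or "application/octet-stream")
```

The tuple would then be passed as files={"input_reference": reference_upload("/path/to/video.mp4")} alongside the usual data fields in requests.post. Reading the whole file into memory is acceptable for short clips; for large inputs an open file handle is the better choice.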
Python Example
import json
import time

import requests

url = "http://localhost:8000/v1/videos"

# Helios-Mid with Stage 2
data = {
    "prompt": "A serene lakeside sunrise with mist over the water.",
    "model": "BestWishYsh/Helios-Mid",
    "width": 640,
    "height": 384,
    "guidance_scale": 5.0,
    "extra_params": json.dumps({
        "is_enable_stage2": True,
        "pyramid_num_stages": 3,
        "pyramid_num_inference_steps_list": [20, 20, 20],
        "use_cfg_zero_star": True,
        "use_zero_init": True,
        "zero_steps": 1
    })
}
response = requests.post(url, data=data)
video_job = response.json()
print(f"Video job created: {video_job['id']}")

# Poll for completion
while True:
    status_response = requests.get(f"{url}/{video_job['id']}")
    status = status_response.json()
    if status['status'] == 'completed':
        print(f"Video generated: {status['file_name']}")
        break
    elif status['status'] == 'failed':
        print(f"Generation failed: {status.get('error')}")
        break
    print(f"Progress: {status['progress']}%")
    time.sleep(2)
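The polling loop above runs until the job reports completed or failed, so a stalled job would block the client indefinitely. One way to bound the wait is sketched below. The function name wait_for_job and its parameters are hypothetical; only the completed and failed status values come from this document. The status fetcher is passed in as a callable so the loop can be exercised without a running server.

```python
import time


def wait_for_job(fetch_status, timeout_s=600.0, poll_interval_s=2.0, sleep=time.sleep):
    """Poll fetch_status() until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status["status"] == "completed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"Generation failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status['status']!r} after {timeout_s}s")
        sleep(poll_interval_s)
```

With the variables from the example above, a call might look like wait_for_job(lambda: requests.get(f"{url}/{video_job['id']}").json(), timeout_s=900).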
Benefits of This Approach
- No API Changes: New models can be supported without modifying the API
- Backward Compatible: Existing clients continue to work
- Flexible: Any model-specific parameter can be passed
- Type-Safe: Parameters are validated at the model level
- Future-Proof: Supports models that don't exist yet
Notes
- The extra_params field must be a valid JSON object
- Parameters are passed directly to the model's extra_args
- Invalid parameters will be caught by the model implementation
- Tensor inputs (images/videos) should use input_reference, not extra_params
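The first note can be checked before a request leaves the client. The sketch below is a hypothetical pre-flight check, not the server's actual validation: it parses the string and rejects anything that does not decode to a JSON object.

```python
import json


def parse_extra_params(raw):
    """Parse a JSON string and insist that it decodes to an object (dict)."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"extra_params is not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        raise ValueError("extra_params must be a JSON object, not " + type(value).__name__)
    return value


print(parse_extra_params('{"is_amplify_first_chunk": true}'))
```

A JSON array or bare number is valid JSON but not a JSON object, which is why the type check is separate from the parse step.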
Example materials
helios_client.py
#!/usr/bin/env python3
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""
Helios video generation client example using the /v1/videos API.
This example demonstrates how to use the generic extra_params field
to pass Helios-specific parameters without modifying the API.
"""
import argparse
import json
import time
from pathlib import Path
import requests


def create_video_job(
    api_url: str,
    prompt: str,
    model: str,
    width: int = 640,
    height: int = 384,
    num_frames: int = 99,
    guidance_scale: float = 5.0,
    seed: int = 42,
    extra_params: dict | None = None,
    input_image: Path | None = None,
) -> dict:
    """Create a video generation job."""
    data = {
        "prompt": prompt,
        "model": model,
        "width": width,
        "height": height,
        "num_frames": num_frames,
        "guidance_scale": guidance_scale,
        "seed": seed,
    }
    if extra_params:
        # Model-specific parameters are sent as one JSON-encoded form field.
        data["extra_params"] = json.dumps(extra_params)

    if input_image:
        # The context manager closes the file even if the request raises.
        with open(input_image, "rb") as image_file:
            response = requests.post(
                f"{api_url}/v1/videos",
                data=data,
                files={"input_reference": image_file},
            )
    else:
        response = requests.post(f"{api_url}/v1/videos", data=data)
    response.raise_for_status()
    return response.json()


def poll_video_status(api_url: str, video_id: str, poll_interval: int = 2) -> dict:
    """Poll video generation status until completion."""
    print(f"Polling video job: {video_id}")
    while True:
        response = requests.get(f"{api_url}/v1/videos/{video_id}")
        response.raise_for_status()
        status_data = response.json()
        status = status_data["status"]
        progress = status_data.get("progress", 0)
        print(f"Status: {status}, Progress: {progress}%")
        if status == "completed":
            print("Video generation completed!")
            return status_data
        elif status == "failed":
            error = status_data.get("error", {})
            raise RuntimeError(f"Video generation failed: {error}")
        time.sleep(poll_interval)


def main():
    parser = argparse.ArgumentParser(description="Helios video generation client")
    parser.add_argument(
        "--api-url",
        default="http://localhost:8000",
        help="API server URL",
    )
    parser.add_argument(
        "--model",
        default="BestWishYsh/Helios-Base",
        help="Model name (Helios-Base, Helios-Mid, Helios-Distilled)",
    )
    parser.add_argument(
        "--prompt",
        default="A serene lakeside sunrise with mist over the water.",
        help="Text prompt",
    )
    parser.add_argument(
        "--width",
        type=int,
        default=640,
        help="Video width",
    )
    parser.add_argument(
        "--height",
        type=int,
        default=384,
        help="Video height",
    )
    parser.add_argument(
        "--num-frames",
        type=int,
        default=99,
        help="Number of frames",
    )
    parser.add_argument(
        "--guidance-scale",
        type=float,
        default=5.0,
        help="Guidance scale",
    )
    parser.add_argument(
        "--seed",
        type=int,
        default=42,
        help="Random seed",
    )
    parser.add_argument(
        "--input-image",
        type=Path,
        help="Input image for I2V mode",
    )
    parser.add_argument(
        "--preset",
        choices=["base", "mid-stage2", "distilled"],
        default="base",
        help="Helios preset configuration",
    )
    args = parser.parse_args()

    # Define preset configurations
    presets = {
        "base": None,  # No extra params for base model
        "mid-stage2": {
            "is_enable_stage2": True,
            "pyramid_num_stages": 3,
            "pyramid_num_inference_steps_list": [20, 20, 20],
            "use_cfg_zero_star": True,
            "use_zero_init": True,
            "zero_steps": 1,
        },
        "distilled": {
            "is_enable_stage2": True,
            "pyramid_num_stages": 3,
            "pyramid_num_inference_steps_list": [2, 2, 2],
            "is_amplify_first_chunk": True,
        },
    }
    extra_params = presets[args.preset]

    print("=" * 50)
    print("Helios Video Generation")
    print("=" * 50)
    print(f"API URL: {args.api_url}")
    print(f"Model: {args.model}")
    print(f"Preset: {args.preset}")
    print(f"Prompt: {args.prompt}")
    print(f"Size: {args.width}x{args.height}")
    print(f"Frames: {args.num_frames}")
    print(f"Guidance Scale: {args.guidance_scale}")
    if extra_params:
        print(f"Extra Params: {json.dumps(extra_params, indent=2)}")
    print()

    # Create video job
    print("Creating video generation job...")
    job = create_video_job(
        api_url=args.api_url,
        prompt=args.prompt,
        model=args.model,
        width=args.width,
        height=args.height,
        num_frames=args.num_frames,
        guidance_scale=args.guidance_scale,
        seed=args.seed,
        extra_params=extra_params,
        input_image=args.input_image,
    )
    video_id = job["id"]
    print(f"Video job created: {video_id}")
    print()

    # Poll for completion
    result = poll_video_status(args.api_url, video_id)
    print()
    print("=" * 50)
    print("Generation Complete")
    print("=" * 50)
    print(f"Video ID: {result['id']}")
    print(f"File: {result.get('file_name', 'N/A')}")
    print(f"Inference Time: {result.get('inference_time_s', 'N/A')}s")
    print()


if __name__ == "__main__":
    main()
run_helios_distilled.sh
#!/bin/bash
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
# Helios-Distilled with DMD Example
# Fast generation with Distribution Matching Distillation
API_URL="${API_URL:-http://localhost:8000}"
MODEL="${MODEL:-BestWishYsh/Helios-Distilled}"
PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"
echo "==================================="
echo "Helios-Distilled with DMD"
echo "==================================="
echo "API URL: $API_URL"
echo "Model: $MODEL"
echo "Prompt: $PROMPT"
echo ""
# Model-specific parameters for Helios-Distilled
extra_params='{
"is_enable_stage2": true,
"pyramid_num_stages": 3,
"pyramid_num_inference_steps_list": [2, 2, 2],
"is_amplify_first_chunk": true
}'
echo "Model extra params: $extra_params"
echo ""
# Create video generation job
echo "Creating video generation job..."
RESPONSE=$(curl -s -X POST "$API_URL/v1/videos" \
  -F "prompt=$PROMPT" \
  -F "model=$MODEL" \
  -F "width=640" \
  -F "height=384" \
  -F "guidance_scale=1.0" \
  -F "seed=42" \
  -F "extra_params=$extra_params")
echo "Response: $RESPONSE"
echo ""
# Extract video ID
VIDEO_ID=$(echo "$RESPONSE" | grep -o '"id":"[^"]*"' | cut -d'"' -f4)
if [ -z "$VIDEO_ID" ]; then
  echo "Error: Failed to create video job"
  exit 1
fi
echo "Video job created: $VIDEO_ID"
echo ""
# Poll for completion
echo "Polling for completion..."
while true; do
  STATUS_RESPONSE=$(curl -s "$API_URL/v1/videos/$VIDEO_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | grep -o '"status":"[^"]*"' | cut -d'"' -f4)
  PROGRESS=$(echo "$STATUS_RESPONSE" | grep -o '"progress":[0-9]*' | cut -d':' -f2)
  echo "Status: $STATUS, Progress: $PROGRESS%"
  if [ "$STATUS" = "completed" ]; then
    echo ""
    echo "Video generation completed!"
    echo "Full response:"
    echo "$STATUS_RESPONSE" | jq '.'
    break
  elif [ "$STATUS" = "failed" ]; then
    echo ""
    echo "Video generation failed!"
    echo "$STATUS_RESPONSE" | jq '.'
    exit 1
  fi
  sleep 2
done
run_helios_mid_stage2.sh
#!/bin/bash
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
# Helios-Mid with Stage 2 + CFG-Zero* Example
# This demonstrates advanced generation with pyramid multi-stage denoising
API_URL="${API_URL:-http://localhost:8000}"
MODEL="${MODEL:-BestWishYsh/Helios-Mid}"
PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"
echo "==================================="
echo "Helios-Mid Stage 2 + CFG-Zero*"
echo "==================================="
echo "API URL: $API_URL"
echo "Model: $MODEL"
echo "Prompt: $PROMPT"
echo ""
# Model-specific parameters for Helios-Mid
extra_params='{
"is_enable_stage2": true,
"pyramid_num_stages": 3,
"pyramid_num_inference_steps_list": [20, 20, 20],
"use_cfg_zero_star": true,
"use_zero_init": true,
"zero_steps": 1
}'
echo "Model extra params: $extra_params"
echo ""
# Create video generation job
echo "Creating video generation job..."
RESPONSE=$(curl -s -X POST "$API_URL/v1/videos" \
  -F "prompt=$PROMPT" \
  -F "model=$MODEL" \
  -F "width=640" \
  -F "height=384" \
  -F "guidance_scale=5.0" \
  -F "seed=42" \
  -F "extra_params=$extra_params")
echo "Response: $RESPONSE"
echo ""
# Extract video ID
VIDEO_ID=$(echo "$RESPONSE" | grep -o '"id":"[^"]*"' | cut -d'"' -f4)
if [ -z "$VIDEO_ID" ]; then
  echo "Error: Failed to create video job"
  exit 1
fi
echo "Video job created: $VIDEO_ID"
echo ""
# Poll for completion
echo "Polling for completion..."
while true; do
  STATUS_RESPONSE=$(curl -s "$API_URL/v1/videos/$VIDEO_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | grep -o '"status":"[^"]*"' | cut -d'"' -f4)
  PROGRESS=$(echo "$STATUS_RESPONSE" | grep -o '"progress":[0-9]*' | cut -d':' -f2)
  echo "Status: $STATUS, Progress: $PROGRESS%"
  if [ "$STATUS" = "completed" ]; then
    echo ""
    echo "Video generation completed!"
    echo "Full response:"
    echo "$STATUS_RESPONSE" | jq '.'
    break
  elif [ "$STATUS" = "failed" ]; then
    echo ""
    echo "Video generation failed!"
    echo "$STATUS_RESPONSE" | jq '.'
    exit 1
  fi
  sleep 2
done
run_helios_t2v.sh
#!/bin/bash
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
# Helios Text-to-Video Generation Example
# This script demonstrates how to use the /v1/videos API with Helios models
API_URL="${API_URL:-http://localhost:8000}"
MODEL="${MODEL:-BestWishYsh/Helios-Base}"
PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"
echo "==================================="
echo "Helios T2V Generation"
echo "==================================="
echo "API URL: $API_URL"
echo "Model: $MODEL"
echo "Prompt: $PROMPT"
echo ""
# Create video generation job
echo "Creating video generation job..."
RESPONSE=$(curl -s -X POST "$API_URL/v1/videos" \
  -F "prompt=$PROMPT" \
  -F "model=$MODEL" \
  -F "width=640" \
  -F "height=384" \
  -F "num_frames=99" \
  -F "num_inference_steps=50" \
  -F "guidance_scale=5.0" \
  -F "seed=42")
echo "Response: $RESPONSE"
echo ""
# Extract video ID
VIDEO_ID=$(echo "$RESPONSE" | grep -o '"id":"[^"]*"' | cut -d'"' -f4)
if [ -z "$VIDEO_ID" ]; then
  echo "Error: Failed to create video job"
  exit 1
fi
echo "Video job created: $VIDEO_ID"
echo ""
# Poll for completion
echo "Polling for completion..."
while true; do
  STATUS_RESPONSE=$(curl -s "$API_URL/v1/videos/$VIDEO_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | grep -o '"status":"[^"]*"' | cut -d'"' -f4)
  PROGRESS=$(echo "$STATUS_RESPONSE" | grep -o '"progress":[0-9]*' | cut -d':' -f2)
  echo "Status: $STATUS, Progress: $PROGRESS%"
  if [ "$STATUS" = "completed" ]; then
    echo ""
    echo "Video generation completed!"
    echo "Full response:"
    echo "$STATUS_RESPONSE" | jq '.'
    break
  elif [ "$STATUS" = "failed" ]; then
    echo ""
    echo "Video generation failed!"
    echo "$STATUS_RESPONSE" | jq '.'
    exit 1
  fi
  sleep 2
done
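The shell scripts above extract the job ID, status, and progress with grep patterns such as '"id":"[^"]*"', which match only when the server emits JSON with no whitespace after the colon. Where a more tolerant parse is wanted, the fields can be read with a JSON parser instead. The sketch below is illustrative: the helper name summarize_job is hypothetical, and defaulting progress to 0 mirrors what helios_client.py does.

```python
import json


def summarize_job(response_text):
    """Extract id, status, and progress from a job response body."""
    job = json.loads(response_text)
    # progress may be absent, so default to 0 as helios_client.py does.
    return job.get("id"), job.get("status"), job.get("progress", 0)


print(summarize_job('{"id": "video_123", "status": "completed", "progress": 100}'))
```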