Helios Video Generation via Online API¶

Source: https://github.com/vllm-project/vllm-omni/tree/main/examples/online_serving/helios.

This example demonstrates how to use the /v1/videos API with Helios models using the generic extra_params field.

Overview¶

The /v1/videos API supports model-specific parameters through the extra_params field, which accepts a JSON object containing any model-specific configuration. New models such as Helios can therefore be served without changing the API itself.
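
As a rough illustration of the idea, the sketch below flattens a request into form fields and serializes the model-specific options into a single extra_params string. The helper name build_form_fields is hypothetical and not part of vllm-omni; the field names are taken from the examples on this page.

```python
import json
from typing import Optional


def build_form_fields(
    prompt: str, model: str, extra_params: Optional[dict] = None, **common
) -> dict[str, str]:
    """Flatten a request into string form fields (hypothetical helper)."""
    fields = {"prompt": prompt, "model": model}
    # Common options such as width/height are sent as plain form fields.
    fields.update({key: str(value) for key, value in common.items()})
    if extra_params:
        # Model-specific options travel as one JSON-encoded form field.
        fields["extra_params"] = json.dumps(extra_params)
    return fields


fields = build_form_fields(
    "A serene lakeside sunrise with mist over the water.",
    "BestWishYsh/Helios-Mid",
    extra_params={"is_enable_stage2": True, "pyramid_num_stages": 3},
    width=640,
    height=384,
)
print(fields["extra_params"])
```

Because everything model-specific sits inside one field, the set of form fields stays the same no matter which model is targeted.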

Helios Model Variants¶

Helios-Base: Basic T2V/I2V/V2V generation (Stage 1 only)
Helios-Mid: Advanced generation with CFG-Zero* (Stage 2)
Helios-Distilled: Fast generation with DMD (Stage 2)

API Usage Examples¶

1. T2V (Text-to-Video) - Helios-Base¶

Basic text-to-video generation:

curl -X POST http://localhost:8000/v1/videos \
  -F "prompt=A serene lakeside sunrise with mist over the water." \
  -F "model=BestWishYsh/Helios-Base" \
  -F "width=640" \
  -F "height=384" \
  -F "num_frames=99" \
  -F "num_inference_steps=50" \
  -F "guidance_scale=5.0" \
  -F "seed=42"

2. I2V (Image-to-Video) - Helios-Base¶

Generate video from an input image:

curl -X POST http://localhost:8000/v1/videos \
  -F "prompt=The lake water gently ripples as morning mist rises." \
  -F "model=BestWishYsh/Helios-Base" \
  -F "input_reference=@/path/to/image.jpg" \
  -F "width=640" \
  -F "height=384" \
  -F "num_frames=99" \
  -F "guidance_scale=5.0"

3. Helios-Mid with Stage 2 + CFG-Zero*¶

Advanced generation with pyramid multi-stage denoising:

curl -X POST http://localhost:8000/v1/videos \
  -F "prompt=A serene lakeside sunrise with mist over the water." \
  -F "model=BestWishYsh/Helios-Mid" \
  -F "width=640" \
  -F "height=384" \
  -F "guidance_scale=5.0" \
  -F 'extra_params={
    "is_enable_stage2": true,
    "pyramid_num_stages": 3,
    "pyramid_num_inference_steps_list": [20, 20, 20],
    "use_cfg_zero_star": true,
    "use_zero_init": true,
    "zero_steps": 1
  }'

4. Helios-Distilled with DMD¶

Fast generation with Distribution Matching Distillation:

curl -X POST http://localhost:8000/v1/videos \
  -F "prompt=A serene lakeside sunrise with mist over the water." \
  -F "model=BestWishYsh/Helios-Distilled" \
  -F "width=640" \
  -F "height=384" \
  -F "guidance_scale=1.0" \
  -F 'extra_params={
    "is_enable_stage2": true,
    "pyramid_num_stages": 3,
    "pyramid_num_inference_steps_list": [2, 2, 2],
    "is_amplify_first_chunk": true
  }'

Model-Specific Parameters¶

The extra_params field accepts a JSON object with model-specific parameters. For Helios models, supported parameters include:

Stage 2 Parameters¶

is_enable_stage2 (bool): Enable pyramid multi-stage denoising
pyramid_num_stages (int): Number of pyramid stages (default: 3)
pyramid_num_inference_steps_list (array): Steps per stage, e.g., [20, 20, 20]
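
A client can catch an obvious mismatch before submitting a job. The sketch below assumes that pyramid_num_inference_steps_list should hold one entry per stage. That matches the examples on this page but is not confirmed here as a server-side rule, and check_stage2_params is a hypothetical helper.

```python
def check_stage2_params(params: dict) -> int:
    """Return the total number of denoising steps, or raise ValueError.

    Assumption: one step count per pyramid stage (not a confirmed server rule).
    """
    if not params.get("is_enable_stage2", False):
        return 0
    stages = params.get("pyramid_num_stages", 3)  # documented default: 3
    steps = params.get("pyramid_num_inference_steps_list", [])
    if len(steps) != stages:
        raise ValueError(f"expected {stages} step counts, got {len(steps)}")
    if any(not isinstance(s, int) or s <= 0 for s in steps):
        raise ValueError("step counts must be positive integers")
    return sum(steps)


mid = {
    "is_enable_stage2": True,
    "pyramid_num_stages": 3,
    "pyramid_num_inference_steps_list": [20, 20, 20],
}
print(check_stage2_params(mid))  # 60
```

With the Helios-Distilled settings shown earlier ([2, 2, 2]) the same function returns 6, which makes the difference in step budget between the two variants easy to see.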

CFG Zero Star Parameters (Helios-Mid)¶

use_cfg_zero_star (bool): Enable CFG Zero Star guidance
use_zero_init (bool): Use zero initialization for first steps
zero_steps (int): Number of initial zero prediction steps (default: 1)

DMD Parameters (Helios-Distilled)¶

is_amplify_first_chunk (bool): Enable DMD amplification for first chunk
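
The parameter lists above can be turned into a small client-side check. This is only a sketch: the table mirrors the names and types documented on this page, find_param_problems is a hypothetical helper, and because the page says the supported parameters "include" these, an unknown name is reported as something to review rather than as a definite error.

```python
# Documented Helios extra_params and their types, as listed on this page.
HELIOS_PARAM_TYPES = {
    "is_enable_stage2": bool,
    "pyramid_num_stages": int,
    "pyramid_num_inference_steps_list": list,
    "use_cfg_zero_star": bool,
    "use_zero_init": bool,
    "zero_steps": int,
    "is_amplify_first_chunk": bool,
}


def find_param_problems(params: dict) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    problems = []
    for name, value in params.items():
        expected = HELIOS_PARAM_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
        elif expected is int and isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"{name}: expected int, got bool")
        elif not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


print(find_param_problems({"zero_steps": "1", "is_amplify_first_chunk": True}))
```

Authoritative validation still happens in the model implementation; a check like this only shortens the feedback loop for typos.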

Video Input (V2V mode)¶

For video-to-video generation, upload a video file via input_reference:

curl -X POST http://localhost:8000/v1/videos \
  -F "prompt=Transform the video into a watercolor painting style." \
  -F "model=BestWishYsh/Helios-Base" \
  -F "input_reference=@/path/to/video.mp4" \
  -F "guidance_scale=5.0"
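
Since images and videos are both uploaded through input_reference, a client may want to label which mode a request corresponds to, for example in logs. The sketch below guesses the mode from the file extension alone. It is a client-side convenience under that assumption and says nothing about how the server tells the two apart; guess_generation_mode is a hypothetical helper.

```python
import mimetypes
from typing import Optional


def guess_generation_mode(input_reference: Optional[str]) -> str:
    """Label a request as t2v, i2v, or v2v from the file name alone."""
    if input_reference is None:
        return "t2v"  # no reference file: plain text-to-video
    mime, _ = mimetypes.guess_type(input_reference)
    if mime and mime.startswith("image/"):
        return "i2v"
    if mime and mime.startswith("video/"):
        return "v2v"
    raise ValueError(
        f"cannot tell whether {input_reference!r} is an image or a video"
    )


print(guess_generation_mode("/path/to/video.mp4"))
print(guess_generation_mode("/path/to/image.jpg"))
```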

Python Example¶

import json
import time

import requests

url = "http://localhost:8000/v1/videos"

# Helios-Mid with Stage 2
data = {
    "prompt": "A serene lakeside sunrise with mist over the water.",
    "model": "BestWishYsh/Helios-Mid",
    "width": 640,
    "height": 384,
    "guidance_scale": 5.0,
    # extra_params is sent as a JSON-encoded string in a form field
    "extra_params": json.dumps({
        "is_enable_stage2": True,
        "pyramid_num_stages": 3,
        "pyramid_num_inference_steps_list": [20, 20, 20],
        "use_cfg_zero_star": True,
        "use_zero_init": True,
        "zero_steps": 1
    })
}

response = requests.post(url, data=data)
response.raise_for_status()  # surface HTTP errors before reading the body
video_job = response.json()
print(f"Video job created: {video_job['id']}")

# Poll for completion
while True:
    status_response = requests.get(f"{url}/{video_job['id']}")
    status_response.raise_for_status()
    status = status_response.json()

    if status['status'] == 'completed':
        print(f"Video generated: {status.get('file_name')}")
        break
    elif status['status'] == 'failed':
        print(f"Generation failed: {status.get('error')}")
        break

    # Use .get() so a response without a progress field does not raise KeyError
    print(f"Progress: {status.get('progress', 0)}%")
    time.sleep(2)
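
The loop above polls until the job reaches a terminal state, so a job that never finishes would block the client indefinitely. One possible refinement is to bound the wait, sketched below. The function wait_for_video and its parameters are hypothetical; only the "completed" and "failed" status values come from this page, and "in_progress" is a placeholder used for the offline demonstration.

```python
import time
from typing import Callable


def wait_for_video(
    fetch_status: Callable[[], dict],
    timeout_s: float = 600.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is completed or failed, or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status.get("status") == "completed":
            return status
        if status.get("status") == "failed":
            raise RuntimeError(f"Generation failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job still {status.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Offline demonstration with canned responses instead of HTTP calls.
canned = iter([
    {"status": "in_progress", "progress": 50},  # placeholder status value
    {"status": "completed", "file_name": "demo.mp4"},
])
result = wait_for_video(lambda: next(canned), sleep=lambda _: None)
print(result["file_name"])  # demo.mp4
```

Against a running server, fetch_status could wrap the same GET request used in the example above; passing it in as a callable keeps the waiting logic testable without a network.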

Benefits of This Approach¶

No API Changes: New models can be supported without modifying the API
Backward Compatible: Existing clients continue to work
Flexible: Any model-specific parameter can be passed
Type-Safe: The API forwards parameters unchanged, and each model implementation validates the ones it receives
Future-Proof: Models added later can define their own parameters through the same field

Notes¶

The extra_params field must be a valid JSON object
Parameters are passed directly to the model's extra_args
Invalid parameters will be caught by the model implementation
Tensor inputs (images/videos) should use input_reference, not extra_params
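
The first note can be checked on the client before a request is sent. The sketch below parses the string that would go into the extra_params form field and rejects anything that is not a JSON object. How the server itself parses the field is not shown on this page, and parse_extra_params is a hypothetical helper.

```python
import json


def parse_extra_params(raw: str) -> dict:
    """Parse the extra_params form field and insist on a JSON object."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"extra_params is not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        # Arrays, numbers, and strings are valid JSON but not JSON objects.
        raise ValueError(
            f"extra_params must be a JSON object, got {type(value).__name__}"
        )
    return value


params = parse_extra_params('{"is_enable_stage2": true, "zero_steps": 1}')
print(params["zero_steps"])  # 1
```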

Example materials¶

helios_client.py

#!/usr/bin/env python3
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project

"""
Helios video generation client example using the /v1/videos API.

This example demonstrates how to use the generic extra_params field
to pass Helios-specific parameters without modifying the API.
"""

import argparse
import json
import time
from pathlib import Path

import requests


def create_video_job(
    api_url: str,
    prompt: str,
    model: str,
    width: int = 640,
    height: int = 384,
    num_frames: int = 99,
    guidance_scale: float = 5.0,
    seed: int = 42,
    extra_params: dict | None = None,
    input_image: Path | None = None,
) -> dict:
    """Create a video generation job."""
    data = {
        "prompt": prompt,
        "model": model,
        "width": width,
        "height": height,
        "num_frames": num_frames,
        "guidance_scale": guidance_scale,
        "seed": seed,
    }

    if extra_params:
        # Model-specific options are sent as one JSON-encoded form field.
        data["extra_params"] = json.dumps(extra_params)

    url = f"{api_url}/v1/videos"
    if input_image:
        # The context manager closes the file even if the request raises.
        with open(input_image, "rb") as image_file:
            response = requests.post(url, data=data, files={"input_reference": image_file})
    else:
        response = requests.post(url, data=data)
    response.raise_for_status()

    return response.json()


def poll_video_status(api_url: str, video_id: str, poll_interval: int = 2) -> dict:
    """Poll video generation status until completion."""
    print(f"Polling video job: {video_id}")

    while True:
        response = requests.get(f"{api_url}/v1/videos/{video_id}")
        response.raise_for_status()
        status_data = response.json()

        status = status_data["status"]
        progress = status_data.get("progress", 0)

        print(f"Status: {status}, Progress: {progress}%")

        if status == "completed":
            print("Video generation completed!")
            return status_data
        elif status == "failed":
            error = status_data.get("error", {})
            raise RuntimeError(f"Video generation failed: {error}")

        time.sleep(poll_interval)


def main():
    parser = argparse.ArgumentParser(description="Helios video generation client")
    parser.add_argument(
        "--api-url",
        default="http://localhost:8000",
        help="API server URL",
    )
    parser.add_argument(
        "--model",
        default="BestWishYsh/Helios-Base",
        help="Model name (Helios-Base, Helios-Mid, Helios-Distilled)",
    )
    parser.add_argument(
        "--prompt",
        default="A serene lakeside sunrise with mist over the water.",
        help="Text prompt",
    )
    parser.add_argument(
        "--width",
        type=int,
        default=640,
        help="Video width",
    )
    parser.add_argument(
        "--height",
        type=int,
        default=384,
        help="Video height",
    )
    parser.add_argument(
        "--num-frames",
        type=int,
        default=99,
        help="Number of frames",
    )
    parser.add_argument(
        "--guidance-scale",
        type=float,
        default=5.0,
        help="Guidance scale",
    )
    parser.add_argument(
        "--seed",
        type=int,
        default=42,
        help="Random seed",
    )
    parser.add_argument(
        "--input-image",
        type=Path,
        help="Input image for I2V mode",
    )
    parser.add_argument(
        "--preset",
        choices=["base", "mid-stage2", "distilled"],
        default="base",
        help="Helios preset configuration; pair it with the matching --model "
        "(the Helios-Distilled examples use --guidance-scale 1.0)",
    )

    args = parser.parse_args()

    # Define preset configurations
    presets = {
        "base": None,  # No extra params for base model
        "mid-stage2": {
            "is_enable_stage2": True,
            "pyramid_num_stages": 3,
            "pyramid_num_inference_steps_list": [20, 20, 20],
            "use_cfg_zero_star": True,
            "use_zero_init": True,
            "zero_steps": 1,
        },
        "distilled": {
            "is_enable_stage2": True,
            "pyramid_num_stages": 3,
            "pyramid_num_inference_steps_list": [2, 2, 2],
            "is_amplify_first_chunk": True,
        },
    }

    extra_params = presets[args.preset]

    print("=" * 50)
    print("Helios Video Generation")
    print("=" * 50)
    print(f"API URL: {args.api_url}")
    print(f"Model: {args.model}")
    print(f"Preset: {args.preset}")
    print(f"Prompt: {args.prompt}")
    print(f"Size: {args.width}x{args.height}")
    print(f"Frames: {args.num_frames}")
    print(f"Guidance Scale: {args.guidance_scale}")
    if extra_params:
        print(f"Extra Params: {json.dumps(extra_params, indent=2)}")
    print()

    # Create video job
    print("Creating video generation job...")
    job = create_video_job(
        api_url=args.api_url,
        prompt=args.prompt,
        model=args.model,
        width=args.width,
        height=args.height,
        num_frames=args.num_frames,
        guidance_scale=args.guidance_scale,
        seed=args.seed,
        extra_params=extra_params,
        input_image=args.input_image,
    )

    video_id = job["id"]
    print(f"Video job created: {video_id}")
    print()

    # Poll for completion
    result = poll_video_status(args.api_url, video_id)

    print()
    print("=" * 50)
    print("Generation Complete")
    print("=" * 50)
    print(f"Video ID: {result['id']}")
    print(f"File: {result.get('file_name', 'N/A')}")
    print(f"Inference Time: {result.get('inference_time_s', 'N/A')}s")
    print()


if __name__ == "__main__":
    main()

run_helios_distilled.sh

#!/bin/bash
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project

# Helios-Distilled with DMD Example
# Fast generation with Distribution Matching Distillation

API_URL="${API_URL:-http://localhost:8000}"
MODEL="${MODEL:-BestWishYsh/Helios-Distilled}"
PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"

echo "==================================="
echo "Helios-Distilled with DMD"
echo "==================================="
echo "API URL: $API_URL"
echo "Model: $MODEL"
echo "Prompt: $PROMPT"
echo ""

# Model-specific parameters for Helios-Distilled
extra_params='{
  "is_enable_stage2": true,
  "pyramid_num_stages": 3,
  "pyramid_num_inference_steps_list": [2, 2, 2],
  "is_amplify_first_chunk": true
}'

echo "Model extra params: $extra_params"
echo ""

# Create video generation job
echo "Creating video generation job..."
RESPONSE=$(curl -s -X POST "$API_URL/v1/videos" \
  -F "prompt=$PROMPT" \
  -F "model=$MODEL" \
  -F "width=640" \
  -F "height=384" \
  -F "guidance_scale=1.0" \
  -F "seed=42" \
  -F "extra_params=$extra_params")

echo "Response: $RESPONSE"
echo ""

# Extract video ID (jq is already required below; it tolerates whitespace in the JSON)
VIDEO_ID=$(echo "$RESPONSE" | jq -r '.id // empty')

if [ -z "$VIDEO_ID" ]; then
  echo "Error: Failed to create video job"
  exit 1
fi

echo "Video job created: $VIDEO_ID"
echo ""

# Poll for completion
echo "Polling for completion..."
while true; do
  STATUS_RESPONSE=$(curl -s "$API_URL/v1/videos/$VIDEO_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | jq -r '.status // empty')
  PROGRESS=$(echo "$STATUS_RESPONSE" | jq -r '.progress // 0')

  echo "Status: $STATUS, Progress: $PROGRESS%"

  if [ "$STATUS" = "completed" ]; then
    echo ""
    echo "Video generation completed!"
    echo "Full response:"
    echo "$STATUS_RESPONSE" | jq '.'
    break
  elif [ "$STATUS" = "failed" ]; then
    echo ""
    echo "Video generation failed!"
    echo "$STATUS_RESPONSE" | jq '.'
    exit 1
  fi

  sleep 2
done

run_helios_mid_stage2.sh

#!/bin/bash
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project

# Helios-Mid with Stage 2 + CFG-Zero* Example
# This demonstrates advanced generation with pyramid multi-stage denoising

API_URL="${API_URL:-http://localhost:8000}"
MODEL="${MODEL:-BestWishYsh/Helios-Mid}"
PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"

echo "==================================="
echo "Helios-Mid Stage 2 + CFG-Zero*"
echo "==================================="
echo "API URL: $API_URL"
echo "Model: $MODEL"
echo "Prompt: $PROMPT"
echo ""

# Model-specific parameters for Helios-Mid
extra_params='{
  "is_enable_stage2": true,
  "pyramid_num_stages": 3,
  "pyramid_num_inference_steps_list": [20, 20, 20],
  "use_cfg_zero_star": true,
  "use_zero_init": true,
  "zero_steps": 1
}'

echo "Model extra params: $extra_params"
echo ""

# Create video generation job
echo "Creating video generation job..."
RESPONSE=$(curl -s -X POST "$API_URL/v1/videos" \
  -F "prompt=$PROMPT" \
  -F "model=$MODEL" \
  -F "width=640" \
  -F "height=384" \
  -F "guidance_scale=5.0" \
  -F "seed=42" \
  -F "extra_params=$extra_params")

echo "Response: $RESPONSE"
echo ""

# Extract video ID (jq is already required below; it tolerates whitespace in the JSON)
VIDEO_ID=$(echo "$RESPONSE" | jq -r '.id // empty')

if [ -z "$VIDEO_ID" ]; then
  echo "Error: Failed to create video job"
  exit 1
fi

echo "Video job created: $VIDEO_ID"
echo ""

# Poll for completion
echo "Polling for completion..."
while true; do
  STATUS_RESPONSE=$(curl -s "$API_URL/v1/videos/$VIDEO_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | jq -r '.status // empty')
  PROGRESS=$(echo "$STATUS_RESPONSE" | jq -r '.progress // 0')

  echo "Status: $STATUS, Progress: $PROGRESS%"

  if [ "$STATUS" = "completed" ]; then
    echo ""
    echo "Video generation completed!"
    echo "Full response:"
    echo "$STATUS_RESPONSE" | jq '.'
    break
  elif [ "$STATUS" = "failed" ]; then
    echo ""
    echo "Video generation failed!"
    echo "$STATUS_RESPONSE" | jq '.'
    exit 1
  fi

  sleep 2
done

run_helios_t2v.sh

#!/bin/bash
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project

# Helios Text-to-Video Generation Example
# This script demonstrates how to use the /v1/videos API with Helios models

API_URL="${API_URL:-http://localhost:8000}"
MODEL="${MODEL:-BestWishYsh/Helios-Base}"
PROMPT="${PROMPT:-A serene lakeside sunrise with mist over the water.}"

echo "==================================="
echo "Helios T2V Generation"
echo "==================================="
echo "API URL: $API_URL"
echo "Model: $MODEL"
echo "Prompt: $PROMPT"
echo ""

# Create video generation job
echo "Creating video generation job..."
RESPONSE=$(curl -s -X POST "$API_URL/v1/videos" \
  -F "prompt=$PROMPT" \
  -F "model=$MODEL" \
  -F "width=640" \
  -F "height=384" \
  -F "num_frames=99" \
  -F "num_inference_steps=50" \
  -F "guidance_scale=5.0" \
  -F "seed=42")

echo "Response: $RESPONSE"
echo ""

# Extract video ID (jq is already required below; it tolerates whitespace in the JSON)
VIDEO_ID=$(echo "$RESPONSE" | jq -r '.id // empty')

if [ -z "$VIDEO_ID" ]; then
  echo "Error: Failed to create video job"
  exit 1
fi

echo "Video job created: $VIDEO_ID"
echo ""

# Poll for completion
echo "Polling for completion..."
while true; do
  STATUS_RESPONSE=$(curl -s "$API_URL/v1/videos/$VIDEO_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | jq -r '.status // empty')
  PROGRESS=$(echo "$STATUS_RESPONSE" | jq -r '.progress // 0')

  echo "Status: $STATUS, Progress: $PROGRESS%"

  if [ "$STATUS" = "completed" ]; then
    echo ""
    echo "Video generation completed!"
    echo "Full response:"
    echo "$STATUS_RESPONSE" | jq '.'
    break
  elif [ "$STATUS" = "failed" ]; then
    echo ""
    echo "Video generation failed!"
    echo "$STATUS_RESPONSE" | jq '.'
    exit 1
  fi

  sleep 2
done