Skip to content

Stable Diffusion 3.5 Usage Guide

This guide provides instructions for running Stable-Diffusion3.5 text-to-image generation models using vLLM-Omni with Cache-DiT acceleration.

Supported Models

  • stabilityai/stable-diffusion-3.5-large: 8.1B parameters model
  • stabilityai/stable-diffusion-3.5-large-turbo: 8.1B parameters model (timestep-distilled enabling few-step inference)
  • stabilityai/stable-diffusion-3.5-medium: 2.5B parameters model

Installing vLLM-Omni

uv venv
source .venv/bin/activate
uv pip install vllm==0.12.0
uv pip install git+https://github.com/vllm-project/vllm-omni.git

The CLI examples below are from the vLLM-Omni repo. If you want to run them directly, clone that repo and run the scripts from its examples/offline_inference directory.

Text-to-Image Generation

Basic Usage

from vllm_omni.entrypoints.omni import Omni

omni = Omni(model="stabilityai/stable-diffusion-3.5-medium")

images = omni.generate(
    prompt="a cat wearing sunglasses, cyberpunk style",
    negative_prompt="blurry, low quality",
    height=1024,
    width=1024,
    num_inference_steps=28,
    guidance_scale=7.5,
    num_outputs_per_prompt=2,
)

CLI Usage

python examples/offline_inference/text_to_image/text_to_image.py \
  --model stabilityai/stable-diffusion-3.5-medium \
  --prompt "a cat wearing sunglasses, cyberpunk style" \
  --negative-prompt "blurry, low quality" \
  --height 1024 \
  --width 1024 \
  --num-inference-steps 28 \
  --guidance-scale 7.5

Cache-DiT Acceleration

vLLM-Omni supports Cache-DiT acceleration for stable-diffusion-3.5 models, which can significantly speed up image generation through caching mechanisms.

Enabling Cache-DiT

from vllm_omni.entrypoints.omni import Omni

omni = Omni(
    model="stabilityai/stable-diffusion-3.5-medium",
    cache_backend="cache_dit",
)

images = omni.generate(
    prompt="a cat wearing sunglasses, cyberpunk style",
    height=1024,
    width=1024,
    num_inference_steps=28,
)

Custom Cache-DiT Configuration

For fine-tuned control over the acceleration:

omni = Omni(
    model="stabilityai/stable-diffusion-3.5-medium",
    cache_backend="cache_dit",
    cache_config={
        "Fn_compute_blocks": 8,
        "Bn_compute_blocks": 0,
        "max_warmup_steps": 4,
        "residual_diff_threshold": 0.12,
    },
)

Key Parameters

Parameter Default Description
height 1024 image height (multiples of 16)
width 1024 image width (multiples of 16)
num_inference_steps 28 Denoising steps
guidance_scale 1.0 Classifier-free guidance scale

Additional Resources