vllm_omni.model_executor.layers.timestep_embedding ¶
Shared timestep embedding primitives for diffusion models.
- SinusPositionEmbedding: sin/cos positional encoding (TTS DiT/CFM). Used by Qwen3-TTS, Qwen2.5-Omni, CosyVoice3, Ming-Flash-Omni.
- DiTTimestepEmbedding: SinusPosEmb + Linear + SiLU + Linear MLP. Used by the same models as above.
- timestep_embedding(): standalone function (GLIDE/DiT convention). Used by Bagel, NextStep, Z-Image, HunyuanImage3.
DiTTimestepEmbedding ¶
SinusPositionEmbedding ¶
Bases: Module
Sinusoidal position embedding for scalar timesteps.
Maps scalar timestep values to dim-dimensional embeddings using the standard log-spaced frequency formula from DDPM/DiT.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dim | int | Output embedding dimension (must be even). | required |
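
The mapping described above can be illustrated with a small dependency-free sketch. It is written in plain Python rather than torch so the arithmetic is easy to inspect, and it is not the class's actual implementation: the function name `sinus_position_embedding_sketch`, the `scale` and `base` parameters, the sin-then-cos ordering, and the exact frequency spacing are all assumptions made for illustration.

```python
import math


def sinus_position_embedding_sketch(timesteps, dim, scale=1.0, base=10000.0):
    """Hypothetical reference: sin-then-cos embedding of scalar timesteps.

    `scale` and `base` are illustrative parameters, not documented ones.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Log-spaced frequencies: 1 at index 0, decaying towards 1/base.
    freqs = [math.exp(-math.log(base) * i / half) for i in range(half)]
    rows = []
    for t in timesteps:
        args = [scale * t * f for f in freqs]
        # First half of the row holds sines, second half holds cosines.
        rows.append([math.sin(a) for a in args] + [math.cos(a) for a in args])
    return rows


# Three scalar timesteps mapped to 8-dimensional embeddings.
emb = sinus_position_embedding_sketch([0.0, 0.5, 1.0], dim=8)
```

Under these assumptions, a timestep of zero maps to a row of zeros (the sine half) followed by ones (the cosine half), which makes a convenient sanity check when comparing against a real implementation.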
timestep_embedding ¶
Create sinusoidal timestep embeddings (GLIDE/DiT convention).
Produces cos-then-sin embeddings with log-spaced frequencies.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| t | Tensor | (N,) 1-D tensor of timestep indices (may be fractional). | required |
| dim | int | Output embedding dimension. | required |
| max_period | float | Controls the minimum frequency. | 10000.0 |
Returns:
| Type | Description |
|---|---|
| Tensor | (N, dim) tensor of positional embeddings. |
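
For reference, the GLIDE-style formula can be sketched in plain Python as follows. This is a dependency-free illustration, not the function documented here: the name `timestep_embedding_ref` is hypothetical, lists stand in for tensors, and the zero-padding of odd `dim` follows the original GLIDE convention, which this module is only assumed to share.

```python
import math


def timestep_embedding_ref(t, dim, max_period=10000.0):
    """Hypothetical list-based reference for cos-then-sin timestep embeddings."""
    half = dim // 2
    # Frequencies decay geometrically from 1 towards 1/max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    out = []
    for value in t:
        args = [value * f for f in freqs]
        # Cosines first, then sines (GLIDE/DiT ordering).
        row = [math.cos(a) for a in args] + [math.sin(a) for a in args]
        if dim % 2 == 1:
            # Assumption: odd dims are padded with one trailing zero, as in GLIDE.
            row.append(0.0)
        out.append(row)
    return out


# Fractional timesteps are accepted, matching the description of `t` above.
emb = timestep_embedding_ref([0.0, 10.0, 999.5], dim=6)
odd = timestep_embedding_ref([3.0], dim=5)
```

A larger `max_period` lowers the slowest frequency, so the last cosine and sine channels vary more gradually across the timestep range.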