
vllm_omni.model_executor.models.common.snake_activation

Shared Snake/SnakeBeta activations for speech decoders.

Used by: Qwen3-TTS, Qwen3-Omni Code2Wav, Qwen2.5-Omni, CoVo-Audio, CosyVoice3.

logger module-attribute

logger = init_logger(__name__)

Snake

Bases: SnakeBeta

Original Snake activation with a single parameter: x + 1/α * sin²(αx).

Unlike SnakeBeta, which has separate alpha (frequency) and beta (magnitude) parameters, Snake uses alpha for both. Only alpha appears in the state_dict; beta is absent, which keeps checkpoints compatible with CosyVoice3's HiFi-GAN.

The Triton kernel and precomputed-cache path from SnakeBeta are reused; precompute_exp_cache derives _inv_beta from alpha so the forward path is identical.
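The relationship between the two activations can be sketched in plain Python. This is a minimal scalar illustration, not the module's code: the function names `snake` and `snake_beta` and the constant `EPS` are hypothetical, `a` and `b` stand for the effective frequency and magnitude values (however the module scales them internally), and the epsilon guard is assumed to mirror `no_div_by_zero`.

```python
import math

EPS = 1e-9  # assumed to play the role of no_div_by_zero


def snake_beta(x: float, a: float, b: float) -> float:
    """x + (1/b) * sin^2(a*x) with effective frequency a and magnitude b."""
    return x + (1.0 / (b + EPS)) * math.sin(a * x) ** 2


def snake(x: float, a: float) -> float:
    """Single-parameter Snake: the same formula with the magnitude tied to a."""
    return snake_beta(x, a, a)


print(snake(0.5, 2.0))
```

Tying the magnitude to the frequency is what lets a single-parameter checkpoint drive a two-parameter forward path.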

precompute_exp_cache

precompute_exp_cache()

Derive both exp_alpha and inv_beta from the single alpha parameter.

SnakeBeta

Bases: Module

A modified Snake function which uses separate parameters for the magnitude of the periodic components.

Shape:
- Input: (B, C, T)
- Output: (B, C, T), same shape as the input

Parameters:
- alpha - trainable parameter that controls frequency
- beta - trainable parameter that controls magnitude

References:
- This activation function is a modified version based on this paper by Liu Ziyin, Tilman Hartwig, Masahito Ueda: https://huggingface.co/papers/2006.08195

alpha instance-attribute

alpha = Parameter(zeros(in_features) * alpha)

alpha_logscale instance-attribute

alpha_logscale = alpha_logscale

beta instance-attribute

beta = Parameter(zeros(in_features) * alpha)

in_features instance-attribute

in_features = in_features

no_div_by_zero instance-attribute

no_div_by_zero = 1e-09

forward

forward(hidden_states)

SnakeBeta := x + 1/b * sin^2(x*a), where a is the frequency term derived from alpha and b is the magnitude term derived from beta.
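As a rough illustration of how the formula applies to a (B, C, T) input, the sketch below uses nested Python lists in place of tensors. It assumes one alpha and one beta per channel, stored in log scale and exponentiated before use (inferred from the description of precompute_exp_cache), and an epsilon added to the denominator. The function name `snake_beta_forward` is hypothetical, and the real module operates on tensors and may dispatch to a Triton kernel.

```python
import math


def snake_beta_forward(hidden_states, alpha, beta, eps=1e-9):
    """Apply x + (1/b) * sin^2(x*a) channel by channel.

    hidden_states: nested list shaped (B, C, T).
    alpha, beta: length-C lists, assumed to be in log scale.
    """
    out = []
    for batch in hidden_states:
        out_batch = []
        for c, channel in enumerate(batch):
            a = math.exp(alpha[c])                  # per-channel frequency
            inv_b = 1.0 / (math.exp(beta[c]) + eps)  # per-channel 1/magnitude
            out_batch.append([x + inv_b * math.sin(x * a) ** 2 for x in channel])
        out.append(out_batch)
    return out


x = [[[0.0, 1.0], [0.5, -0.5]]]  # B=1, C=2, T=2
y = snake_beta_forward(x, alpha=[0.0, 0.0], beta=[0.0, 0.0])
```

The output has the same nesting as the input, matching the shape contract stated for the class.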

precompute_exp_cache

precompute_exp_cache()

Materialize exp(alpha) and 1/(exp(beta)+eps) as frozen buffers.
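The caching idea can be sketched as follows. This is a plain-Python stand-in under stated assumptions, not the module's implementation: the class `SnakeBetaCacheSketch`, its `forward_channel` helper, and the attribute names `exp_alpha` and `inv_beta` are illustrative, and tuples stand in for frozen buffers.

```python
import math


class SnakeBetaCacheSketch:
    """Illustrative stand-in for the cached path; not the vllm_omni class."""

    def __init__(self, alpha, beta, eps=1e-9):
        self.alpha = list(alpha)  # per-channel values, assumed log scale
        self.beta = list(beta)
        self.eps = eps
        self.exp_alpha = None     # filled in by precompute_exp_cache
        self.inv_beta = None

    def precompute_exp_cache(self):
        # Compute exp(alpha) and 1/(exp(beta)+eps) once, so later forward
        # calls can skip the exponentials and the division.
        self.exp_alpha = tuple(math.exp(a) for a in self.alpha)
        self.inv_beta = tuple(1.0 / (math.exp(b) + self.eps) for b in self.beta)

    def forward_channel(self, c, xs):
        if self.exp_alpha is None:
            # Uncached path: recompute the scaled values on every call.
            a = math.exp(self.alpha[c])
            inv_b = 1.0 / (math.exp(self.beta[c]) + self.eps)
        else:
            a, inv_b = self.exp_alpha[c], self.inv_beta[c]
        return [x + inv_b * math.sin(x * a) ** 2 for x in xs]


act = SnakeBetaCacheSketch(alpha=[0.0, 0.5], beta=[0.0, -0.5])
before = act.forward_channel(1, [0.25, 1.0])
act.precompute_exp_cache()
after = act.forward_channel(1, [0.25, 1.0])
```

Both paths produce the same values; the cache only moves the per-call exponentials and reciprocal to a one-time step, which is safe once the parameters no longer change.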