vllm_omni.diffusion.models.cosmos3.audio_tokenizer
Modules:
| Name | Description |
|---|---|
| avae | Diffusers-format AVAE audio tokenizer used by Cosmos3 sound generation. |
Cosmos3AVAEAudioTokenizer
Bases: Module
Decoder-only AVAE tokenizer for Cosmos3 audio latents.
audio_channels instance-attribute
audio_channels = int(
    _config_get(
        config,
        "dec_out_channels",
        "audio_channels",
        default=2
        if bool(get("stereo", audio_channels == 2))
        else 1,
    )
)
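The precedence in the expression above can be easier to follow as a standalone function. The following is a minimal sketch, not the module's code: `resolve_audio_channels` and `fallback_channels` are hypothetical names, the config is assumed to be a plain `dict`, and a value of `None` is assumed to count as missing.

```python
def resolve_audio_channels(config: dict, fallback_channels: int = 2) -> int:
    """Hypothetical helper mirroring the key precedence shown above."""
    # Explicit channel counts win, checked in the documented key order.
    for key in ("dec_out_channels", "audio_channels"):
        if config.get(key) is not None:
            return int(config[key])
    # Otherwise the "stereo" flag decides. When the flag is absent it
    # defaults to whether the fallback value (assumed here to stand in
    # for a constructor argument) is already stereo.
    stereo = bool(config.get("stereo", fallback_channels == 2))
    return 2 if stereo else 1
```

Under these assumptions, an empty config resolves to 2 channels, and `{"stereo": False}` resolves to 1.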
decoder instance-attribute
decoder = OobleckDecoder(
    channels=int(
        _config_get(config, "dec_dim", default=320)
    ),
    input_channels=latent_ch,
    audio_channels=audio_channels,
    upsampling_ratios=list(reversed(dec_strides)),
    channel_multiples=list(
        _config_get(
            config, "dec_c_mults", default=[1, 2, 4, 8, 16]
        )
    ),
)
hop_size instance-attribute
hop_size = int(
    _config_get(
        config,
        "hop_size",
        default=prod(dec_strides)
        if dec_strides
        else hop_size,
    )
)
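The default above ties the hop size to the product of the decoder strides. The sketch below illustrates that relationship under stated assumptions: `resolve_hop_size`, `decoded_length`, and `fallback_hop_size` are hypothetical names, the config is a plain `dict`, the stride values in the usage note are illustrative rather than taken from a real checkpoint, and each latent frame is assumed to expand to exactly `hop_size` audio samples with no trimming or padding.

```python
from math import prod


def resolve_hop_size(config: dict, dec_strides: list, fallback_hop_size: int) -> int:
    """Hypothetical helper mirroring the hop_size default shown above."""
    # An explicit "hop_size" entry takes priority over anything derived.
    if config.get("hop_size") is not None:
        return int(config["hop_size"])
    # Otherwise use the total upsampling factor of the decoder strides,
    # or the fallback when no strides are available.
    return prod(dec_strides) if dec_strides else fallback_hop_size


def decoded_length(num_latent_frames: int, hop_size: int) -> int:
    """Samples produced per channel, assuming hop_size samples per latent frame."""
    return num_latent_frames * hop_size
```

For example, with illustrative strides `[2, 4, 4, 8, 8]` and no explicit `hop_size`, the resolved value is 2048, so 10 latent frames would correspond to 20480 samples per channel.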
latent_ch instance-attribute
latent_ch = int(
    _config_get(
        config,
        "vocoder_input_dim",
        "io_channels",
        "latent_ch",
        default=io_channels,
    )
)
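Every attribute on this page is resolved through `_config_get`, whose implementation is not shown here. Judging only from the call sites, it appears to try several keys in order and fall back to `default`. The following is a hedged sketch of such a lookup, not the actual helper: `config_get_sketch` is a hypothetical name, and both the support for attribute-style configs and the treatment of `None` as missing are assumptions.

```python
from collections.abc import Mapping
from typing import Any

_MISSING = object()  # sentinel distinguishing "absent" from falsy values


def config_get_sketch(config: Any, *keys: str, default: Any = None) -> Any:
    """Return the first present, non-None value among keys, else default."""
    for key in keys:
        if isinstance(config, Mapping):
            value = config.get(key, _MISSING)
        else:
            # Assumed: configs may also expose their fields as attributes.
            value = getattr(config, key, _MISSING)
        if value is not _MISSING and value is not None:
            return value
    return default


# Usage mirroring the latent_ch lookup above; the numbers are illustrative.
example_latent_ch = int(
    config_get_sketch(
        {"io_channels": 64},
        "vocoder_input_dim",
        "io_channels",
        "latent_ch",
        default=128,
    )
)
```

In this usage `"vocoder_input_dim"` is absent, so the lookup falls through to `"io_channels"` and `example_latent_ch` is 64.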