
vllm_omni.transformers_utils.configs.higgs_audio_v3

Configuration class for higgs-audio v3 (HiggsMultimodalQwen3) in vllm-omni.

HiggsAudioV3Config.from_pretrained(model_path) returns a config whose tts_token_id, text_token_id, audio_continuation_id, and eos_token_id are already resolved from the checkpoint tokenizer. If the tokenizer is unavailable or lacks a required special token, loading raises an error.

logger module-attribute

logger = logging.getLogger(__name__)

HiggsAudioV3Config

Bases: PretrainedConfig

Typed config for higgs-audio v3 (HiggsMultimodalQwen3).

from_pretrained() automatically resolves the token IDs for <|tts|>, <|text|>, and <|audio|>, as well as eos_token_id, from the checkpoint tokenizer.

audio_continuation_id instance-attribute

audio_continuation_id = audio_continuation_id

audio_encoder_config instance-attribute

audio_encoder_config = audio_encoder_config

audio_hidden_size instance-attribute

audio_hidden_size = int(
    audio_encoder_config.get(
        "out_dim", self.text_config.hidden_size
    )
)
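The expression above takes out_dim from audio_encoder_config and falls back to the text model's hidden size when that key is absent. A minimal sketch of the same fallback using plain dictionaries; the helper name resolve_audio_hidden_size and the sample sizes are hypothetical and not part of vllm-omni:

```python
def resolve_audio_hidden_size(audio_encoder_config: dict, text_hidden_size: int) -> int:
    """Mirror the documented expression int(cfg.get("out_dim", text_hidden_size))."""
    return int(audio_encoder_config.get("out_dim", text_hidden_size))


# Key present: the encoder's own value is used (sample value, not from a real checkpoint).
explicit = resolve_audio_hidden_size({"out_dim": 1024}, text_hidden_size=4096)

# Key absent: the text model's hidden size is used instead.
fallback = resolve_audio_hidden_size({}, text_hidden_size=4096)
```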

audio_stream_bos_id instance-attribute

audio_stream_bos_id = audio_stream_bos_id

audio_stream_eos_id instance-attribute

audio_stream_eos_id = audio_stream_eos_id

audio_token_id instance-attribute

audio_token_id = audio_token_id

codebook_size instance-attribute

codebook_size = int(
    audio_encoder_config.get("vocab_size", codebook_size)
)

enable_flashinfer_api_unwrap instance-attribute

enable_flashinfer_api_unwrap = bool(
    enable_flashinfer_api_unwrap
)

enable_mlp_cudagraph instance-attribute

enable_mlp_cudagraph = bool(enable_mlp_cudagraph)

frame_rate instance-attribute

frame_rate = frame_rate

hidden_size property

hidden_size: int

is_composition class-attribute instance-attribute

is_composition = True

mel_per_sample instance-attribute

mel_per_sample = mel_per_sample

model_type class-attribute instance-attribute

model_type: str = 'higgs_multimodal_qwen3'

num_codebooks instance-attribute

num_codebooks = int(
    audio_encoder_config.get("num_codebooks", num_codebooks)
)
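For num_codebooks, as for codebook_size, a value present in audio_encoder_config takes precedence over the constructor argument, which serves only as the default. The sketch below reproduces that precedence with plain dictionaries; resolve_codebook_layout and all numeric values are hypothetical illustrations, not values from a real checkpoint:

```python
def resolve_codebook_layout(
    audio_encoder_config: dict, num_codebooks: int, codebook_size: int
) -> tuple:
    """Apply the documented lookups: encoder config first, constructor argument as default."""
    resolved_num = int(audio_encoder_config.get("num_codebooks", num_codebooks))
    # Note the key is "vocab_size", as in the documented codebook_size expression.
    resolved_size = int(audio_encoder_config.get("vocab_size", codebook_size))
    return resolved_num, resolved_size


# Encoder config supplies both keys, so the constructor arguments are ignored.
overridden = resolve_codebook_layout(
    {"num_codebooks": 8, "vocab_size": 1026}, num_codebooks=4, codebook_size=1024
)

# Empty encoder config, so the constructor arguments are used.
defaults = resolve_codebook_layout({}, num_codebooks=4, codebook_size=1024)
```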

num_real_codes property

num_real_codes: int

sample_rate instance-attribute

sample_rate = sample_rate

text_config instance-attribute

text_config = _build_text_config(text_config)

text_token_id instance-attribute

text_token_id = text_token_id

tie_modality_embeddings instance-attribute

tie_modality_embeddings = bool(
    audio_encoder_config.get("tie_word_embeddings", True)
)

tts_token_id instance-attribute

tts_token_id = tts_token_id

from_pretrained classmethod

from_pretrained(
    pretrained_model_name_or_path: str, **kwargs: Any
) -> HiggsAudioV3Config

Load config and resolve special token IDs from the checkpoint tokenizer.

Passes the original pretrained_model_name_or_path (local directory or HF repo ID) directly to AutoTokenizer.from_pretrained() so that cache hits, downloads, and local paths are handled uniformly. Raises if the tokenizer is missing a required special token.
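A usage sketch, assuming vllm-omni is installed and the module path matches the one at the top of this page. The wrapper load_higgs_config is a hypothetical helper, and the model path passed to it is left to the caller; the attribute names are the ones documented here:

```python
def load_higgs_config(model_path: str) -> dict:
    """Load the config and return the special token IDs resolved from the tokenizer."""
    # Imported lazily so that defining this helper does not require vllm-omni.
    from vllm_omni.transformers_utils.configs.higgs_audio_v3 import HiggsAudioV3Config

    # model_path may be a local directory or an HF repo ID.
    config = HiggsAudioV3Config.from_pretrained(model_path)
    return {
        "tts_token_id": config.tts_token_id,
        "text_token_id": config.text_token_id,
        "audio_continuation_id": config.audio_continuation_id,
        "eos_token_id": config.eos_token_id,
    }
```

Because the load raises when the tokenizer cannot supply the required tokens, callers that want to degrade gracefully would need to wrap the call in their own error handling.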

get_text_config

get_text_config(decoder: bool = False) -> PretrainedConfig

resolve_special_tokens

resolve_special_tokens(model_path: str) -> None

Resolve <|tts|>, <|text|>, <|audio|> and the EOS token ID from the HF tokenizer.

Raises ValueError if any of the three required special tokens is missing from the tokenizer's added vocabulary.
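The check described above can be sketched without a real tokenizer. This is an illustration, not the library's implementation: it assumes the three required tokens are <|tts|>, <|text|>, and <|audio|>, uses a plain dict in place of the tokenizer's added vocabulary, and the helper name and token IDs are made up:

```python
# Assumed to be the three required special tokens named on this page.
REQUIRED_SPECIALS = ("<|tts|>", "<|text|>", "<|audio|>")


def resolve_required_specials(added_vocab: dict) -> dict:
    """Return the ID of each required token, or raise ValueError listing the missing ones."""
    missing = [token for token in REQUIRED_SPECIALS if token not in added_vocab]
    if missing:
        raise ValueError(f"tokenizer is missing required special tokens: {missing}")
    return {token: added_vocab[token] for token in REQUIRED_SPECIALS}


# All three tokens present (IDs are placeholders, not real checkpoint values).
resolved = resolve_required_specials(
    {"<|tts|>": 151700, "<|text|>": 151701, "<|audio|>": 151702}
)

# Two tokens absent: the helper raises instead of returning a partial mapping.
try:
    resolve_required_specials({"<|tts|>": 151700})
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Failing at load time keeps a checkpoint with an incomplete tokenizer from reaching inference with unset token IDs.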