GLM-TTS config registration with transformers AutoConfig.
Registers GLMTTSConfig (model_type="glm_tts") so that AutoConfig.from_pretrained("path/to/glm-tts") returns the correct config class.
Note: GLM-TTS uses a Llama backbone, but we register a custom config to handle the special token IDs and flow model parameters.
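The registration described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: `GLMTTSConfigSketch` is a hypothetical stand-in for GLMTTSConfig, it carries only three of the attributes listed on this page, and the default and example values are placeholders. The sketch relies on `AutoConfig.register` and the `model_type` class attribute from transformers.

```python
import tempfile

from transformers import AutoConfig, PretrainedConfig


class GLMTTSConfigSketch(PretrainedConfig):
    """Hypothetical stand-in for GLMTTSConfig; not the real class."""

    # AutoConfig uses this string to pick the config class.
    model_type = "glm_tts"

    def __init__(self, hidden_size=2048, boa_token_id=None, eoa_token_id=None, **kwargs):
        # Placeholder defaults chosen for this example only.
        self.hidden_size = hidden_size
        self.boa_token_id = boa_token_id
        self.eoa_token_id = eoa_token_id
        super().__init__(**kwargs)


# Map the "glm_tts" model_type to the custom config class.
AutoConfig.register("glm_tts", GLMTTSConfigSketch)

# Round trip: write a config.json, then let AutoConfig resolve the class.
with tempfile.TemporaryDirectory() as checkpoint_dir:
    GLMTTSConfigSketch(boa_token_id=7, eoa_token_id=8).save_pretrained(checkpoint_dir)
    loaded = AutoConfig.from_pretrained(checkpoint_dir)

print(type(loaded).__name__)
```

Without the `register` call, `AutoConfig.from_pretrained` would not recognize a `config.json` whose `model_type` is "glm_tts", because that string is resolved through the registry rather than through the Llama backbone.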
Bases: PretrainedConfig
Configuration for the Llama-based autoregressive (AR) model used for text-to-speech token generation.
Special token IDs are loaded dynamically from the tokenizer at init time.
audio_token_end = audio_token_end
audio_token_start = audio_token_start
boa_token_id = boa_token_id
eoa_token_id = eoa_token_id
hidden_size = hidden_size
input_frame_rate = input_frame_rate
intermediate_size = intermediate_size
max_position_embeddings = max_position_embeddings
max_token_text_ratio = max_token_text_ratio
mel_framerate = mel_framerate
min_token_text_ratio = min_token_text_ratio
model_type: str = 'glm_tts'
num_attention_heads = num_attention_heads
num_hidden_layers = num_hidden_layers
num_key_value_heads = num_key_value_heads
ras_win_size = ras_win_size
rms_norm_eps = rms_norm_eps
sample_method = sample_method
speech_token_dim = speech_token_dim
speech_token_vocab_size = speech_token_vocab_size
spk_embed_dim = spk_embed_dim