vllm_omni.transformers_utils.configs.voxcpm2 ¶
VoxCPM2Config ¶
Bases: PretrainedConfig
Configuration for VoxCPM2 native AR integration.
The HuggingFace checkpoint stores LM parameters inside a nested lm_config dict. This class hoists them to top-level attributes so that vllm's MiniCPMModel can consume them directly.
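The hoisting step can be pictured with a minimal sketch in plain Python. It is not the class's actual code: the helper name `hoist_lm_config`, the key list, and the sample values are hypothetical and chosen only for illustration. Only the nested `lm_config` key and the attribute names come from this page.

```python
# Hypothetical helper: copies selected keys out of a nested "lm_config"
# dict so they become top-level entries. Not the real VoxCPM2Config code.
LM_KEYS = (
    "hidden_size",
    "intermediate_size",
    "max_position_embeddings",
    "num_attention_heads",
    "num_hidden_layers",
    "num_key_value_heads",
)


def hoist_lm_config(raw: dict, keys: tuple = LM_KEYS) -> dict:
    """Return a flat copy of `raw` with the LM keys lifted to the top level."""
    lm = raw.get("lm_config") or {}
    # Keep every top-level entry except the nested dict itself.
    flat = {k: v for k, v in raw.items() if k != "lm_config"}
    for key in keys:
        if key in lm:
            flat[key] = lm[key]
    return flat


# Illustrative checkpoint config; all numbers are made up.
raw_config = {
    "scalar_quantization_latent_dim": 256,
    "lm_config": {"hidden_size": 1024, "num_hidden_layers": 24},
}
flat_config = hoist_lm_config(raw_config)
```

After the call, `flat_config["hidden_size"]` is readable directly, which is the shape a consumer expecting top-level LM attributes needs.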
vllm's MiniCPM always applies muP scaling (scale_emb, scale_depth, dim_model_base). VoxCPM2 was trained with use_mup=false, so we neutralise the scalings:
* scale_emb = 1.0
* scale_depth = sqrt(num_hidden_layers) (cancels the division)
* dim_model_base = hidden_size (makes scale_width = 1.0)
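The arithmetic behind the neutralisation can be checked with a short sketch. It assumes, following the description above, that the residual factor is `scale_depth / sqrt(num_hidden_layers)` and that `scale_width = hidden_size / dim_model_base`. The function names and the example sizes are hypothetical.

```python
import math


def neutral_mup_values(hidden_size: int, num_hidden_layers: int) -> dict:
    """Values that make each assumed muP factor equal to 1."""
    return {
        "scale_emb": 1.0,
        "scale_depth": math.sqrt(num_hidden_layers),
        "dim_model_base": hidden_size,
    }


def effective_factors(cfg: dict, hidden_size: int, num_hidden_layers: int) -> dict:
    """Factors as described in the text above (assumed formulas)."""
    return {
        # Multiplier applied to the token embeddings.
        "embedding": cfg["scale_emb"],
        # Multiplier applied to each residual branch.
        "residual": cfg["scale_depth"] / math.sqrt(num_hidden_layers),
        # scale_width, by which the final hidden states are divided.
        "width": hidden_size / cfg["dim_model_base"],
    }


# Example sizes are illustrative only.
cfg = neutral_mup_values(hidden_size=1024, num_hidden_layers=24)
factors = effective_factors(cfg, hidden_size=1024, num_hidden_layers=24)
```

With these settings every factor evaluates to 1, so the scaling code path still runs but leaves the activations unchanged.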
intermediate_size instance-attribute ¶
keys_to_ignore_at_inference class-attribute instance-attribute ¶
max_position_embeddings instance-attribute ¶
num_attention_heads instance-attribute ¶
num_hidden_layers instance-attribute ¶
num_key_value_heads instance-attribute ¶
scalar_quantization_latent_dim instance-attribute ¶
scalar_quantization_scale instance-attribute ¶
get_text_config ¶
Return self as the text config, because the LM attributes are stored at the top level.