vllm_omni.model_executor.models.utils

add_prefix_to_loaded_weights

add_prefix_to_loaded_weights(
    weights: set[str], prefix: str
) -> set[str]

Add a prefix to the names of the loaded weights.
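
As a minimal sketch of the behaviour described above, the hypothetical helper below prepends a prefix to every name in a set. Joining with a "." separator and leaving names unchanged for an empty prefix are both assumptions; the source does not say how the prefix is joined.

```python
def add_prefix_sketch(weights: set[str], prefix: str) -> set[str]:
    """Hypothetical stand-in for illustration, not the library implementation."""
    if not prefix:
        # Assumption: an empty prefix leaves the names untouched.
        return set(weights)
    # Assumption: prefix and weight name are joined with a dot.
    return {f"{prefix}.{name}" for name in weights}


# Hypothetical weight names and prefix.
loaded = {"embed.weight", "layers.0.mlp.bias"}
prefixed = add_prefix_sketch(loaded, "talker")
```

A helper of this kind is typically useful when a submodule reports the weights it loaded under its own local names and the parent model needs them under fully qualified names.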

reinit_rotary_inv_freq

reinit_rotary_inv_freq(
    model: Module,
    base: float = 10000.0,
    match: Callable[[str, Module], bool] | None = None,
) -> int

Recompute inv_freq buffers on RoPE modules in-place.

Custom RoPE classes loaded via trust_remote_code that register inv_freq with persistent=False and are not listed in ROPE_INIT_FUNCTIONS come out of from_pretrained with garbage buffer values: the shape and dtype are correct, but the contents are not. Applying cos() / sin() to those values produces NaN, so the first forward pass emits NaN logits. Mainstream HF RoPE classes avoid this through their _rope_init_function integration with the framework.

Each buffer is recomputed as 1.0 / base^(arange(0, head_dim, 2) / head_dim), where head_dim is inferred as 2 * inv_freq.numel(). The default selector picks modules whose qualified name ends in "rotary_emb" and that expose a 1-D float inv_freq tensor; pass match to override it. Returns the number of buffers re-initialised.
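
The recomputation can be illustrated without any tensor library. The sketch below is a pure-Python rendering of the formula above, with a plain list standing in for the inv_freq buffer; the function name inv_freq_sketch and the sample buffer are hypothetical and not part of the module.

```python
import math


def inv_freq_sketch(head_dim: int, base: float = 10000.0) -> list[float]:
    # Pure-Python equivalent of 1.0 / base ** (arange(0, head_dim, 2) / head_dim).
    return [1.0 / base ** (i / head_dim) for i in range(0, head_dim, 2)]


# Stand-in for a 4-element buffer that came back with garbage contents.
stale = [float("nan")] * 4

# head_dim is inferred from the buffer length, as 2 * inv_freq.numel().
head_dim = 2 * len(stale)

# For head_dim = 8 and base = 10000 this is roughly [1.0, 0.1, 0.01, 0.001].
fresh = inv_freq_sketch(head_dim)
all_finite = all(math.isfinite(f) for f in fresh)
```

Because the buffer holds one frequency per pair of head dimensions, its length alone is enough to recover head_dim, which is why no extra configuration is needed beyond base.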

safe_tensor_reshape

safe_tensor_reshape(tensor: Tensor, shape: tuple) -> Tensor

Reshape a tensor safely.

split_list_into_ranges

split_list_into_ranges(
    lst: Tensor, interval: int
) -> list[list[int]]

transformers_keys_to_ignore_compat

transformers_keys_to_ignore_compat()

Make trust_remote_code weight loading robust to the transformers 5.9 _keys_to_ignore_on_load_unexpected list-vs-set change.

transformers 5.9 rewrote PreTrainedModel._adjust_missing_and_unexpected_keys from (attr or []) + patterns (list concatenation) to (attr or set()) | patterns (set union). Remote-code models such as OpenMOSS-Team/MOSS-TTS-Nano still declare _keys_to_ignore_on_load_unexpected as a list, so list | set raises TypeError: unsupported operand type(s) for |: 'list' and 'set' and the engine core dies during model load.

Wrap any from_pretrained(..., trust_remote_code=True) call whose remote code may declare the attribute as a list. The guard keeps such models loadable regardless of which transformers version is installed, while preserving the model's ignore patterns.
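
The failure mode described above can be reproduced in plain Python. The sketch below uses hypothetical pattern values and shows the set-union form failing on a list-typed attribute, followed by one possible normalisation that keeps the declared patterns. Coercing the list to a set is an assumption for illustration; the source does not describe how the guard works internally.

```python
# Hypothetical values; real models declare their own ignore patterns.
declared = ["lm_head.weight"]        # remote code still declares a list
patterns = {"rotary_emb.inv_freq"}   # the other operand is a set

# The set-union form fails when the attribute is a non-empty list.
error = None
try:
    (declared or set()) | patterns
except TypeError as exc:
    error = str(exc)

# One possible normalisation (an assumption, not the confirmed mechanism):
# coerce the declared value to a set first so no pattern is lost.
merged = set(declared or ()) | patterns
```

Here error holds the same TypeError message quoted above, and merged contains both the model's declared pattern and the additional one.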