vllm_omni.diffusion.models.sensenova_u1

Modules:

Name                      Description
pipeline_sensenova_u1     SenseNova-U1 Pipeline for vLLM-Omni.
sensenova_u1_transformer  Qwen3 LLM with Mixture-of-Tokenizers (MoT) for SenseNova-U1.

SenseNovaU1Pipeline

Bases: Module, SupportsComponentDiscovery, DiffusionPipelineProfilerMixin

SenseNova-U1 text-to-image and image-to-image pipeline for vLLM-Omni.

Builds the full model graph internally:

- language_model: SenseNovaU1ForCausalLM (TP-aware)
- vision_model: NEOVisionModel (understanding branch)
- fm_modules: ModuleDict with vision_model_mot_gen, timestep_embedder, fm_head, etc.

img2img (image editing) is triggered when multi_modal_data["image"] is present in the prompt dict. The pipeline then uses three KV caches (condition / img_condition / uncondition) with dual classifier-free guidance (CFG), controlled by cfg_scale and img_cfg_scale.
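The routing rule and the dual guidance described above can be illustrated with a minimal sketch. This is not the pipeline's code: the helper names `is_img2img` and `dual_cfg`, the `"prompt"` key, and the guidance formula itself are assumptions. The formula shown is a common three-branch combination from image-editing models; the reference does not state which form SenseNova-U1 actually uses.

```python
from typing import Any, Mapping, Sequence


def is_img2img(prompt: Mapping[str, Any]) -> bool:
    """Hypothetical helper: True when the prompt dict carries an input image."""
    mm_data = prompt.get("multi_modal_data") or {}
    return mm_data.get("image") is not None


def dual_cfg(
    cond: Sequence[float],
    img_cond: Sequence[float],
    uncond: Sequence[float],
    cfg_scale: float,
    img_cfg_scale: float,
) -> list[float]:
    """Assumed three-branch guidance; the real pipeline may combine differently.

    out = uncond + img_cfg_scale * (img_cond - uncond)
                 + cfg_scale * (cond - img_cond)
    """
    return [
        u + img_cfg_scale * (i - u) + cfg_scale * (c - i)
        for c, i, u in zip(cond, img_cond, uncond)
    ]


# Text-only prompt: no image, so the text-to-image path would be taken.
t2i_prompt = {"prompt": "a red bicycle"}
# Prompt with an image: the img2img path would be taken.
edit_prompt = {
    "prompt": "make it blue",
    "multi_modal_data": {"image": object()},  # placeholder for a real image
}
```

With both scales set to 1.0 this combination reduces to the fully conditioned prediction, which is a quick sanity check for any variant of the formula.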

EXTRA_BODY_PARAMS class-attribute

EXTRA_BODY_PARAMS: frozenset[str] = frozenset(
    {
        "think",
        "cfg_scale",
        "cfg_norm",
        "timestep_shift",
        "t_eps",
        "img_cfg_scale",
        "max_tokens",
    }
)
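A set like this lends itself to a membership check on request-level options. The sketch below copies the set locally and splits a caller-supplied dict into recognized and unrecognized keys. The helper `split_extra_body` is hypothetical; how vLLM-Omni itself consumes EXTRA_BODY_PARAMS is not shown in this reference.

```python
from typing import Any, Mapping

# Local copy of the documented set, so the sketch runs without vllm_omni.
EXTRA_BODY_PARAMS: frozenset[str] = frozenset(
    {
        "think",
        "cfg_scale",
        "cfg_norm",
        "timestep_shift",
        "t_eps",
        "img_cfg_scale",
        "max_tokens",
    }
)


def split_extra_body(
    extra_body: Mapping[str, Any],
) -> tuple[dict[str, Any], list[str]]:
    """Hypothetical helper: separate recognized keys from unknown ones."""
    accepted = {k: v for k, v in extra_body.items() if k in EXTRA_BODY_PARAMS}
    rejected = sorted(set(extra_body) - EXTRA_BODY_PARAMS)
    return accepted, rejected


accepted, rejected = split_extra_body(
    {"cfg_scale": 4.0, "img_cfg_scale": 1.5, "not_a_param": 1}
)
```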

EXTRA_OUTPUT_PARAMS class-attribute

EXTRA_OUTPUT_PARAMS: frozenset[str] = frozenset(
    {"think_text"}
)

device instance-attribute

device = get_local_device()

downsample_ratio instance-attribute

downsample_ratio = downsample_ratio

fm_modules instance-attribute

fm_modules = ModuleDict(
    {
        "vision_model_mot_gen": vision_model_mot_gen,
        "timestep_embedder": timestep_embedder,
        "fm_head": fm_head,
    }
)

img_context_token_id instance-attribute

img_context_token_id = convert_tokens_to_ids(
    IMG_CONTEXT_TOKEN
)

img_start_token_id instance-attribute

img_start_token_id = convert_tokens_to_ids(IMG_START_TOKEN)

language_model instance-attribute

language_model = SenseNovaU1ForCausalLM(
    llm_cfg, prefix="language_model"
)

local_model_path instance-attribute

local_model_path = _resolve_model_path(model_path)

merge_size instance-attribute

merge_size = merge_size

od_config instance-attribute

od_config = od_config

patch_size instance-attribute

patch_size = patch_size

support_image_input class-attribute instance-attribute

support_image_input = True

tokenizer instance-attribute

tokenizer = from_pretrained(local_model_path)

vision_model instance-attribute

vision_model = NEOVisionModel(vis_cfg)

weights_sources instance-attribute

weights_sources = [
    ComponentSource(
        model_or_path=local_model_path,
        subfolder=None,
        revision=revision,
        prefix="",
        fall_back_to_pt=False,
    )
]

forward

load_weights

load_weights(
    weights: Iterable[tuple[str, Tensor]],
) -> set[str]
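The signature suggests the usual loader contract: consume (name, tensor) pairs and return the names that were actually loaded. The sketch below illustrates that contract with a plain dict standing in for the module's parameters and `Any` standing in for Tensor. The function `load_weights_sketch` is hypothetical, and the real method may additionally remap names or handle tensor-parallel sharding.

```python
from typing import Any, Iterable


def load_weights_sketch(
    params: dict[str, Any],
    weights: Iterable[tuple[str, Any]],
) -> set[str]:
    """Hypothetical stand-in: copy matching weights, report which were loaded."""
    loaded: set[str] = set()
    for name, tensor in weights:
        if name not in params:
            # Unknown checkpoint entries are skipped in this sketch.
            continue
        params[name] = tensor
        loaded.add(name)
    return loaded


params = {"fm_head.weight": None, "timestep_embedder.weight": None}
loaded = load_weights_sketch(
    params,
    [("fm_head.weight", [1.0, 2.0]), ("unused.bias", [0.0])],
)
# Names still missing after loading can be reported to the caller.
missing = set(params) - loaded
```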

get_sensenova_u1_post_process_func

get_sensenova_u1_post_process_func(
    od_config: OmniDiffusionConfig,
)