vllm_omni.diffusion.models.internvla_a1 ¶
InternVLA-A1 diffusion model components.
Modules:
| Name | Description |
|---|---|
| adapter_qwen3_vl | |
| config | |
| cosmos_ci_torch | |
| model_cosmos | |
| model_internvla_a1 | |
| pipeline_internvla_a1 | |
InternVLAA1 ¶
Bases: Module
cosmos instance-attribute ¶
cosmos = ImageTokenizer(
checkpoint_enc=str(cosmos_encoder_path),
checkpoint_dec=str(cosmos_decoder_path),
device=device,
)
cosmos_in_proj instance-attribute ¶
downsample_conv instance-attribute ¶
qwen3_vl_with_expert instance-attribute ¶
qwen3_vl_with_expert = Qwen3VLWithExpertModel(
vlm_config, action_expert_config, precision=dtype
)
upsample_conv instance-attribute ¶
denoise_step ¶
denoise_step(
state: Tensor,
prefix_pad_masks: Tensor,
past_key_values: Any,
max_prefix_position_ids: Tensor,
x_t: Tensor,
timestep: Tensor,
) -> Tensor
denoise_step_optimized ¶
denoise_step_optimized(
suffix_static: SuffixStaticContext,
past_key_values: Any,
x_t: Tensor,
timestep: Tensor,
) -> Tensor
embed_prefix ¶
embed_prefix(
pixel_values: Tensor,
image_grid_thw: Tensor,
lang_tokens: Tensor,
lang_masks: Tensor,
) -> tuple[Tensor, Tensor, Tensor]
embed_suffix ¶
embed_suffix(
state: Tensor, noisy_actions: Tensor, timestep: Tensor
) -> tuple[Tensor, Tensor, Tensor]
get_position_ids ¶
get_position_ids(
lang_tokens: Tensor,
image_grid_thw: Tensor | None,
pad_masks: Tensor,
) -> tuple[Tensor, Any]
prepare_suffix_static_context ¶
prepare_suffix_static_context(
state: Tensor,
prefix_pad_masks: Tensor,
max_prefix_position_ids: Tensor,
) -> SuffixStaticContext
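Comparing the signatures above, `denoise_step` receives `state`, `prefix_pad_masks` and `max_prefix_position_ids` on every call, while `denoise_step_optimized` receives a `SuffixStaticContext` built once by `prepare_suffix_static_context` from those same three inputs. The toy sketch below illustrates that hoisting pattern only. `ToyStaticContext`, `prepare_static`, `step_plain` and `step_optimized` are hypothetical stand-ins that use plain floats, not the real tensors or model code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyStaticContext:
    """Hypothetical stand-in for SuffixStaticContext."""

    bias: float


def prepare_static(state: float, prefix_len: int) -> ToyStaticContext:
    # Everything that does not depend on x_t or timestep is computed once.
    return ToyStaticContext(bias=state * prefix_len)


def step_plain(state: float, prefix_len: int, x_t: float, timestep: float) -> float:
    # Recomputes the step-invariant term on every call.
    return x_t * timestep + state * prefix_len


def step_optimized(ctx: ToyStaticContext, x_t: float, timestep: float) -> float:
    # Reuses the precomputed term; only x_t and timestep vary per step.
    return x_t * timestep + ctx.bias


def run_both(state: float, prefix_len: int, x_t: float, timesteps: list[float]):
    """Run both variants over the same timesteps and return their outputs."""
    ctx = prepare_static(state, prefix_len)  # built once, outside the loop
    plain = [step_plain(state, prefix_len, x_t, t) for t in timesteps]
    optimized = [step_optimized(ctx, x_t, t) for t in timesteps]
    return plain, optimized
```

In this toy setting both variants return identical values. Whether the real optimized path is numerically identical to `denoise_step` is not stated in this reference.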
InternVLAA1Config dataclass ¶
Standalone-compatible InternVLA-A1 config, including a few defaults intended for fake smoke tests.
enable_suffix_static_context_optimization class-attribute instance-attribute ¶
enable_suffix_static_context_optimization: bool = False
image_resolution class-attribute instance-attribute ¶
input_features class-attribute instance-attribute ¶
output_features class-attribute instance-attribute ¶
from_model_config classmethod ¶
from_model_config(
model_config: dict[str, Any] | None,
) -> InternVLAA1Config
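The signature shows that `from_model_config` accepts either a plain dict or `None`. A minimal sketch of how such a classmethod is commonly written follows. `SketchConfig` is a hypothetical stand-in that mirrors only the one documented field, and both the "None means defaults" behavior and the dropping of unknown keys are assumptions, not confirmed behavior of `InternVLAA1Config`.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any


@dataclass
class SketchConfig:
    """Hypothetical stand-in for InternVLAA1Config."""

    # Mirrors the documented field and its default.
    enable_suffix_static_context_optimization: bool = False

    @classmethod
    def from_model_config(cls, model_config: dict[str, Any] | None) -> SketchConfig:
        # Assumption: None means "use every default".
        if model_config is None:
            return cls()
        # Assumption: keys that are not dataclass fields are silently ignored.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in model_config.items() if k in known})
```

Filtering against `dataclasses.fields` keeps the constructor from raising `TypeError` when the incoming dict carries extra keys.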
InternVLAA1Pipeline ¶
Bases: Module, DiffusionPipelineProfilerMixin
InternVLA-A1 pipeline wrapper for the policy implementation.
enable_warmup instance-attribute ¶
enable_warmup = (
bool(enable_warmup)
if isinstance(enable_warmup, bool)
else False
)
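The expression above accepts only genuine `bool` values; anything else, including truthy strings or integers, falls back to `False`. The helper below isolates the same expression so the behavior can be checked directly. The function name `coerce_enable_warmup` is hypothetical and does not appear in the module.

```python
def coerce_enable_warmup(value: object) -> bool:
    """Same expression as the attribute assignment, wrapped for illustration."""
    # Only real bools pass through; every other type yields False.
    return bool(value) if isinstance(value, bool) else False
```

Under this rule a string such as `"true"` or the integer `1` does not enable warmup, so callers need to pass an actual `True`.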
processor_model_name instance-attribute ¶
processor_model_name = str(
get("processor_model_name", DEFAULT_QWEN3_VL_MODEL)
)
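The receiver of `get` is elided in the rendered source. Assuming it is a dict-like options mapping, the lookup can be sketched as below. `resolve_processor_model_name` and the placeholder default string are hypothetical; the real value of `DEFAULT_QWEN3_VL_MODEL` is not shown in this reference.

```python
from __future__ import annotations

from typing import Any

# Placeholder only; the actual DEFAULT_QWEN3_VL_MODEL value is not documented here.
DEFAULT_MODEL_PLACEHOLDER = "<default-qwen3-vl-model>"


def resolve_processor_model_name(options: dict[str, Any]) -> str:
    # Assumption: `get` in the source is `dict.get` on an options mapping.
    return str(options.get("processor_model_name", DEFAULT_MODEL_PLACEHOLDER))
```

Under this reading, the default applies only when the key is absent. An explicit `None` value would be stringified to `"None"` rather than replaced by the default.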
InternVLAA1Policy ¶
Bases: Module
input_builder instance-attribute ¶
input_builder = Qwen3VLInputBuilder(
processor_model_name=processor_model_name,
max_length=tokenizer_max_length,
)
model instance-attribute ¶
model = InternVLAA1(
config,
cosmos_encoder_path=cosmos_encoder_path,
cosmos_decoder_path=cosmos_decoder_path,
)
forward ¶
forward(
batch: dict[str, Any],
*,
noise: Tensor | None = None,
decode_image: bool = False,
) -> tuple[Tensor, Tensor | None]
from_pretrained classmethod ¶
from_pretrained(
checkpoint_dir: str | Path,
*,
config: InternVLAA1Config | None = None,
processor_model_name: str = DEFAULT_QWEN3_VL_MODEL,
strict: bool = False,
) -> InternVLAA1Policy
InternVLAA1TrainMetadata dataclass ¶
processor_model_name class-attribute instance-attribute ¶
processor_model_name: str = DEFAULT_QWEN3_VL_MODEL
from_pretrained classmethod ¶
from_pretrained(
checkpoint_dir: str | Path,
) -> InternVLAA1TrainMetadata
get_internvla_a1_post_process_func ¶
get_internvla_a1_post_process_func(
od_config: OmniDiffusionConfig,
)