Registry#
Module Contents#
- class vllm.multimodal.registry.ProcessingInfoFactory(*args, **kwargs)[source]#
Constructs a BaseProcessingInfo instance from the context.
- class vllm.multimodal.registry.DummyInputsBuilderFactory(*args, **kwargs)[source]#
Constructs a BaseDummyInputsBuilder instance from the context.
- class vllm.multimodal.registry.MultiModalProcessorFactory(*args, **kwargs)[source]#
Constructs a MultiModalProcessor instance from the context.
- class vllm.multimodal.registry.MultiModalRegistry(*, plugins: Sequence[MultiModalPlugin] = DEFAULT_PLUGINS)[source]#
A registry that dispatches data processing according to the model.
- register_plugin(plugin: MultiModalPlugin) → None[source]#
Register a multi-modal plugin so it can be recognized by vLLM.
- register_input_mapper(data_type_key: str, mapper: Callable[[InputContext, object | list[object]], MultiModalKwargs] | None = None)[source]#
Register an input mapper for a specific modality to a model class.
See MultiModalPlugin.register_input_mapper() for more details.
- register_image_input_mapper(mapper: Callable[[InputContext, object | list[object]], MultiModalKwargs] | None = None)[source]#
Register an input mapper for image data to a model class.
See MultiModalPlugin.register_input_mapper() for more details.
- map_input(model_config: ModelConfig, data: Mapping[str, Any | list[Any]], mm_processor_kwargs: Dict[str, Any] | None = None) → MultiModalKwargs[source]#
Apply an input mapper to the data passed to the model.
The data belonging to each modality is passed to the corresponding plugin, which in turn converts the data into keyword arguments via the input mapper registered for that model.
See MultiModalPlugin.map_input() for more details.
Note
This should be called after init_mm_limits_per_prompt().
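The per-modality dispatch that map_input performs can be pictured with a small self-contained sketch. The names below (`_mappers`, the toy `register_input_mapper`/`map_input` functions) are hypothetical stand-ins for illustration, not vLLM's actual classes:

```python
from typing import Any, Callable, Dict, Mapping

# Hypothetical stand-in for the per-modality mapper registry that
# MultiModalRegistry maintains through its plugins.
_mappers: Dict[str, Callable[[Any], Dict[str, Any]]] = {}

def register_input_mapper(modality: str,
                          mapper: Callable[[Any], Dict[str, Any]]) -> None:
    """Associate a mapper with a modality key (e.g. "image")."""
    _mappers[modality] = mapper

def map_input(data: Mapping[str, Any]) -> Dict[str, Any]:
    """Dispatch each modality's data to its registered mapper and merge
    the resulting keyword arguments, analogous to how the real method
    produces MultiModalKwargs."""
    merged: Dict[str, Any] = {}
    for modality, item in data.items():
        if modality not in _mappers:
            raise KeyError(f"No input mapper registered for {modality!r}")
        merged.update(_mappers[modality](item))
    return merged

# Register a toy image mapper that "preprocesses" the raw data.
register_input_mapper("image", lambda img: {"pixel_values": [img]})
kwargs = map_input({"image": "raw-image-bytes"})
```

The merged dictionary is what ultimately reaches the model as keyword arguments; unregistered modalities fail loudly rather than being silently dropped.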
- create_input_mapper(model_config: ModelConfig)[source]#
Create an input mapper (see map_input()) for a specific model.
- register_max_multimodal_tokens(data_type_key: str, max_mm_tokens: int | Callable[[InputContext], int] | None = None)[source]#
Register, for a model class, the maximum number of tokens that a single instance of multi-modal data of a given modality contributes to the language model input.
- register_max_image_tokens(max_mm_tokens: int | Callable[[InputContext], int] | None = None)[source]#
Register, for a model class, the maximum number of tokens that a single image contributes to the language model input.
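As the signature suggests, the registered value can be either a fixed integer or a callable resolved against the context at query time. A minimal illustrative sketch, assuming a toy per-modality store (`_max_tokens` and both functions below are hypothetical, not vLLM internals):

```python
from typing import Callable, Dict, Union

MaxTokens = Union[int, Callable[[dict], int]]

# Hypothetical per-modality store; in vLLM this state lives inside
# each modality's plugin.
_max_tokens: Dict[str, MaxTokens] = {}

def register_max_multimodal_tokens(modality: str,
                                   max_mm_tokens: MaxTokens) -> None:
    """Store either a fixed limit or a context-dependent callable."""
    _max_tokens[modality] = max_mm_tokens

def resolve_max_tokens(modality: str, ctx: dict) -> int:
    """Call the registered value with the context if it is callable,
    otherwise return the fixed integer as-is."""
    value = _max_tokens[modality]
    return value(ctx) if callable(value) else value

register_max_multimodal_tokens("image", 576)  # fixed limit
register_max_multimodal_tokens("video", lambda ctx: 32 * ctx["frames"])
```

The callable form is useful when the token count depends on model configuration, e.g. the number of video frames per instance.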
- get_max_tokens_per_item_by_modality(model_config: ModelConfig) → Mapping[str, int][source]#
Get the maximum number of tokens per data item for each modality, based on the underlying model configuration.
- get_max_tokens_per_item_by_nonzero_modality(model_config: ModelConfig) → Mapping[str, int][source]#
Get the maximum number of tokens per data item for each modality, based on the underlying model configuration, excluding modalities that the user has explicitly disabled via limit_mm_per_prompt.
Note
This is currently used directly only in V1 for profiling the memory usage of a model.
- get_max_tokens_by_modality(model_config: ModelConfig) → Mapping[str, int][source]#
Get the maximum number of tokens from each modality for profiling the memory usage of a model.
See MultiModalPlugin.get_max_multimodal_tokens() for more details.
Note
This should be called after init_mm_limits_per_prompt().
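Conceptually, the per-modality budget combines the per-item token maximum with the per-prompt item limit, which is also why the limits must be initialized first. A toy sketch of that combination (the function and sample numbers below are illustrative assumptions, not vLLM's actual implementation):

```python
from typing import Dict

def get_max_tokens_by_modality(
    max_per_item: Dict[str, int],  # cf. get_max_tokens_per_item_by_modality
    limits: Dict[str, int],        # cf. get_mm_limits_per_prompt
) -> Dict[str, int]:
    """Scale each modality's per-item token maximum by the number of
    items allowed per prompt, giving a worst-case token budget of the
    kind used for memory profiling."""
    return {m: max_per_item[m] * limits.get(m, 1) for m in max_per_item}

# Toy numbers: up to 2 images of 576 tokens, 1 audio clip of 188 tokens.
budget = get_max_tokens_by_modality({"image": 576, "audio": 188},
                                    {"image": 2, "audio": 1})
```

Summing the resulting values then gives a single worst-case multi-modal token count, matching the role of get_max_multimodal_tokens() described below.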
- get_max_multimodal_tokens(model_config: ModelConfig) → int[source]#
Get the maximum number of multi-modal tokens for profiling the memory usage of a model.
See MultiModalPlugin.get_max_multimodal_tokens() for more details.
Note
This should be called after init_mm_limits_per_prompt().
- init_mm_limits_per_prompt(model_config: ModelConfig) → None[source]#
Initialize the maximum number of multi-modal input instances for each modality that are allowed per prompt for a model class.
- get_mm_limits_per_prompt(model_config: ModelConfig) → Mapping[str, int][source]#
Get the maximum number of multi-modal input instances for each modality that are allowed per prompt for a model class.
Note
This should be called after init_mm_limits_per_prompt().
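The init-then-get ordering the notes insist on can be sketched with a toy cache. Everything here (the `_limits_by_model` dict, keying on a model ID string, the default of 1 item per modality) is a hypothetical simplification for illustration:

```python
from typing import Dict, Mapping, Optional

# Hypothetical cache keyed by a model ID string; vLLM keys its cache
# on the ModelConfig instead.
_limits_by_model: Dict[str, Mapping[str, int]] = {}

def init_mm_limits_per_prompt(
    model_id: str,
    limit_mm_per_prompt: Optional[Mapping[str, int]] = None,
) -> None:
    """Record how many items of each modality a single prompt may
    carry, falling back to a default of 1 per modality."""
    defaults = {"image": 1, "video": 1, "audio": 1}
    _limits_by_model[model_id] = {**defaults, **(limit_mm_per_prompt or {})}

def get_mm_limits_per_prompt(model_id: str) -> Mapping[str, int]:
    """Look up the limits; fails if init was never called, which is
    why the docs say to call init_mm_limits_per_prompt() first."""
    if model_id not in _limits_by_model:
        raise KeyError(f"Limits not initialized for {model_id!r}")
    return _limits_by_model[model_id]

init_mm_limits_per_prompt("example-model", {"image": 4})
limits = get_mm_limits_per_prompt("example-model")
```

Calling the getter before the initializer raises, mirroring the ordering requirement stated in the notes above.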
- register_processor(processor: MultiModalProcessorFactory[_I], *, info: ProcessingInfoFactory[_I], dummy_inputs: DummyInputsBuilderFactory[_I])[source]#
Register a multi-modal processor to a model class. The processor is constructed lazily, so a factory method should be passed instead of an instance.
When the model receives multi-modal data, the provided function is invoked to transform the data into a dictionary of model inputs.
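The point of passing a factory rather than an instance is that nothing is built at registration time. A minimal self-contained sketch of that lazy pattern (the names and the single-factory simplification are hypothetical; vLLM stores the processor, info, and dummy-inputs factories together, keyed on the model class):

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for the registry's factory table.
_factories: Dict[str, Callable[[Any], Any]] = {}
construction_count = 0

def register_processor(model_name: str,
                       factory: Callable[[Any], Any]) -> None:
    """Registration only stores the factory; nothing is built yet."""
    _factories[model_name] = factory

def create_processor(model_name: str, ctx: Any) -> Any:
    """The factory runs here, on demand, which is what 'constructed
    lazily' means in the docstring above."""
    return _factories[model_name](ctx)

def example_factory(ctx: Any) -> str:
    global construction_count
    construction_count += 1
    return f"processor(info={ctx})"

register_processor("ExampleModelForConditionalGeneration", example_factory)
assert construction_count == 0  # still lazy after registration
processor = create_processor("ExampleModelForConditionalGeneration", "ctx")
```

Lazy construction lets models be registered at import time without paying the cost of building processors for models that are never loaded.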
- has_processor(model_config: ModelConfig) → bool[source]#
Test whether a multi-modal processor is defined for a specific model.
- create_processor(model_config: ModelConfig, tokenizer: transformers.PreTrainedTokenizer | transformers.PreTrainedTokenizerFast | MistralTokenizer) → BaseMultiModalProcessor[BaseProcessingInfo][source]#
Create a multi-modal processor for a specific model and tokenizer.