Registry
Module Contents
- class vllm.multimodal.registry.ProcessingInfoFactory(*args, **kwargs)
Constructs a BaseProcessingInfo instance from the context.
- class vllm.multimodal.registry.DummyInputsBuilderFactory(*args, **kwargs)
Constructs a BaseDummyInputsBuilder instance from the context.
- class vllm.multimodal.registry.MultiModalProcessorFactory(*args, **kwargs)
Constructs a MultiModalProcessor instance from the context.
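The three factory types above share one shape: a callable that builds its object from a context. A minimal sketch of that pattern with `typing.Protocol` follows. `ToyContext`, `ToyInfo`, `ToyInfoFactory`, and `build_info` are hypothetical names for illustration and are not part of vLLM.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ToyContext:
    """Hypothetical stand-in for the context handed to a factory."""
    model_name: str


@dataclass
class ToyInfo:
    """Hypothetical stand-in for the object a factory constructs."""
    ctx: ToyContext


class ToyInfoFactory(Protocol):
    """Any callable that builds a ToyInfo from a context satisfies this."""

    def __call__(self, ctx: ToyContext) -> ToyInfo: ...


def build_info(factory: ToyInfoFactory, ctx: ToyContext) -> ToyInfo:
    # Construction is deferred until the factory is actually called.
    return factory(ctx)


# A class is itself a valid factory: calling it constructs an instance.
info = build_info(ToyInfo, ToyContext(model_name="toy-model"))
```

Because the protocol only constrains the call signature, either a class or a plain function can serve as the factory.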
- class vllm.multimodal.registry.MultiModalRegistry
A registry that dispatches data processing according to the model.
- get_max_tokens_per_item_by_modality(model_config: ModelConfig) -> Mapping[str, int]
Get the maximum number of tokens per data item for each modality, based on the underlying model configuration.
- get_max_tokens_per_item_by_nonzero_modality(model_config: ModelConfig) -> Mapping[str, int]
Get the maximum number of tokens per data item for each modality, based on the underlying model configuration, excluding modalities that the user explicitly disabled via limit_mm_per_prompt.
Note
This is currently used directly only in V1 for profiling the memory usage of a model.
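The difference between the two methods above can be sketched as a filter over a per-modality mapping. This is an illustration only, not vLLM's implementation. It assumes that "disabled" means a per-prompt limit of 0, and `drop_disabled_modalities` and the token counts are hypothetical.

```python
from typing import Mapping


def drop_disabled_modalities(
    max_tokens_per_item: Mapping[str, int],
    limit_mm_per_prompt: Mapping[str, int],
) -> dict[str, int]:
    """Keep only modalities whose per-prompt limit is greater than zero."""
    return {
        modality: max_tokens
        for modality, max_tokens in max_tokens_per_item.items()
        # Assumption: a missing entry means the modality is still enabled.
        if limit_mm_per_prompt.get(modality, 1) > 0
    }


per_item = {"image": 576, "video": 4608}  # illustrative values
limits = {"image": 2, "video": 0}  # video explicitly disabled by the user
enabled = drop_disabled_modalities(per_item, limits)
```

Here `enabled` keeps only the `"image"` entry, so a disabled modality contributes nothing to memory profiling.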
- get_max_tokens_by_modality(model_config: ModelConfig) -> Mapping[str, int]
Get the maximum number of tokens from each modality for profiling the memory usage of a model.
See MultiModalPlugin.get_max_multimodal_tokens() for more details.
- get_max_multimodal_tokens(model_config: ModelConfig) -> int
Get the maximum number of multi-modal tokens for profiling the memory usage of a model.
See MultiModalPlugin.get_max_multimodal_tokens() for more details.
- get_mm_limits_per_prompt(model_config: ModelConfig) -> Mapping[str, int]
Get the maximum number of multi-modal input instances for each modality that are allowed per prompt for a model class.
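One plausible way the per-item maximums, the per-prompt limits, and the profiling totals relate is sketched below. It assumes that a modality's budget is its per-item maximum multiplied by its per-prompt limit, and that the overall figure is the sum across modalities. That relationship is an assumption of this sketch rather than something stated above, and the function names and numbers are hypothetical.

```python
from typing import Mapping


def max_tokens_by_modality(
    per_item: Mapping[str, int],
    limits: Mapping[str, int],
) -> dict[str, int]:
    """Per-modality budget: per-item maximum times allowed items per prompt."""
    # Assumption: every modality in per_item also has an entry in limits.
    return {modality: limits[modality] * tokens
            for modality, tokens in per_item.items()}


def max_multimodal_tokens(
    per_item: Mapping[str, int],
    limits: Mapping[str, int],
) -> int:
    """Overall budget: the sum of the per-modality budgets."""
    return sum(max_tokens_by_modality(per_item, limits).values())


per_item = {"image": 576, "audio": 100}  # illustrative values
limits = {"image": 2, "audio": 1}
by_modality = max_tokens_by_modality(per_item, limits)
total = max_multimodal_tokens(per_item, limits)
```

With these inputs, two images at 576 tokens each plus one audio clip at 100 tokens give a total of 1252.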
- register_processor(processor: MultiModalProcessorFactory[_I], *, info: ProcessingInfoFactory[_I], dummy_inputs: DummyInputsBuilderFactory[_I])
Register a multi-modal processor to a model class. The processor is constructed lazily, hence a factory method should be passed.
When the model receives multi-modal data, the provided function is invoked to transform the data into a dictionary of model inputs.
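The lazy-construction idea can be sketched with a toy registry that stores only the factory at registration time and builds the processor on demand. Everything here (`ToyRegistry`, `ToyProcessor`, `ToyModel`, and the simplified signatures) is hypothetical and much smaller than the real API. Writing registration as a class decorator is also an assumption of the sketch.

```python
from typing import Callable, Dict


class ToyProcessor:
    """Hypothetical processor: turns raw data into a dict of model inputs."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    def apply(self, data: str) -> dict:
        return {"model": self.model_name, "inputs": data.upper()}


class ToyRegistry:
    """Hypothetical registry mapping model classes to processor factories."""

    def __init__(self) -> None:
        self._factories: Dict[type, Callable[[str], ToyProcessor]] = {}

    def register_processor(self, factory: Callable[[str], ToyProcessor]):
        # Used as a class decorator; only the factory is stored here.
        def wrapper(model_cls: type) -> type:
            self._factories[model_cls] = factory
            return model_cls
        return wrapper

    def create_processor(self, model_cls: type, model_name: str) -> ToyProcessor:
        # The processor is built only now, when it is actually needed.
        return self._factories[model_cls](model_name)


REGISTRY = ToyRegistry()


@REGISTRY.register_processor(ToyProcessor)
class ToyModel:
    pass


processor = REGISTRY.create_processor(ToyModel, "toy-model")
result = processor.apply("pixels")
```

Storing a factory rather than an instance means registration can happen at import time, before any model configuration or tokenizer exists.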
- create_processor(model_config: ModelConfig, *, tokenizer: transformers.PreTrainedTokenizer | transformers.PreTrainedTokenizerFast | TokenizerBase | None = None, disable_cache: bool | None = None) -> BaseMultiModalProcessor[BaseProcessingInfo]
Create a multi-modal processor for a specific model and tokenizer.