Memory Profiling

Module Contents

class vllm.multimodal.profiling.ProcessorInputs(prompt_text: str, mm_data: collections.abc.Mapping[str, typing.Any | list[typing.Any]], hf_processor_mm_kwargs: collections.abc.Mapping[str, object] = <factory>)

Represents the keyword arguments to vllm.multimodal.processing.BaseMultiModalProcessor.apply().
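To make the shape of this container concrete, here is a minimal sketch that mirrors the three documented fields in a local dataclass and unpacks them as keyword arguments. It assumes the real class behaves like a plain dataclass, which the `<factory>` default suggests. `ProcessorInputsSketch` and `fake_apply` are hypothetical stand-ins written for this example and are not part of vLLM.

```python
from collections.abc import Mapping
from dataclasses import dataclass, field
from typing import Any, Union


@dataclass
class ProcessorInputsSketch:
    """Local stand-in mirroring the documented fields (not the vLLM class)."""

    prompt_text: str
    mm_data: Mapping[str, Union[Any, list[Any]]]
    # "<factory>" in the signature above indicates a per-instance default;
    # an empty dict is assumed here.
    hf_processor_mm_kwargs: Mapping[str, object] = field(default_factory=dict)


def fake_apply(prompt_text, mm_data, hf_processor_mm_kwargs):
    """Hypothetical stand-in for a processor's apply(); returns a summary."""
    return {
        "prompt": prompt_text,
        "modalities": sorted(mm_data),
        "num_kwargs": len(hf_processor_mm_kwargs),
    }


inputs = ProcessorInputsSketch(
    prompt_text="<image> Describe the picture.",
    mm_data={"image": ["placeholder-image-object"]},
)

# vars() exposes the instance's fields as a dict, which unpacks directly
# into keyword arguments.
summary = fake_apply(**vars(inputs))
print(summary)
```

Keeping the three values in one object means a builder can return everything a processor call needs in a single value.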

class vllm.multimodal.profiling.BaseDummyInputsBuilder(info: _I)

Abstract base class that constructs the dummy data to profile multi-modal models.

abstract get_dummy_processor_inputs(seq_len: int, mm_counts: Mapping[str, int]) → ProcessorInputs

Build the input which, after processing, results in the number of placeholder tokens given by self.info.get_mm_max_tokens_per_item().
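The subclassing pattern this method implies can be sketched as follows. The example is self-contained and does not import vLLM: `ProcessorInputsSketch`, `DummyInputsBuilderSketch`, `ToyImageBuilder`, the `"<image>"` prompt tag, and the `max_image_size` key are all hypothetical names chosen for illustration. A real builder would read its limits from the model's own processing info object and return real image, audio, or video data.

```python
from abc import ABC, abstractmethod
from collections.abc import Mapping
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ProcessorInputsSketch:
    """Local stand-in for the documented ProcessorInputs container."""

    prompt_text: str
    mm_data: Mapping[str, Any]
    hf_processor_mm_kwargs: Mapping[str, object] = field(default_factory=dict)


class DummyInputsBuilderSketch(ABC):
    """Stand-in for the abstract builder; `info` describes the model."""

    def __init__(self, info: Any) -> None:
        self.info = info

    @abstractmethod
    def get_dummy_processor_inputs(
        self, seq_len: int, mm_counts: Mapping[str, int]
    ) -> ProcessorInputsSketch:
        ...


class ToyImageBuilder(DummyInputsBuilderSketch):
    """Hypothetical builder for a model whose prompt uses an "<image>" tag."""

    def get_dummy_processor_inputs(self, seq_len, mm_counts):
        # seq_len is accepted to match the interface; this toy ignores it.
        num_images = mm_counts.get("image", 0)
        # Assumed info entry: the largest image size the model accepts,
        # so each dummy item costs as many placeholder tokens as possible.
        width, height = self.info["max_image_size"]
        # Images are described by shape only to keep the sketch dependency-free.
        images = [{"width": width, "height": height} for _ in range(num_images)]
        return ProcessorInputsSketch(
            prompt_text="<image>" * num_images,
            mm_data={"image": images},
        )


builder = ToyImageBuilder(info={"max_image_size": (448, 448)})
dummy = builder.get_dummy_processor_inputs(seq_len=2048, mm_counts={"image": 2})
print(dummy.prompt_text, len(dummy.mm_data["image"]))
```

The design point is that the builder produces worst-case inputs: one maximum-size item per counted modality instance, so that profiling with the result reflects peak memory use.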

class vllm.multimodal.profiling.MultiModalProfiler(processor: BaseMultiModalProcessor[_I])

Contains code for running memory profiling for multi-modal models.