vllm.model_executor.models.adapters
_GENERATE_SUFFIXES
module-attribute
_create_pooling_model_cls
_create_pooling_model_cls(
    orig_cls: _T,
    *,
    default_pooling_type: PoolingType,
    default_normalize: bool,
    default_softmax: bool,
) -> _T
Source code in vllm/model_executor/models/adapters.py
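The signature above takes a class and returns a class of the same type, with the pooling defaults passed as keyword-only arguments. A minimal sketch of that subclass-factory pattern follows. It is not vLLM's implementation: `create_pooling_cls`, `ToyModel`, the `"Pooling"` name suffix, and the stand-in `PoolingType` enum with its members are all hypothetical names chosen for illustration.

```python
from enum import Enum


class PoolingType(Enum):
    # Hypothetical stand-in for vLLM's PoolingType; members are illustrative.
    LAST = "last"
    ALL = "all"


def create_pooling_cls(orig_cls, *, default_pooling_type,
                       default_normalize, default_softmax):
    """Return a subclass of orig_cls that carries pooling defaults."""

    class PoolingModel(orig_cls):
        # Defaults are stored as class attributes on the derived class.
        pooling_type = default_pooling_type
        normalize = default_normalize
        softmax = default_softmax

    # Assumed naming scheme, only so the derived class is easy to identify.
    PoolingModel.__name__ = orig_cls.__name__ + "Pooling"
    return PoolingModel


class ToyModel:
    """Hypothetical base model that emits one hidden vector per token."""

    def forward(self, tokens):
        return [[float(t)] for t in tokens]


ToyEmbedding = create_pooling_cls(
    ToyModel,
    default_pooling_type=PoolingType.LAST,
    default_normalize=True,
    default_softmax=False,
)
```

The derived class keeps the behavior of the original class and only adds configuration, which is why the return type in the signature matches the input type.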
_get_pooling_model_name
as_classification_model
Subclass an existing vLLM model to support classification.
By default, the class probabilities are extracted from the softmaxed hidden state corresponding to the last token.
Note
We assume that the classification head is a single linear layer stored as the attribute score of the top-level model; please implement your own model if this is not the case.
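The default described above can be sketched in plain Python: take the hidden state of the last token, apply a single linear layer, and softmax the result. This is an illustrative sketch, not vLLM code. The helper names `softmax` and `classify_last_token`, the bias-free linear layer, and the toy inputs are assumptions; `score_weight` stands in for the weights of the score head.

```python
import math


def softmax(logits):
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_last_token(hidden_states, score_weight):
    """hidden_states: one vector per token; score_weight: one row per class."""
    last = hidden_states[-1]
    # Single linear layer (no bias assumed): one dot product per class.
    logits = [sum(w * h for w, h in zip(row, last)) for row in score_weight]
    return softmax(logits)


# Toy data: three tokens, hidden size 2, two classes.
hidden_states = [[0.1, 0.2], [0.5, -0.3], [1.0, 0.0]]
score_weight = [[1.0, 0.0], [0.0, 1.0]]
probs = classify_last_token(hidden_states, score_weight)
```

Only the last token's vector influences the result here; the earlier rows of `hidden_states` are ignored.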
as_embedding_model
Subclass an existing vLLM model to support embeddings.
By default, the embeddings of the whole prompt are extracted from the normalized hidden state corresponding to the last token.
Note
We assume that no extra layers are added to the original model; please implement your own model if this is not the case.
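As a rough illustration of the default described above, the sketch below picks the last token's hidden state and scales it to unit length. It assumes that "normalized" means L2 normalization; the function name `embed_last_token` and the toy input are hypothetical and not part of vLLM.

```python
import math


def embed_last_token(hidden_states):
    """Return the L2-normalized hidden state of the last token."""
    last = hidden_states[-1]
    norm = math.sqrt(sum(x * x for x in last))
    if norm == 0.0:
        # Avoid dividing by zero; return the zero vector unchanged.
        return list(last)
    return [x / norm for x in last]


# Toy data: two tokens, hidden size 2.
hidden_states = [[1.0, 2.0], [3.0, 4.0]]
embedding = embed_last_token(hidden_states)
```

The result has the same dimensionality as the hidden state, which is consistent with the note that no extra layers are added on top of the original model.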
as_reward_model
Subclass an existing vLLM model to support reward modeling.
By default, we return the hidden states of each token directly.
Note
We assume that no extra layers are added to the original model; please implement your own model if this is not the case.
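In contrast to the last-token defaults used for classification and embeddings, the reward default keeps every token. A minimal sketch, assuming the output is simply one unmodified vector per input token; `reward_outputs` and the toy input are hypothetical names for illustration only.

```python
def reward_outputs(hidden_states):
    """Return one hidden-state vector per token, with no pooling or scaling."""
    # Copy each row so callers cannot mutate the model's own buffers.
    return [list(token_state) for token_state in hidden_states]


# Toy data: three tokens, hidden size 2.
hidden_states = [[0.1, 0.2], [0.5, -0.3], [1.0, 0.0]]
per_token = reward_outputs(hidden_states)
```

The output length equals the number of tokens, so any reduction to a single score is left to the caller.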