vllm.v1.structured_output.backend_types
Classes:
- StructuredOutputBackend – Engine-level backend for structured output requests.
- StructuredOutputGrammar – Request-level backend for structured output requests.
StructuredOutputBackend dataclass
Bases: ABC
Engine-level backend for structured output requests.
Methods:
- allocate_token_bitmask – Allocates a token bitmask for the specified maximum number of sequences.
- compile_grammar – Compiles a grammar specification into a structured output grammar.
- destroy – Backend-specific cleanup.
Source code in vllm/v1/structured_output/backend_types.py
allocate_token_bitmask(max_num_seqs) abstractmethod
Allocates a token bitmask for the specified maximum number of sequences.
Parameters:
- max_num_seqs (int) – The maximum number of sequences for which to allocate the bitmask.
compile_grammar(request_type, grammar_spec) abstractmethod
Compiles a grammar specification into a structured output grammar.
Parameters:
- request_type (StructuredOutputOptions) – The type of structured output request.
- grammar_spec (str) – The grammar specification to compile.
Returns:
- StructuredOutputGrammar – The compiled structured output grammar.
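The three backend methods can be illustrated with a toy class. This is a minimal sketch, not the vLLM implementation: `ToyBackend`, `ToyGrammar`, the `"choice"` request type, and the comma-separated spec format are all hypothetical, a plain `str` stands in for `StructuredOutputOptions`, and nested Python lists stand in for whatever bitmask type a real backend allocates.

```python
from dataclasses import dataclass


class ToyGrammar:
    """Hypothetical stand-in for a compiled StructuredOutputGrammar."""

    def __init__(self, allowed_token_ids: set[int]) -> None:
        self.allowed_token_ids = allowed_token_ids


@dataclass
class ToyBackend:
    """Hypothetical engine-level backend: one instance serves many requests."""

    vocab_size: int

    def allocate_token_bitmask(self, max_num_seqs: int) -> list[list[bool]]:
        # One row per sequence, one flag per vocabulary token.
        return [[False] * self.vocab_size for _ in range(max_num_seqs)]

    def compile_grammar(self, request_type: str, grammar_spec: str) -> ToyGrammar:
        # Assumed toy spec format: comma-separated token IDs, e.g. "1,4,7".
        if request_type != "choice":
            raise ValueError(f"unsupported request type: {request_type}")
        return ToyGrammar({int(t) for t in grammar_spec.split(",")})

    def destroy(self) -> None:
        # Nothing to release in this sketch.
        pass


backend = ToyBackend(vocab_size=8)
mask = backend.allocate_token_bitmask(max_num_seqs=2)
grammar = backend.compile_grammar("choice", "1,4,7")
backend.destroy()
```

The split mirrors the two levels described above: the backend is created once per engine and hands out one grammar object per request.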
StructuredOutputGrammar
Bases: ABC
Request-level backend for structured output requests.
Methods:
- accept_tokens – Determines whether the provided tokens are accepted for the given request.
- fill_bitmask – Fills the bitmask for a specific batch index.
- is_terminated – Checks whether the structured output process has terminated.
- reset – Resets the state of the structured output grammar.
- rollback – Rolls back the state of the grammar by a specified number of tokens.
- validate_tokens – Validates the provided tokens against the grammar.
accept_tokens(request_id, tokens) abstractmethod
Determines whether the provided tokens are accepted for the given request.
Parameters:
- request_id (str) – The unique identifier for the request.
- tokens (list[int]) – A list of token IDs to evaluate.
Returns:
- bool – True if the tokens are accepted, False otherwise.
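As a rough illustration of the accept/reject contract, the sketch below uses a hypothetical `FixedSequenceGrammar` that permits exactly one token sequence. Leaving the state unchanged on rejection is a design choice of this sketch; the source does not say how real backends behave after a rejected call.

```python
class FixedSequenceGrammar:
    """Hypothetical grammar that only permits one exact token sequence."""

    def __init__(self, expected: list[int]) -> None:
        self.expected = expected
        self.position = 0  # number of tokens consumed so far

    def accept_tokens(self, request_id: str, tokens: list[int]) -> bool:
        # Compare the whole batch before committing, so a rejected call
        # leaves the position untouched.
        end = self.position + len(tokens)
        if self.expected[self.position:end] != tokens:
            return False
        self.position = end
        return True

    def is_terminated(self) -> bool:
        # Terminated once every expected token has been consumed.
        return self.position == len(self.expected)


grammar = FixedSequenceGrammar([5, 9, 2])
first = grammar.accept_tokens("req-0", [5, 9])   # matches the first two tokens
second = grammar.accept_tokens("req-0", [7])     # 7 is not the next expected token
third = grammar.accept_tokens("req-0", [2])      # completes the sequence
```

In this sketch `request_id` is unused; it is kept only so the signature matches the one documented above.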
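fill_bitmask(bitmask, batch_index) abstractmethod
Fills the bitmask for a specific batch index.
Parameters:
- bitmask (torch.Tensor) – The bitmask to fill.
- batch_index (int) – The index in the bitmask to fill.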
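To make "filling the bitmask for a batch index" concrete, here is a sketch that assumes a packed layout with 32 token flags per integer word. The layout, the helper names `fill_row` and `is_allowed`, and the use of nested lists are assumptions for illustration; the source does not specify how the bitmask is stored.

```python
def fill_row(bitmask: list[list[int]], batch_index: int, allowed: set[int]) -> None:
    """Set bit t in the row at batch_index for each allowed token ID t."""
    row = bitmask[batch_index]
    for i in range(len(row)):
        row[i] = 0  # clear stale bits left over from a previous step
    for token_id in allowed:
        word, bit = divmod(token_id, 32)
        row[word] |= 1 << bit


def is_allowed(bitmask: list[list[int]], batch_index: int, token_id: int) -> bool:
    """Read back a single token flag from the packed row."""
    word, bit = divmod(token_id, 32)
    return bool((bitmask[batch_index][word] >> bit) & 1)


vocab_size, max_num_seqs = 70, 2
words_per_row = (vocab_size + 31) // 32  # ceil(70 / 32) = 3 words per row
bitmask = [[0] * words_per_row for _ in range(max_num_seqs)]

# Only the row for batch index 1 is written; row 0 belongs to another request.
fill_row(bitmask, 1, {0, 33, 69})
```

Each request's grammar writes only its own row, which is why the method takes a batch index rather than returning a new mask.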
is_terminated() abstractmethod
Checks whether the structured output process has terminated.
Returns:
- bool – True if the process is terminated, False otherwise.
reset() abstractmethod
Resets the state of the structured output grammar.
rollback(num_tokens) abstractmethod
Rolls back the state of the grammar by a specified number of tokens. Also reverts the counters for the number of processed tokens.
Parameters:
- num_tokens (int) – The number of tokens to roll back.
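One way to support rollback is to record the state before each accepted token and pop it back off. The sketch below assumes a hypothetical `BracketGrammar` (token 0 opens a bracket, token 1 closes one) and a hypothetical `num_processed_tokens` counter; it shows both the grammar state and the counter being reverted, as the description above requires.

```python
class BracketGrammar:
    """Hypothetical grammar over brackets: token 0 opens, token 1 closes."""

    def __init__(self) -> None:
        self.depth = 0
        self.num_processed_tokens = 0
        self._depth_history: list[int] = []  # depth before each accepted token

    def accept_token(self, token: int) -> bool:
        if token not in (0, 1):
            return False
        new_depth = self.depth + (1 if token == 0 else -1)
        if new_depth < 0:
            return False  # a close without a matching open is rejected
        self._depth_history.append(self.depth)
        self.depth = new_depth
        self.num_processed_tokens += 1
        return True

    def rollback(self, num_tokens: int) -> None:
        if num_tokens > self.num_processed_tokens:
            raise ValueError("cannot roll back more tokens than were processed")
        for _ in range(num_tokens):
            self.depth = self._depth_history.pop()
        self.num_processed_tokens -= num_tokens  # revert the counter as well


grammar = BracketGrammar()
for token in [0, 0, 0]:
    grammar.accept_token(token)
depth_before = grammar.depth           # three opens have been accepted
grammar.rollback(2)                    # undo the last two of them
```

A history stack costs memory proportional to the number of accepted tokens; a backend that can rewind its matcher natively would not need it.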
validate_tokens(tokens) abstractmethod
Validates the provided tokens against the grammar. Does not advance the FSM.
Parameters:
- tokens (list[int]) – A list of token IDs to validate.
Returns:
- list[int] – A list of accepted token IDs. This is a prefix of the input tokens, and is empty if none are accepted.
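The prefix semantics can be sketched as a standalone function. This is an illustration under assumptions, not the library's code: the grammar is reduced to a hypothetical fixed `expected` sequence plus a `position`, and the function works on a local cursor so the caller's state is never advanced.

```python
def validate_tokens(expected: list[int], position: int, tokens: list[int]) -> list[int]:
    """Return the longest prefix of tokens that matches expected from position on."""
    accepted: list[int] = []
    cursor = position  # local copy; the caller's position is never modified
    for token in tokens:
        if cursor >= len(expected) or expected[cursor] != token:
            break  # stop at the first token the grammar would reject
        accepted.append(token)
        cursor += 1
    return accepted


expected = [5, 9, 2]
partial = validate_tokens(expected, 0, [5, 9, 7])  # 7 breaks the match
nothing = validate_tokens(expected, 0, [7, 5])     # first token already fails
full = validate_tokens(expected, 1, [9, 2])        # starts mid-sequence
```

Because nothing is mutated, the same token list can be validated first and then passed to an accepting call without any rollback in between.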