vllm_omni.reasoning.step_audio_reasoning_parser
StepAudioReasoningParser
Bases: ReasoningParser
Reasoning parser for Step-Audio models.
Step-Audio supports two representations of thinking markers:
- Special tokens: `<|THINK_START|>` and `<|THINK_END|>` (single-token IDs, e.g. 151669 and 151670).
- Text markers: plain-text start and end think markers (multi-token sequences, e.g.
think_end_token_id instance-attribute
think_end_token_id: int = (
think_end_special_id
if think_end_special_id != -1
else think_end_text_id
)
think_start_special_id instance-attribute
think_start_special_id: int = get(THINK_START_SPECIAL, -1)
think_start_token_id instance-attribute
think_start_token_id: int = (
think_start_special_id
if think_start_special_id != -1
else think_start_text_id
)
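The fallback used by `think_start_token_id` and `think_end_token_id` can be illustrated with a small standalone sketch. The `vocab` dictionaries, the helper name `resolve_marker_id`, and the text-marker ID `27` are hypothetical and chosen only for illustration; the `-1` sentinel and the rule "prefer the special-token ID, otherwise use the text-marker ID" are taken from the attribute definitions above.

```python
def resolve_marker_id(vocab: dict[str, int], special_token: str, text_id: int) -> int:
    """Prefer the special-token ID; fall back to the text-marker ID if absent."""
    # -1 is the sentinel for "special token not present in the vocabulary".
    special_id = vocab.get(special_token, -1)
    return special_id if special_id != -1 else text_id


# Hypothetical vocabularies, for illustration only.
vocab_with_special = {"<|THINK_START|>": 151669, "<|THINK_END|>": 151670}
vocab_without_special: dict[str, int] = {}

start_id = resolve_marker_id(vocab_with_special, "<|THINK_START|>", text_id=27)
fallback_id = resolve_marker_id(vocab_without_special, "<|THINK_START|>", text_id=27)
```

With the first vocabulary the special-token ID is selected; with the second, the lookup returns the sentinel and the text-marker ID is used instead.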
count_reasoning_tokens
Count tokens within thinking spans.
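One way to picture this counting is a single pass over the token IDs with an "inside a span" flag. This is a minimal sketch, not the parser's actual code: the function name is hypothetical, it assumes single-token start and end markers, it does not count the markers themselves, and it counts tokens in a span that is still open at the end of the sequence. The real method may treat these cases differently.

```python
from collections.abc import Sequence


def count_tokens_in_think_spans(
    token_ids: Sequence[int], start_id: int, end_id: int
) -> int:
    """Count tokens that lie between a start marker and the next end marker."""
    count = 0
    inside = False
    for tok in token_ids:
        if tok == start_id:
            inside = True  # entering a thinking span
        elif tok == end_id:
            inside = False  # leaving a thinking span
        elif inside:
            count += 1  # ordinary token inside a span
    return count


# Two spans: [5, 6] is closed, [8] is still open at the end.
example_ids = [1, 151669, 5, 6, 151670, 7, 151669, 8]
example_count = count_tokens_in_think_spans(example_ids, 151669, 151670)
```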
extract_reasoning
extract_reasoning(
model_output: str,
request: ChatCompletionRequest | ResponsesRequest,
) -> tuple[str | None, str | None]
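The return type suggests a `(reasoning, content)` pair in which either element may be `None`. The sketch below shows how such a split could work on the special-token markers alone. The function name `split_reasoning` is hypothetical, text markers are not handled, and treating output without an end marker as pure reasoning is an assumption rather than documented behavior.

```python
from __future__ import annotations

THINK_START = "<|THINK_START|>"
THINK_END = "<|THINK_END|>"


def split_reasoning(model_output: str) -> tuple[str | None, str | None]:
    """Split model output into (reasoning, content) around the think markers."""
    # Drop everything up to and including an optional start marker.
    _, start, after_start = model_output.partition(THINK_START)
    body = after_start if start else model_output

    reasoning, end, content = body.partition(THINK_END)
    if not end:
        # Assumption: no end marker means the model is still reasoning.
        return (body or None, None)
    # Empty strings are reported as None.
    return (reasoning or None, content or None)
```

For example, `split_reasoning("<|THINK_START|>plan<|THINK_END|>answer")` yields `("plan", "answer")`, and an output that starts directly with reasoning text and only contains the end marker is split the same way.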
extract_reasoning_streaming
extract_reasoning_streaming(
previous_text: str,
current_text: str,
delta_text: str,
previous_token_ids: Sequence[int],
current_token_ids: Sequence[int],
delta_token_ids: Sequence[int],
) -> DeltaMessage | None
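In streaming mode each delta has to be routed to either the reasoning or the content field, depending on whether the end marker has already been seen. The following is a simplified, text-only sketch of that routing. `Delta` is a hypothetical stand-in for `DeltaMessage`, `stream_step` is a hypothetical name, and the sketch assumes each marker arrives whole within a single delta (as a single special token would) and ignores multi-turn prompts.

```python
from __future__ import annotations

from dataclasses import dataclass

THINK_START = "<|THINK_START|>"
THINK_END = "<|THINK_END|>"


@dataclass
class Delta:
    """Hypothetical stand-in for DeltaMessage."""

    reasoning: str | None = None
    content: str | None = None


def stream_step(previous_text: str, delta_text: str) -> Delta | None:
    """Route one streamed delta to reasoning or content."""
    if THINK_END in previous_text:
        # Reasoning already ended: everything new is content.
        return Delta(content=delta_text) if delta_text else None

    if THINK_END in delta_text:
        # The end marker arrives in this delta: split around it.
        reasoning, _, content = delta_text.partition(THINK_END)
        reasoning = reasoning.replace(THINK_START, "")
        if not reasoning and not content:
            return None  # the delta carried only markers
        return Delta(reasoning=reasoning or None, content=content or None)

    # Still inside the think block: strip a start marker if present.
    text = delta_text.replace(THINK_START, "")
    return Delta(reasoning=text) if text else None
```

Returning `None` for marker-only deltas mirrors the `DeltaMessage | None` return type: such a delta carries nothing to show the client.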
is_reasoning_end
Check if reasoning has ended in the given token sequence.
When the serving layer calls this method with prompt token IDs, the prompt may contain think markers from previous assistant turns. In a multi-turn conversation the prompt can include both start and end markers, and its last marker can be a start marker added by the generation prompt. In that case reasoning has NOT ended: the model is about to generate inside a new think block.
We therefore find the last think marker (start or end) in the decoded text and only return True if it is an end marker.
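The "last marker wins" rule can be sketched on already-decoded text as follows. This is an illustration under stated assumptions: the function name `last_marker_is_end` is hypothetical, only the special-token markers are checked, and the step that decodes token IDs into text is omitted.

```python
THINK_START = "<|THINK_START|>"
THINK_END = "<|THINK_END|>"


def last_marker_is_end(decoded_text: str) -> bool:
    """Return True only if the last think marker in the text is an end marker."""
    # str.rfind returns -1 when the marker does not occur at all.
    last_start = decoded_text.rfind(THINK_START)
    last_end = decoded_text.rfind(THINK_END)
    return last_end != -1 and last_end > last_start


# Single turn: the think block is closed, so reasoning has ended.
single_turn = "<|THINK_START|>plan<|THINK_END|>answer"
# Multi-turn prompt: an earlier block is closed, but the generation
# prompt opens a new one, so reasoning has not ended.
multi_turn = "<|THINK_START|>plan<|THINK_END|>answer<|THINK_START|>"
```

Checking only for the presence of an end marker would wrongly report the multi-turn prompt as finished; comparing the positions of the last start and last end markers avoids that.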