vllm_omni.reasoning.step_audio_reasoning_parser ¶

logger `module-attribute` ¶

logger = init_logger(__name__)

StepAudioReasoningParser ¶

Bases: ReasoningParser

Reasoning parser for Step-Audio models.

Step-Audio supports two representations of thinking markers:

Special tokens: <|THINK_START|> and <|THINK_END|> (single-token IDs, e.g. 151669 and 151670).
Text markers: `<think>` and `</think>` (multi-token sequences).

THINK_END_SPECIAL `class-attribute` `instance-attribute` ¶

THINK_END_SPECIAL = '<|THINK_END|>'

THINK_END_TEXT `class-attribute` `instance-attribute` ¶

THINK_END_TEXT = '</think>'

THINK_START_SPECIAL `class-attribute` `instance-attribute` ¶

THINK_START_SPECIAL = '<|THINK_START|>'

THINK_START_TEXT `class-attribute` `instance-attribute` ¶

THINK_START_TEXT = '<think>'

think_end_special_id `instance-attribute` ¶

think_end_special_id: int = self.vocab.get(
    self.THINK_END_SPECIAL, -1
)

think_end_text_id `instance-attribute` ¶

think_end_text_id: int = self.vocab.get(
    self.THINK_END_TEXT, -1
)

think_end_token `instance-attribute` ¶

think_end_token = self.THINK_END_TEXT

think_end_token_id `instance-attribute` ¶

think_end_token_id: int = (
    self.think_end_special_id
    if self.think_end_special_id != -1
    else self.think_end_text_id
)

think_start_special_id `instance-attribute` ¶

think_start_special_id: int = self.vocab.get(
    self.THINK_START_SPECIAL, -1
)

think_start_text_id `instance-attribute` ¶

think_start_text_id: int = self.vocab.get(
    self.THINK_START_TEXT, -1
)

think_start_token `instance-attribute` ¶

think_start_token = self.THINK_START_TEXT

think_start_token_id `instance-attribute` ¶

think_start_token_id: int = (
    self.think_start_special_id
    if self.think_start_special_id != -1
    else self.think_start_text_id
)
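The ID attributes above follow one pattern: look up the special token first, and fall back to the text marker when the vocabulary lacks it (`-1` signals "not found"). A minimal sketch of that fallback, using a hypothetical `resolve_marker_id` function and toy vocabularies that are assumptions for illustration:

```python
def resolve_marker_id(vocab: dict[str, int], special: str, text: str) -> int:
    """Prefer the special-token ID, fall back to the text-marker ID.

    Returns -1 when neither marker is present in the vocabulary.
    """
    special_id = vocab.get(special, -1)
    if special_id != -1:
        return special_id
    return vocab.get(text, -1)


# Toy vocabularies for illustration only; the text-marker ID 42 is made up.
vocab_with_special = {"<|THINK_START|>": 151669, "<think>": 42}
vocab_text_only = {"<think>": 42}
vocab_empty: dict[str, int] = {}

id_special = resolve_marker_id(vocab_with_special, "<|THINK_START|>", "<think>")
id_text = resolve_marker_id(vocab_text_only, "<|THINK_START|>", "<think>")
id_missing = resolve_marker_id(vocab_empty, "<|THINK_START|>", "<think>")
```

Note that the resolved ID can still be `-1`, so callers comparing token IDs against it should account for the "no marker in vocabulary" case.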

count_reasoning_tokens ¶

count_reasoning_tokens(token_ids: Sequence[int]) -> int

Count tokens within thinking spans.
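One way to picture this count is a single pass with an "inside a span" flag. The sketch below is not the parser's implementation: the function name, the explicit `start_id`/`end_id` parameters, the choice to exclude the marker tokens themselves, and the choice to count tokens in an unclosed span are all assumptions.

```python
from collections.abc import Sequence


def count_tokens_in_think_spans(
    token_ids: Sequence[int], start_id: int, end_id: int
) -> int:
    """Count tokens strictly between start and end marker IDs."""
    count = 0
    inside = False
    for tok in token_ids:
        if tok == start_id:
            inside = True
        elif tok == end_id:
            inside = False
        elif inside:
            # Marker tokens are not counted (assumption); everything
            # else inside a span is, including an unclosed final span.
            count += 1
    return count


# 100 and 101 stand in for the start and end marker IDs.
n_tokens = count_tokens_in_think_spans(
    [1, 100, 5, 6, 101, 7, 100, 8], start_id=100, end_id=101
)
```

In this example tokens 5 and 6 fall in the closed span and token 8 in the trailing open span, so `n_tokens` is 3.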

extract_content_ids ¶

extract_content_ids(input_ids: list[int]) -> list[int]

extract_reasoning ¶

extract_reasoning(
    model_output: str,
    request: ChatCompletionRequest | ResponsesRequest,
) -> tuple[str | None, str | None]

extract_reasoning_streaming ¶

extract_reasoning_streaming(
    previous_text: str,
    current_text: str,
    delta_text: str,
    previous_token_ids: Sequence[int],
    current_token_ids: Sequence[int],
    delta_token_ids: Sequence[int],
) -> DeltaMessage | None

is_reasoning_end ¶

is_reasoning_end(input_ids: Sequence[int]) -> bool

Check if reasoning has ended in the given token sequence.

When called with prompt token IDs (by the serving layer), the prompt may contain think markers from previous assistant turns. In multi-turn conversations the prompt can include both start and end markers, with the last marker being a start marker (from the generation prompt). In that case reasoning is NOT ended — the model is about to generate inside a new think block.

We therefore find the last think marker (start or end) in the decoded text and only return True if it is an end marker.
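The "last marker wins" rule can be sketched on decoded text as follows. This is an illustration under assumptions: the function name `reasoning_ended` is hypothetical, decoding from token IDs is omitted, and the real method may handle edge cases differently.

```python
START_MARKERS = ("<|THINK_START|>", "<think>")
END_MARKERS = ("<|THINK_END|>", "</think>")


def reasoning_ended(decoded_text: str) -> bool:
    """Return True only if the last think marker in the text is an end marker."""
    # str.rfind returns -1 when a marker is absent, so text with no
    # markers compares -1 > -1 and yields False.
    last_start = max(decoded_text.rfind(m) for m in START_MARKERS)
    last_end = max(decoded_text.rfind(m) for m in END_MARKERS)
    return last_end > last_start


# Multi-turn prompt: an earlier turn closed its think block, but the
# generation prompt opens a new one, so reasoning has not ended.
multi_turn = reasoning_ended("<think>a</think>answer<think>")
closed = reasoning_ended("<think>a</think>answer")
no_markers = reasoning_ended("plain text")
```

Checking only for the presence of an end marker would wrongly report the multi-turn prompt as finished, which is why the position of the last marker is compared instead.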

is_reasoning_end_streaming ¶

is_reasoning_end_streaming(
    input_ids: Sequence[int], delta_ids: Iterable[int]
) -> bool
