docs: add note on filtering prompt-elicited inline tags (e.g. <thinking>) before TTS by scttbnsn · Pull Request #976 · pipecat-ai/docs

scttbnsn · 2026-07-02T23:52:54Z

Summary

Follow-up to pipecat-ai/pipecat#4901. When a system prompt asks the LLM to reason inside inline <thinking>...</thinking> tags with extended thinking off, the reasoning streams back as plain text and gets spoken by TTS. @markbackman's investigation there concluded this belongs in docs rather than provider code: prefer native extended thinking, and strip deliberately-elicited inline tags at the text layer. This adds that note.

What changed

New "Removing Custom Inline Tags" section in pipecat/learn/text-to-speech.mdx with the PatternPairAggregator + MatchAction.REMOVE snippet from the issue, plus a Tip pointing at native extended thinking as the preferred path when the goal is genuine reasoning.
A short bullet in the Notes of api-reference/server/services/llm/anthropic.mdx cross-linking the new section, since that's where someone debugging spoken thinking text with Anthropic looks first.

Verification

Snippet imports and behavior checked against pipecat main: LLMTextProcessor(text_aggregator=...), add_pattern(..., action=MatchAction.REMOVE), and LLMThoughtTextFrame routing in AnthropicLLMService.
npx prettier clean on both files.

- 📝 docs(learn): add 'Removing Custom Inline Tags' section to text-to-speech page with PatternPairAggregator + MatchAction.REMOVE snippet and a Tip preferring native extended thinking - 📝 docs(api-reference): cross-link the new section from the Anthropic service Notes Addresses pipecat-ai/pipecat#4901

markbackman · 2026-07-03T00:23:00Z


 - **Prompt caching**: When `enable_prompt_caching` is enabled, Anthropic caches repeated context to reduce costs. Cache control markers are automatically added to the most recent user messages. This is most effective for conversations with large system prompts or long conversation histories.
 - **Extended thinking**: Enabling thinking increases response quality for complex tasks but adds latency. When `type="enabled"`, you must provide a `budget_tokens` value (minimum 1024 with current models). Extended thinking is disabled by default.
+- **Prompt-elicited `<thinking>` tags**: If your system prompt asks the model to reason inside inline tags rather than enabling extended thinking, that reasoning is ordinary text and will be spoken by TTS. Prefer the `thinking` parameter; for inline tags you deliberately keep, see [Removing Custom Inline Tags](/pipecat/learn/text-to-speech#removing-custom-inline-tags).


This note makes sense.

Rather than adding a new subsection to the learning guides, it might make sense to just point the developer directly to the PatternPairAggregator. In the PatternPairAggregator, we can add a new, generic section about removing tags, which we can link to. Something like this would do the trick:

### Removing Tagged Content To drop content from the text stream entirely, register a pattern with `MatchAction.REMOVE`. The tags and everything between them are removed before reaching downstream processors — nothing is spoken by TTS and nothing lands in the conversation context. This is useful when your prompt elicits inline tags whose content is not meant for the user, such as reasoning tags (e.g., `<thinking>...</thinking>`) or annotations intended for other processors: ```python from pipecat.processors.aggregators.llm_text_processor import LLMTextProcessor from pipecat.utils.text.pattern_pair_aggregator import MatchAction, PatternPairAggregator pattern_aggregator = PatternPairAggregator() pattern_aggregator.add_pattern( type="thinking", start_pattern="<thinking>", end_pattern="</thinking>", action=MatchAction.REMOVE, ) # Set the aggregator on an LLMTextProcessor llm_text_processor = LLMTextProcessor(text_aggregator=pattern_aggregator) # add the llm_text_processor to your pipeline after the llm and before the tts # llm -> llm_text_processor -> tts

Because this filters the text stream itself, it works with any LLM provider and any custom inline tag.

Done in b117685. Used your text as a new "Removing Tagged Content" example on the PatternPairAggregator page, dropped the learn-guide section, and pointed the anthropic note at the new anchor.

markbackman · 2026-07-03T00:23:13Z

 # llm -> llm_text_processor -> tts
 ```

+### Removing Custom Inline Tags


From the other comment, I think we'll want to remove this.

- 🗑️ remove(learn): drop the new text-to-speech section per review - 📝 docs(api-reference): add 'Removing Tagged Content' usage example to pattern-pair-aggregator - 🔄 refactor(api-reference): point the Anthropic note at the new anchor Review feedback from pipecat-ai#976

markbackman

LGTM! Thanks for taking care of this 🙇

scttbnsn mentioned this pull request Jul 2, 2026

AnthropicLLMService does not route inline <thinking> text to LLMThoughtTextFrame (leaks to TTS) pipecat-ai/pipecat#4901

Closed

markbackman reviewed Jul 3, 2026

View reviewed changes

markbackman approved these changes Jul 3, 2026

View reviewed changes

markbackman merged commit b4cb081 into pipecat-ai:main Jul 3, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs: add note on filtering prompt-elicited inline tags (e.g. <thinking>) before TTS#976

docs: add note on filtering prompt-elicited inline tags (e.g. <thinking>) before TTS#976
markbackman merged 2 commits into
pipecat-ai:mainfrom
scttbnsn:docs/filter-custom-inline-tags

scttbnsn commented Jul 2, 2026

Uh oh!

markbackman Jul 3, 2026

Uh oh!

scttbnsn Jul 3, 2026

Uh oh!

markbackman Jul 3, 2026

Uh oh!

scttbnsn Jul 3, 2026

Uh oh!

markbackman left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

scttbnsn commented Jul 2, 2026

Summary

What changed

Verification

Uh oh!

markbackman Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

scttbnsn Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

markbackman Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

scttbnsn Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

markbackman left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants