Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions api-reference/server/services/llm/anthropic.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -201,6 +201,7 @@ await worker.queue_frame(

- **Prompt caching**: When `enable_prompt_caching` is enabled, Anthropic caches repeated context to reduce costs. Cache control markers are automatically added to the most recent user messages. This is most effective for conversations with large system prompts or long conversation histories.
- **Extended thinking**: Enabling thinking increases response quality for complex tasks but adds latency. When `type="enabled"`, you must provide a `budget_tokens` value (minimum 1024 with current models). Extended thinking is disabled by default.
- **Prompt-elicited `<thinking>` tags**: If your system prompt asks the model to reason inside inline tags rather than enabling extended thinking, that reasoning is ordinary text and will be spoken by TTS. Prefer the `thinking` parameter; for inline tags you deliberately keep, see [Removing Tagged Content](/api-reference/server/utilities/text/pattern-pair-aggregator#removing-tagged-content).
- **Custom clients**: You can pass custom Anthropic client instances (e.g., `AsyncAnthropicBedrock` or `AsyncAnthropicVertex`) via the `client` parameter to use Anthropic models through other cloud providers.
- **Retry behavior**: When `retry_on_timeout=True`, the first attempt uses the `retry_timeout_secs` timeout. If it times out, a second attempt is made with no timeout limit.
- **System instruction precedence**: If both `system_instruction` (from the constructor) and a system message in the context are set, the constructor's `system_instruction` takes precedence and a warning is logged.
Expand Down
25 changes: 25 additions & 0 deletions api-reference/server/utilities/text/pattern-pair-aggregator.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,31 @@ When a pattern is matched, the handler function receives a `PatternMatch` object

## Usage Examples

### Removing Tagged Content

To drop content from the text stream entirely, register a pattern with `MatchAction.REMOVE`. The tags and everything between them are removed before reaching downstream processors — nothing is spoken by TTS and nothing lands in the conversation context. This is useful when your prompt elicits inline tags whose content is not meant for the user, such as reasoning tags (e.g., `<thinking>...</thinking>`) or annotations intended for other processors:

```python
from pipecat.processors.aggregators.llm_text_processor import LLMTextProcessor
from pipecat.utils.text.pattern_pair_aggregator import MatchAction, PatternPairAggregator

pattern_aggregator = PatternPairAggregator()
pattern_aggregator.add_pattern(
type="thinking",
start_pattern="<thinking>",
end_pattern="</thinking>",
action=MatchAction.REMOVE,
)

# Set the aggregator on an LLMTextProcessor
llm_text_processor = LLMTextProcessor(text_aggregator=pattern_aggregator)

# add the llm_text_processor to your pipeline after the llm and before the tts
# llm -> llm_text_processor -> tts
```

Because this filters the text stream itself, it works with any LLM provider and any custom inline tag.

### Voice Switching in TTS

This example demonstrates finding custom `<voice>` tags in streaming text to switch voices dynamically in a TTS service like Cartesia. It removes the tags and the content between them, such that the content is treated as if it does not exist. It will not be spoken by the TTS, it will not be added to the context, and it will not be sent to clients via RTVI. Instead, it simply triggers a voice switch side effect.
Expand Down
Loading