Skip to content

Added information about guardrails#6

Open
cableman wants to merge 6 commits into
mainfrom
feature/guardrails
Open

Added information about guardrails#6
cableman wants to merge 6 commits into
mainfrom
feature/guardrails

Conversation

@cableman

Copy link
Copy Markdown
Contributor
  • Added new doc about the message trim guardrail
  • Added support for mermaid diagrams in this docs site

@cableman cableman requested a review from SigneA-hm May 11, 2026 08:15
@cableman cableman force-pushed the feature/guardrails branch from 9a5d7bb to 0ee7642 Compare May 11, 2026 08:18
@lilosti lilosti requested a review from lasseborly May 11, 2026 08:23
@hypesystem

hypesystem commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Added a note related to this here os2ai/helm-deployments#28 (comment)

I'll review this with my feedback in mind.

@hypesystem hypesystem self-requested a review June 3, 2026 08:56
@lilosti

lilosti commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

@SigneA-hm this should be reviewed and merged

@lasseborly lasseborly removed their request for review June 8, 2026 08:02
@lasseborly

Copy link
Copy Markdown

I'll withdraw from this review and let's @hypesystem take the wheel.

@hypesystem hypesystem left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like the documentation overall, it gives me a good idea of the general approach.

I've added some notes, mostly where I think some more abstract/broader context is warranted, so readers are more likely to understand not just what we are doing, but why 😄

Comment thread technical/guardrails.md
Comment thread technical/guardrails.md
Comment thread technical/guardrails.md
- `_repair_tool_call_pairings` — strip orphan `role: tool` messages and orphan `tool_calls` entries that the trimmer
may have created.
- (Optional, opt-in via `pop_trailing_tool_messages`) pop trailing `role: tool` messages and re-run the repair, then
append `"Please continue"` if the new terminus is an assistant message.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Please continue" feels like it could skew the output, especially if the language in the context window otherwise isn't English?

Comment thread technical/guardrails.md

The message trimming guardrail can be configured in the litellm
values [file](https://github.com/os2ai/helm-deployments/blob/develop/applications/litellm/litellm-values.yaml#L108)
configuration file in the helm chart.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note on why we need message trimming would make sense after this paragraph, just very briefly. E.g. what happens if oversized message histories are not trimmed, and how does the guardrail avoid it? In a sentence of two.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment about this to the doc

Comment thread technical/guardrails.md Outdated
Comment on lines +89 to +102
### Why the trailing-tool pop is opt-in

The "normal" agent-loop shape ends on a `role: tool` message:

```mermaid
flowchart LR
U[User] --> A["Assistant{tool_calls}"]
A --> T["Tool{result}"]
T --> C([model is asked to continue here])
```

Most providers (OpenAI, Anthropic, Google, Mistral via the official APIs) __accept__ this shape — that's how tool
calling works. Popping the tool message and substituting `"Please continue"` deprives the model of the result it was
supposed to reason from, so the default is __off__.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this section. Specifically Popping the tool message and substituting "Please continue" deprives the model of the result it was supposed to reason from, so the default is __off__. doesn't really tell me what happens in the cases where the setting is enabled vs disabled, and what exactly the default behavior is.

What is the effect of depriving the model of the result it was supposed to reason from? (Am I understanding it correctly that this refers to "the result of the tool call", and if so, could we call it that?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to explain it better

Comment thread technical/guardrails.md

## How Message Trimming works

`async_pre_call_hook` runs on every chat completion request. The flow:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before the step-by-step flow, I think a sentence stating what the pros and cons of our approach is would make sense.

E.g. "Sending a too large message to the model can be fatal for the entire conversation, so we take a conservative approach in estimating a safe completion budget for the message" and then explain what a safe completion budget is, why we calculate it as is? I think that would give a lot of good context for evaluating the approach.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intro section added

@hypesystem hypesystem left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in today's meeting, here's a suggested text explaining the context between Open WebUI and LiteLLM and how guardrails are attached.

Comment thread technical/guardrails.md
@cableman cableman force-pushed the feature/guardrails branch from 0ee7642 to d10dc26 Compare June 16, 2026 06:27
@cableman cableman force-pushed the feature/guardrails branch from d10dc26 to ea8c468 Compare June 16, 2026 06:29
@cableman cableman force-pushed the feature/guardrails branch from 0df92e6 to ad9bb8d Compare June 16, 2026 06:33
@cableman cableman requested a review from hypesystem June 16, 2026 07:35
@cableman

Copy link
Copy Markdown
Contributor Author

Tried to answer the questions as good as I can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants