
Python: Bambriz context provider #5139

Draft
bambriz wants to merge 2 commits into microsoft:main from bambriz:bambriz-context-provider

Conversation

bambriz commented Apr 7, 2026

Motivation and Context

This PR adds AzureCosmosContextProvider to the agent-framework-azure-cosmos package.

The goal is to enable Azure Cosmos DB-backed retrieval for Agent Framework sessions, so an agent can pull relevant context before a run and write conversation data back after the run. This brings the Cosmos package beyond history persistence and adds a retrieval-oriented context provider pattern similar to other context provider integrations in the repo.

This change supports scenarios where:

  • application data or knowledge is already stored in Azure Cosmos DB
  • developers want to use Cosmos DB as a retrieval source for agent context
  • retrieved context and conversation writeback should live in the same knowledge container
  • a single provider instance should be reusable across multiple agent runs, while still allowing optional per-run retrieval tuning

It also keeps the API intentionally simple by avoiding extra query-builder-style configuration and instead following the same general pattern used by other context provider packages.

Description

This PR introduces AzureCosmosContextProvider in the Azure Cosmos package.

The provider adds two main behaviors:

  • before_run(...): retrieves relevant context from an existing Cosmos container and injects it into the session
  • after_run(...): writes eligible input and response messages back into the configured Cosmos container after every run
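
As a rough illustration of this two-hook pattern, the sketch below uses an in-memory list in place of a Cosmos container. `InMemoryContextProvider`, its method signatures, and the keyword-overlap scoring are hypothetical stand-ins chosen for brevity; they are not the provider's actual API or retrieval logic.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryContextProvider:
    """Hypothetical stand-in: a list plays the role of the Cosmos container."""

    documents: list[str] = field(default_factory=list)
    top_k: int = 2

    def before_run(self, query: str) -> list[str]:
        # Toy scoring: count the words each stored document shares with the query.
        words = set(query.lower().split())
        scored = [(len(words & set(doc.lower().split())), doc) for doc in self.documents]
        ranked = sorted((item for item in scored if item[0] > 0), key=lambda item: -item[0])
        return [doc for _, doc in ranked[: self.top_k]]

    def after_run(self, input_messages: list[str], response_messages: list[str]) -> None:
        # Write both sides of the exchange back to the same store.
        # (Tagging of writeback documents is omitted in this sketch.)
        self.documents.extend(input_messages + response_messages)


provider = InMemoryContextProvider(documents=["Cosmos DB supports vector search"])
context = provider.before_run("How does vector search work?")
provider.after_run(["How does vector search work?"], ["It compares embeddings."])
```

The point of the sketch is only the call order: retrieval happens before the run, writeback after it, and both touch the same store.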

Key design choices in this implementation:

  • Retrieval input is built by joining the filtered user and assistant messages from the current run into a single query string.
  • The constructor is focused on durable provider configuration:
    • Cosmos connection/container settings
    • field mappings
    • default search mode
    • default top_k
    • default scan_limit
    • default partition_key
    • optional vector configuration for vector/hybrid retrieval
  • Optional per-run retrieval tuning is supported through before_run(...), including:
    • search_mode
    • weights
    • top_k
    • scan_limit
    • partition_key
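
The retrieval-input step from the first bullet could look roughly like the sketch below. The `Message` shape, the role names, and the newline separator are assumptions for illustration; this description does not spell out the actual filtering rules.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # hypothetical roles: "user", "assistant", "system", "tool"
    text: str


def build_query_text(messages: list[Message]) -> str:
    """Join non-empty user and assistant messages into one query string."""
    parts = [
        m.text.strip()
        for m in messages
        if m.role in {"user", "assistant"} and m.text.strip()
    ]
    return "\n".join(parts)


run_messages = [
    Message("system", "You are a helpful assistant."),
    Message("user", "What is our refund policy?"),
    Message("assistant", "  "),  # whitespace-only, so it is filtered out
    Message("user", "Specifically for annual plans."),
]
query_text = build_query_text(run_messages)
```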

This allows normal agent usage to stay simple while still supporting advanced scenarios where one provider instance is reused across multiple runs with different retrieval settings.

The provider supports:

  • full-text retrieval
  • vector retrieval
  • hybrid retrieval using RRF weights
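
For readers unfamiliar with weighted reciprocal rank fusion (RRF), here is a minimal sketch of the arithmetic. In the provider the fusion is presumably performed by Cosmos DB itself; the `weighted_rrf` helper and the smoothing constant `k = 60` (the value commonly used in the RRF literature) are assumptions for illustration only.

```python
def weighted_rrf(rankings: list[list[str]], weights: list[float], k: int = 60) -> list[str]:
    """Fuse ranked id lists: score(d) = sum_i weights[i] / (k + rank_i(d)), ranks start at 1."""
    scores: dict[str, float] = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


full_text_ranking = ["a", "b", "c"]
vector_ranking = ["c", "a", "d"]
# Weight the vector ranking twice as heavily as the full-text ranking.
fused = weighted_rrf([full_text_ranking, vector_ranking], weights=[1.0, 2.0])
```

A document that appears in only one ranking still receives a score, so raising one weight shifts the fused order toward that ranking without discarding the other.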

Writeback is always performed in after_run(...), so the knowledge container can accumulate conversation data over time. Those writeback documents are tagged with an internal document type and excluded from retrieval queries.
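
The tagging-and-exclusion idea can be sketched as follows. The field name `document_type`, the tag value, and the document shape are hypothetical; the real provider applies the exclusion in its Cosmos queries, whereas this sketch filters an in-memory list.

```python
import uuid

WRITEBACK_DOC_TYPE = "conversation_writeback"  # hypothetical tag value


def make_writeback_document(role: str, text: str, session_id: str) -> dict:
    """Build a document for a conversation message, tagged with the internal type."""
    return {
        "id": str(uuid.uuid4()),
        "session_id": session_id,
        "role": role,
        "content": text,
        "document_type": WRITEBACK_DOC_TYPE,
    }


def retrievable(documents: list[dict]) -> list[dict]:
    """Keep only documents that are not tagged as writeback."""
    return [d for d in documents if d.get("document_type") != WRITEBACK_DOC_TYPE]


container = [{"id": "kb-1", "content": "Refunds are issued within 14 days."}]
container.append(make_writeback_document("user", "What is the refund window?", "s1"))
candidates = retrievable(container)
```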

This PR also updates the Azure Cosmos package docs, samples, and tests to document and validate the new provider behavior.

Important usage guidance called out in the docs:

  • the application owner is responsible for configuring the Cosmos account, database, container, partitioning strategy, and any required full-text/vector/hybrid indexing policies
  • the provider does not create or manage Cosmos resources
  • constructor defaults define normal attached-agent behavior
  • callers can optionally override retrieval settings in before_run(...) for a specific run

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

moonbox3 added the documentation and python labels Apr 7, 2026
github-actions bot changed the title from "Bambriz context provider" to "Python: Bambriz context provider" Apr 7, 2026
DEFAULT_CONTEXT_PROMPT: ClassVar[str] = "Use the following context to answer the question:"
_DEFAULT_RESULT_LIMIT: ClassVar[int] = 5
_DEFAULT_SCAN_LIMIT: ClassVar[int] = 25
_DEFAULT_SEARCH_MODE: ClassVar[CosmosContextSearchMode] = CosmosContextSearchMode.FULL_TEXT
Author

The default search mode should be vector.

url_field_name: str | None = "url",
message_field_name: str | None = "message",
metadata_field_name: str | None = "metadata",
vector_field_name: str | None = None,
Author

This should have a default vector field name.

self._cosmos_client = CosmosClient(
    url=settings["endpoint"],  # type: ignore[arg-type]
    credential=credential or settings["key"].get_secret_value(),  # type: ignore[arg-type,union-attr]
    user_agent_suffix=AGENT_FRAMEWORK_USER_AGENT,
Author

Add something related to the Cosmos DB Python SDK here (to track which Cosmos SDK is being used).

scan_limit: int | None


class AzureCosmosContextProvider(BaseContextProvider):
Author

BaseContextProvider is now ContextProvider.

Labels

documentation, python

2 participants