Python client for otari-gateway. Communicate with any LLM provider through the gateway using a single, typed interface.
Generate an API token at otari.ai/organization-settings/api-tokens, then add a provider key (e.g. OpenAI) at otari.ai/organization-settings/provider-keys so the gateway can route requests to that provider. Then use the client:
from otari import OtariClient
client = OtariClient(
    platform_token="tk_your_api_token",
)
response = client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
That's it! With no api_base, the client defaults to the hosted gateway at https://api.otari.ai. Change the model string to switch between LLM providers through the gateway.
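Every example in this README writes the model as a provider prefix, a colon, and a model name (for instance openai:gpt-4o-mini). As a minimal sketch of that convention, assuming the first colon is the separator, the helper below splits and validates such a string before it is sent to the gateway. ModelRef and split_model_id are hypothetical names used for illustration; they are not part of the otari package.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical container for the two halves of a model string."""

    provider: str
    model: str


def split_model_id(model_id: str) -> ModelRef:
    """Split 'provider:model' at the first colon (an assumption of this sketch)."""
    provider, sep, model = model_id.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {model_id!r}")
    return ModelRef(provider, model)


if __name__ == "__main__":
    ref = split_model_id("openai:gpt-4o-mini")
    print(ref.provider, ref.model)
```

A check like this fails fast on a missing prefix instead of waiting for the gateway to reject the request.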
Prefer async? Use AsyncOtariClient, which exposes the same API with await (see Async usage).
Prefer to keep secrets out of code? Set OTARI_AI_TOKEN in your environment and OtariClient() picks up the token automatically.
Prefer to run the gateway yourself instead of using the hosted otari.ai? Follow the setup in the otari gateway repo, then point the SDK at it:
client = OtariClient(
    api_base="http://localhost:8000",  # or wherever you host the gateway
    api_key="your-gateway-api-key",
)
The SDK sends api_key in the custom Otari-Key: Bearer … header. The equivalent environment variables are GATEWAY_API_BASE and GATEWAY_API_KEY.
Make sure your gateway has provider keys configured (e.g. OpenAI) so it can route requests upstream — see the otari gateway repo for setup.
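This README describes two header conventions: a platform token travels in the standard Authorization header as a Bearer token, while a self-hosted API key travels in the custom Otari-Key header, also as a Bearer value. The sketch below only illustrates that difference. build_auth_headers is a hypothetical helper, not the SDK's internal code, and rejecting the case where both credentials are supplied is a choice made for this sketch rather than documented SDK behaviour.

```python
from typing import Dict, Optional


def build_auth_headers(
    platform_token: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Dict[str, str]:
    """Return the auth header for one of the two modes described in the README."""
    if platform_token and api_key:
        # Assumption of this sketch: treat mixed credentials as a caller error.
        raise ValueError("pass either platform_token or api_key, not both")
    if platform_token:
        # Platform mode: standard Authorization header.
        return {"Authorization": f"Bearer {platform_token}"}
    if api_key:
        # Self-hosted mode: custom Otari-Key header.
        return {"Otari-Key": f"Bearer {api_key}"}
    raise ValueError("no credentials provided")
```

This can be handy when debugging a self-hosted gateway with a plain HTTP tool, since it shows which header the gateway should expect in each mode.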
- Python 3.11 or newer
- A running otari-gateway instance
pip install otari
For the hosted gateway, set your platform token (no api_base needed; it defaults to https://api.otari.ai):
export OTARI_AI_TOKEN="tk_your_api_token"
GATEWAY_PLATFORM_TOKEN is kept as a legacy alias for OTARI_AI_TOKEN; the canonical name takes precedence when both are set.
For a self-hosted gateway, set the base URL and an API key instead:
export GATEWAY_API_BASE="http://localhost:8000"
export GATEWAY_API_KEY="your-key-here"
Alternatively, pass credentials directly when creating the client (see Usage examples).
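To make the environment-variable rules concrete, here is a small preflight sketch that mirrors them: OTARI_AI_TOKEN wins over the legacy GATEWAY_PLATFORM_TOKEN, and a self-hosted setup reads GATEWAY_API_BASE and GATEWAY_API_KEY. The function names are hypothetical and the code does not import the SDK. Treating an empty string as unset is an assumption of this sketch.

```python
import os
from typing import List, Mapping, Optional

SELF_HOSTED_VARS = ("GATEWAY_API_BASE", "GATEWAY_API_KEY")


def resolve_platform_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the platform token, preferring the canonical variable name."""
    env = os.environ if env is None else env
    # Canonical name first, then the legacy alias; empty values count as unset.
    return env.get("OTARI_AI_TOKEN") or env.get("GATEWAY_PLATFORM_TOKEN") or None


def missing_self_hosted_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the self-hosted variables that are not set."""
    env = os.environ if env is None else env
    return [name for name in SELF_HOSTED_VARS if not env.get(name)]


if __name__ == "__main__":
    token = resolve_platform_token()
    print("platform token set:", token is not None)
    print("missing self-hosted variables:", missing_self_hosted_vars())
```

Running this before creating a client gives a quick answer to "which mode is my shell configured for?" without making a network call.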
This Python SDK is a client for otari-gateway, an optional FastAPI-based proxy server that adds enterprise-grade features on top of the core library:
- Budget Management - Enforce spending limits with automatic daily, weekly, or monthly resets
- API Key Management - Issue, revoke, and monitor virtual API keys without exposing provider credentials
- Usage Analytics - Track every request with full token counts, costs, and metadata
- Multi-tenant Support - Manage access and budgets across users and teams
The gateway sits between your applications and LLM providers, exposing an OpenAI-compatible API that works with any supported provider.
docker run \
    -e GATEWAY_MASTER_KEY="your-secure-master-key" \
    -e OPENAI_API_KEY="your-api-key" \
    -p 8000:8000 \
    ghcr.io/mozilla-ai/otari/gateway:latest
Note: You can use a specific release version instead of latest (e.g., 1.2.0). See available versions.
Prefer a hosted experience? The otari platform provides a managed control plane for keys, usage tracking, and cost visibility across providers, while still building on the same otari interfaces.
Migrating from a previous version?
OtariClient is now synchronous: call its methods directly (no await). For asynchronous code, switch to AsyncOtariClient, which keeps the previous await-based API. See Async usage.
The client supports two authentication modes, matching the TypeScript SDK:
Platform mode uses a Bearer token in the standard Authorization header. On the hosted platform, generate an API token at otari.ai/organization-settings/api-tokens and add a provider key (e.g. OpenAI) at otari.ai/organization-settings/provider-keys so the gateway can route requests to that provider. With no api_base, the client defaults to the hosted gateway at https://api.otari.ai:
client = OtariClient(
    platform_token="tk_your_api_token",
)
Self-hosted mode sends the API key via a custom Otari-Key header. Because it targets your own gateway, an explicit api_base is required:
client = OtariClient(
    api_base="http://localhost:8000",
    api_key="your-api-key",
)
When no explicit credentials are provided, the client reads from environment variables:
# Platform mode: OTARI_AI_TOKEN (or legacy GATEWAY_PLATFORM_TOKEN),
# defaulting to the hosted gateway.
# Self-hosted: GATEWAY_API_BASE + GATEWAY_API_KEY.
client = OtariClient()
# Chat completion
response = client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
# Streaming: print each chunk's text as it arrives
stream = client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
# Responses API
response = client.response(
    model="openai:gpt-4o-mini",
    input="Summarize this in one sentence.",
)
print(response.output_text)
# Embeddings
result = client.embedding(
    model="openai:text-embedding-3-small",
    input="Hello world",
)
print(result.data[0].embedding)
# List available models
models = client.list_models()
for model in models:
    print(model.id)
Every method on OtariClient has an asynchronous counterpart on AsyncOtariClient. It accepts the same constructor arguments and exposes the same methods, but they are coroutines you await (and streams are async iterables):
import asyncio
from otari import AsyncOtariClient
async def main() -> None:
    async with AsyncOtariClient(platform_token="tk_your_api_token") as client:
        response = await client.completion(
            model="openai:gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)
        stream = await client.completion(
            model="openai:gpt-4o-mini",
            messages=[{"role": "user", "content": "Tell me a story."}],
            stream=True,
        )
        async for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                print(content, end="", flush=True)
asyncio.run(main())
In platform mode, HTTP errors are mapped to typed exceptions:
from otari import OtariClient, AuthenticationError, RateLimitError
client = OtariClient()  # reads credentials from the environment
try:
    response = client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except AuthenticationError as e:
    print(f"Invalid credentials: {e.message}")
except RateLimitError as e:
    print(f"Rate limited, retry after: {e.retry_after}")
| HTTP Status | Error Class | Description |
|---|---|---|
| 400 (capability) | UnsupportedCapabilityError | Selected provider does not support the requested capability |
| 401, 403 | AuthenticationError | Invalid or missing credentials |
| 402 | InsufficientFundsError | Budget or credits exhausted |
| 404 | ModelNotFoundError | Model not found, or no provider key configured for the requested provider. The exception's message carries the gateway's detail. |
| 429 | RateLimitError | Rate limit exceeded (includes retry_after) |
| 502 | UpstreamProviderError | Upstream provider unreachable |
| 504 | GatewayTimeoutError | Gateway timed out waiting for provider |
UnsupportedCapabilityError surfaces in both platform and non-platform modes; the other mappings are platform-mode only.
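Since RateLimitError carries retry_after, a caller can wait and try again instead of failing outright. The following is a minimal sketch of that pattern, not the SDK's own retry logic. call_with_retry and FakeRateLimitError are hypothetical names: the fake exception stands in for otari's RateLimitError so the example runs without the package. The sketch also assumes retry_after is a number of seconds, which this README does not state.

```python
import time
from typing import Callable, Optional, Type, TypeVar

T = TypeVar("T")


class FakeRateLimitError(Exception):
    """Stand-in for otari's RateLimitError (hypothetical, for this sketch only)."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Type[BaseException],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on retry_on and waiting retry_after between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Assumption: retry_after, when present, is a delay in seconds.
            delay = getattr(exc, "retry_after", None)
            sleep(default_delay if delay is None else float(delay))
    raise RuntimeError("unreachable")
```

With the real client this would be used along the lines of call_with_retry(lambda: client.completion(model="openai:gpt-4o-mini", messages=[...]), RateLimitError). Passing sleep as a parameter keeps the helper easy to test without real delays.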
The client supports a context manager for automatic cleanup:
with OtariClient(api_base="http://localhost:8000") as client:
    response = client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
AsyncOtariClient supports the async equivalent:
async with AsyncOtariClient(api_base="http://localhost:8000") as client:
    response = await client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
- Simple, unified interface - Single client for all providers through the gateway, switch models with just a string change
- Developer friendly - Full type hints for better IDE support and clear, actionable error messages
- Leverages the OpenAI SDK - Built on the official OpenAI Python SDK for maximum compatibility
- Sync and async - Use the synchronous OtariClient or the asynchronous AsyncOtariClient, both with the same typed interface
- Stays framework-agnostic so it can be used across different projects and use cases
- Battle-tested - Powers our own production tools (any-agent)
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate
# Install with dev dependencies
pip install -e ".[dev]"
# Run unit tests
pytest tests/
# Lint
ruff check src/ tests/
# Type-check
mypy src/
- Full Documentation - Complete guides and API reference
- Supported Providers - List of all supported LLM providers
- Gateway Documentation - Gateway setup and deployment
- TypeScript SDK - The TypeScript SDK for Node.js applications
- otari Platform (Beta) - Hosted control plane for key management, usage tracking, and cost visibility
We welcome contributions from developers of all skill levels! Please see the Contributing Guide or open an issue to discuss changes.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.