This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Version bumping is fully automated via GitHub Actions. On every push to main:
- Analyzes commits using Conventional Commits:
  - feat: → Minor bump (0.1.x → 0.2.0)
  - fix:, perf:, refactor: → Patch bump (0.1.3 → 0.1.4)
  - feat!:, BREAKING CHANGE: → Major bump (0.x.x → 1.0.0)
- Auto-updates:
  - Cargo.toml version
  - CHANGELOG.md with categorized changes
  - Creates git tag v0.1.4
- Triggers release pipeline → builds binaries, Docker images, publishes to crates.io
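The bump rules above can be sketched in Rust as follows. This is only an illustration of the mapping from commit type to version bump; the names `Bump`, `classify`, and `next_version` are hypothetical and are not part of the repository or of the GitHub Actions workflow, which may parse commits differently.

```rust
/// Kind of semantic-version bump implied by a commit message.
/// Variants are declared from weakest to strongest so that `max()` picks
/// the strongest bump in a batch of commits.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Bump {
    None,
    Patch,
    Minor,
    Major,
}

/// Classify one commit message (hypothetical helper, not the real workflow).
pub fn classify(message: &str) -> Bump {
    // A "BREAKING CHANGE:" footer anywhere in the message forces a major bump.
    if message.contains("BREAKING CHANGE:") {
        return Bump::Major;
    }
    let header = message.lines().next().unwrap_or("");
    // Everything before the first ':' is "<type>(<scope>)" plus an optional '!'.
    let Some((prefix, _description)) = header.split_once(':') else {
        return Bump::None;
    };
    if prefix.ends_with('!') {
        return Bump::Major;
    }
    // Drop an optional "(scope)" to get the bare commit type.
    let kind = prefix.split('(').next().unwrap_or("").trim();
    match kind {
        "feat" => Bump::Minor,
        "fix" | "perf" | "refactor" => Bump::Patch,
        _ => Bump::None,
    }
}

/// Apply the strongest bump found in `messages` to a (major, minor, patch) triple.
pub fn next_version(version: (u64, u64, u64), messages: &[&str]) -> (u64, u64, u64) {
    let (major, minor, patch) = version;
    let bump = messages
        .iter()
        .map(|m| classify(m))
        .max()
        .unwrap_or(Bump::None);
    match bump {
        Bump::Major => (major + 1, 0, 0),
        Bump::Minor => (major, minor + 1, 0),
        Bump::Patch => (major, minor, patch + 1),
        Bump::None => version,
    }
}
```

Under these assumptions, a push containing one fix: commit and one docs: commit would move 0.1.3 to 0.1.4, while any feat: commit in the same push would move it to 0.2.0.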
<type>(<scope>): <description>
[optional body]
[optional footer]
Types: feat, fix, docs, style, refactor, perf, test, chore
Examples:
git commit -m "feat(router): add weighted load balancing" # → minor
git commit -m "fix(auth): resolve JWT validation issue" # → patch
git commit -m "feat!: redesign provider interface" # → major

Access build information programmatically:
use litellm_rs::{VERSION, GIT_HASH, full_version, build_info};
println!("Version: {}", VERSION); // "0.1.4"
println!("Full: {}", full_version()); // "0.1.4-a1b2c3d"
println!("Info: {}", build_info()); // "0.1.4-a1b2c3d (built 1704067200 with rustc 1.87)"

# 1. Update version
cargo set-version 0.1.4
# 2. Update CHANGELOG.md
# 3. Commit and tag
git add Cargo.toml Cargo.lock CHANGELOG.md
git commit -m "chore(release): bump version to 0.1.4"
git tag -a v0.1.4 -m "Release v0.1.4"
git push && git push --tags

make version # Show current version info
cargo pkgid # Show package identifier with version

- Start development: make dev or cargo run (auto-loads config/gateway.yaml)
- Build: cargo build --all-features
- Test: cargo test --all-features
- Lint: cargo clippy --all-targets --all-features -- -D warnings
- Format: cargo fmt --all
- Quick start: make start (fastest way to start the gateway)

- All tests: make test
- Unit tests only: make test-unit
- Integration tests: make test-integration
- Test coverage: make test-coverage
- Single test: cargo test <test_name> --all-features

- Start dev services: make dev-services (starts PostgreSQL, Redis)
- Stop dev services: make dev-stop
- Database migration: make db-migrate
- Reset database: make db-reset
This is a high-performance AI Gateway written in Rust that provides OpenAI-compatible APIs with intelligent routing across 100+ AI providers. It's a Rust implementation of the Python LiteLLM library, designed for production environments requiring maximum throughput and minimal latency.
Gateway Architecture: Modular, trait-based design with dependency injection
- src/core/ - Central orchestrator and business logic
- src/server/ - Actix-web HTTP server with middleware pipeline
- src/auth/ - Multi-layered authentication (JWT, API keys, RBAC)
- src/core/providers/ - Pluggable provider system (OpenAI, Anthropic, Azure, Google, etc.)
- src/core/router/ - Intelligent routing with multiple strategies
- src/core/mcp/ - MCP Gateway for external tool integration (90 tests)
- src/core/a2a/ - A2A Protocol for agent-to-agent communication (48 tests)
- src/storage/ - Multi-backend storage (PostgreSQL, Redis, S3, Vector DB)
- src/monitoring/ - Observability (Prometheus, tracing, health checks)
- Async-first: All I/O is non-blocking using Tokio
- Trait-based abstractions: Pluggable components via traits
- Error handling: Comprehensive error types with context preservation
- Configuration: Type-safe config models with Default implementations
- No backward compatibility: Break old formats freely - prioritize clean architecture over legacy support
- Unified Provider trait: Common interface for all AI providers
- Format conversion: Automatic translation between OpenAI and provider-specific APIs
- Health monitoring: Per-provider health checks and failover
- Cost calculation: Built-in token counting and cost estimation
- HTTP Request → Authentication → Authorization → Router → Provider → Response
- Middleware pipeline handles auth, logging, metrics, and transformations
- Intelligent routing selects optimal provider based on health, latency, cost
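To make the routing idea concrete, here is a minimal sketch of picking a provider from health, latency, and cost. It is not the router in src/core/router/; the `ProviderStats` struct, the `select_provider` function, and the weighted-sum scoring are assumptions made for illustration only.

```rust
/// Hypothetical per-provider measurements used for routing decisions.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderStats {
    pub name: String,
    pub healthy: bool,
    pub avg_latency_ms: f64,
    pub cost_per_1k_tokens: f64,
}

/// Lower is better: a weighted sum of latency and cost (assumed scoring rule).
fn score(p: &ProviderStats, latency_weight: f64, cost_weight: f64) -> f64 {
    latency_weight * p.avg_latency_ms + cost_weight * p.cost_per_1k_tokens
}

/// Pick the healthy provider with the lowest score, or `None` if no
/// provider is currently healthy.
pub fn select_provider(
    candidates: &[ProviderStats],
    latency_weight: f64,
    cost_weight: f64,
) -> Option<&ProviderStats> {
    candidates
        .iter()
        // Unhealthy providers are never eligible, however cheap or fast.
        .filter(|p| p.healthy)
        .min_by(|a, b| {
            score(a, latency_weight, cost_weight)
                .total_cmp(&score(b, latency_weight, cost_weight))
        })
}
```

Filtering on health before scoring means an unhealthy provider is skipped even when it would otherwise score best, which is how failover falls out of the selection step in this sketch.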
- Main config: config/gateway.yaml (auto-loaded by default)
- Example config: config/gateway.yaml.example
- Environment variables: Override config values with ${ENV_VAR} syntax
- Config validation: make config-validate
- src/main.rs - Application entry point
- src/lib.rs - Library entry point with core Gateway struct and Python LiteLLM compatible exports
- Cargo.toml - Dependencies and features (use --all-features for development)
- Makefile - All development commands and workflows
- config/gateway.yaml - Main configuration file

- gateway (default) - Main gateway server
- google-gateway - Specialized Google API gateway
The codebase uses Cargo features extensively:
- Storage: postgres, sqlite, redis, s3
- Monitoring: metrics, tracing
- Advanced: vector-db, websockets, analytics, enterprise
- Development: Use --all-features flag for full functionality
- Primary DB: PostgreSQL with Sea-ORM migrations
- Cache: Redis for high-speed operations
- File storage: S3-compatible object storage
- Vector DB: Optional Qdrant integration for semantic caching
- Unit tests in each module (#[cfg(test)])
- Test files use inline tests within source files
- Postman collections for API testing (tests/*.postman_collection.json)
- Mock implementations for external services
Providers are split into two tiers based on whether they need custom Rust code.
A provider belongs in Tier 1 when all of the following are true:
- The remote API is OpenAI-compatible (/v1/chat/completions, standard request/response shape)
- No custom request transformation is needed (no special headers, param filtering, or model-name mangling)
- No custom streaming logic is needed (standard SSE with data: [DONE])
- No provider-specific model metadata is required at runtime
How to add a Tier 1 provider: add a single def() entry in
src/core/providers/registry/catalog.rs and a commented annotation in
src/core/providers/mod.rs:
// in catalog.rs
def("myprovider", "My Provider", "https://api.myprovider.com/v1", "MYPROVIDER_API_KEY"),
// in mod.rs
// myprovider: Tier 1 -> registry/catalog.rs

No other files need to change. The factory in src/core/providers/factory/mod.rs
automatically routes Tier 1 names through OpenAILikeProvider.
A provider requires Tier 2 treatment when any of the following apply:
- Non-OpenAI request/response format (e.g., Anthropic, Gemini, Cohere, Bedrock)
- Custom HTTP client with auth signing (e.g., AWS SigV4 for Bedrock, SageMaker)
- Unique streaming protocol (e.g., non-SSE, multipart, proprietary framing)
- Provider-specific model info or capability metadata
- Special parameter handling (e.g., tool-call transformation, response_format mapping)
- Rerank, embed, image-generation, or audio endpoints with diverging schemas
How to add a Tier 2 provider: create a directory under src/core/providers/<name>/
containing at minimum mod.rs, then add a variant to ProviderType and implement
the relevant trait methods. Also add the pub mod <name>; declaration in
src/core/providers/mod.rs (guarded by the appropriate feature flag).
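The Tier 1 and Tier 2 criteria above reduce to a simple rule: Tier 1 requires every condition to hold, and any single deviation pushes the provider to Tier 2. The sketch below encodes that rule as a checklist; `ProviderProfile`, `Tier`, and `required_tier` are hypothetical names for illustration and do not exist in the codebase.

```rust
/// Which integration path a provider needs (illustrative type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    /// Catalog entry only, served through the OpenAI-like provider.
    One,
    /// Dedicated provider directory with custom Rust code.
    Two,
}

/// Hypothetical checklist mirroring the criteria listed above.
#[derive(Debug, Clone, Copy, Default)]
pub struct ProviderProfile {
    pub openai_compatible: bool,
    pub custom_request_transform: bool,
    pub custom_auth_signing: bool,
    pub custom_streaming: bool,
    pub runtime_model_metadata: bool,
    pub diverging_endpoints: bool,
}

/// Tier 1 only when the API is OpenAI-compatible and nothing custom is needed.
pub fn required_tier(p: &ProviderProfile) -> Tier {
    let needs_custom_code = !p.openai_compatible
        || p.custom_request_transform
        || p.custom_auth_signing
        || p.custom_streaming
        || p.runtime_model_metadata
        || p.diverging_endpoints;
    if needs_custom_code {
        Tier::Two
    } else {
        Tier::One
    }
}
```

For example, a provider that is OpenAI-compatible but needs AWS SigV4 signing would set `custom_auth_signing` and land in Tier 2.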
If git status shows DU (deleted-by-us, unresolved) files under src/core/providers/:
- Decide the tier using the criteria above.
- Tier 1: delete the directory and add a catalog entry + mod.rs comment.
- Tier 2: restore the directory (git checkout HEAD -- <path>) and complete the implementation, or add stub methods that return ProviderError::not_implemented.
- Verify with cargo check --all-features; zero DU files in git status means no unresolved paths.

- Adding a Tier 1 provider: add a def() entry in src/core/providers/registry/catalog.rs
- Adding a Tier 2 provider: create a provider directory in src/core/providers/<name>/
- New API endpoints: add routes in src/server/routes/
- Authentication: extend auth modules in src/auth/
- Configuration: update models in src/config/models/
- Monitoring: add metrics in respective modules
- MCP servers: add server configs in src/core/mcp/config.rs
- A2A agents: add agent configs in src/core/a2a/config.rs
Model Context Protocol for connecting LLMs to external tools:
- config.rs - Server configuration, authentication (Bearer, API Key, OAuth 2.0)
- transport.rs - HTTP, SSE, WebSocket, stdio transports
- protocol.rs - JSON-RPC 2.0 implementation
- tools.rs - Tool definitions and invocation
- server.rs - Individual server connection management
- gateway.rs - Main gateway aggregating servers
- permissions.rs - Fine-grained access control
Agent-to-Agent communication with multi-provider support:
- config.rs - Agent configuration, provider types
- message.rs - JSON-RPC 2.0 message format, task states
- provider.rs - Provider adapters (LangGraph, Vertex AI, Azure, Bedrock, Pydantic AI)
- registry.rs - Agent discovery and health monitoring
- gateway.rs - Main gateway for agent management
- Docker build: make docker
- Development stack: make docker-compose-dev
- Production: make docker-compose
- Kubernetes: make k8s-apply
- Throughput: 10,000+ requests/second
- Latency: <10ms routing overhead
- Memory: ~50MB base footprint
- Architecture: Fully async, connection pooling, zero-copy where possible
This Rust implementation maintains API compatibility with the original Python LiteLLM:
- Core completion API exposed via src/core/completion.rs
- Helper functions: completion(), user_message(), system_message(), assistant_message()
- Unified interface for 100+ providers with automatic routing
When using AI agents (Claude, Codex, Copilot) to create PRs:
- One issue → one branch → one PR. Never bundle unrelated fixes.
- Always branch from latest main. Never fork from another feature branch.
- Max 10 files / 500 lines per PR (excluding Cargo.lock, docs). Use scripts/guards/check_pr_scope.sh to verify.
- Run overlap check before pushing: scripts/guards/check_pr_overlap.sh detects file conflicts with open PRs.
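The size limit in the rules above can be expressed as a small check. This is a sketch of the rule only, not a port of scripts/guards/check_pr_scope.sh; the `FileChange` type and the exclusion test (Cargo.lock, paths under docs/, and .md files) are assumptions about what "excluding Cargo.lock, docs" means.

```rust
/// Limits taken from the PR rules above.
pub const MAX_FILES: usize = 10;
pub const MAX_LINES: usize = 500;

/// One changed file in a PR (hypothetical type for this sketch).
pub struct FileChange<'a> {
    pub path: &'a str,
    pub lines_changed: usize,
}

/// Assumed exclusion rule: the lockfile and documentation do not count.
fn is_excluded(path: &str) -> bool {
    path == "Cargo.lock" || path.starts_with("docs/") || path.ends_with(".md")
}

/// Return `Ok(())` when the PR fits the limits, otherwise a reason string.
pub fn check_scope(changes: &[FileChange<'_>]) -> Result<(), String> {
    let counted: Vec<&FileChange<'_>> = changes
        .iter()
        .filter(|c| !is_excluded(c.path))
        .collect();
    let files = counted.len();
    let lines: usize = counted.iter().map(|c| c.lines_changed).sum();
    if files > MAX_FILES {
        return Err(format!("{} files changed (limit {})", files, MAX_FILES));
    }
    if lines > MAX_LINES {
        return Err(format!("{} lines changed (limit {})", lines, MAX_LINES));
    }
    Ok(())
}
```

A PR that touches src/lib.rs with 120 changed lines plus a 900-line Cargo.lock diff passes under these assumptions, because the lockfile is excluded from both counts.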
- Parallel agents must use git worktree for isolation: git worktree add /tmp/agent-task-{id} -b fix/issue-{id} main
- Two agents must never modify the same file concurrently.

cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
bash scripts/guards/check_pr_scope.sh
bash scripts/guards/check_pr_overlap.sh

- Rust version is pinned in rust-toolchain.toml. CI uses the same version.
- Never use @stable in CI; always reference the pinned version.
The vector-db feature (which includes qdrant-client) fails to build on docs.rs due to its read-only filesystem. The qdrant-client build script attempts to write files during compilation.
Solution:
- In Cargo.toml, the [package.metadata.docs.rs] section explicitly:
  - Sets all-features = false to prevent docs.rs from using --all-features
  - Lists specific features excluding vector-db
- This allows documentation to build successfully on docs.rs
Testing docs.rs compatibility locally:
env DOCS_RS=1 cargo doc --no-deps --features "postgres sqlite redis s3 metrics tracing websockets analytics"