A high-performance AI Gateway written in Rust that provides unified access to 100+ AI providers through OpenAI-compatible APIs.
- System Overview - Complete system architecture and design patterns
- Error System - Unified error handling architecture and patterns
- Provider Implementation - Guide for implementing individual providers
- Architecture Improvements - Historical improvements and optimizations
- Getting Started - Quick start guide and basic usage
- Configuration - Configuration management and environment setup
- Deployment - Production deployment strategies
- Testing - Testing strategies and best practices
- Provider Overview - Supported providers and capabilities
- DeepSeek - DeepSeek V3.1 integration guide
- OpenAI - OpenAI and compatible providers
- Anthropic - Claude models integration
- Adding Providers - Step-by-step provider implementation
- MCP Gateway - Model Context Protocol integration
- A2A Protocol - Agent-to-Agent communication
- Basic Examples - Simple completion examples
- Advanced Features - Streaming, function calling, etc.
- Integration Examples - Web frameworks and service integrations
```rust
use litellm_rs::{completion, user_message, system_message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let response = completion(
        "gpt-4",
        vec![
            system_message("You are a helpful assistant."),
            user_message("Hello, how are you?"),
        ],
        None,
    ).await?;

    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}
```

- High Performance: Built with Rust and Tokio for maximum throughput (10,000+ req/s)
- OpenAI Compatible: Drop-in replacement for OpenAI API
- 100+ Providers: Unified interface to all major AI providers
- Intelligent Routing: Smart load balancing and failover
- Enterprise Ready: Authentication, monitoring, cost tracking
- Type Safety: Compile-time guarantees and zero-cost abstractions
- MCP Gateway: Model Context Protocol for external tool integration
- A2A Protocol: Agent-to-Agent communication with multi-provider support
Real benchmark results from our unified router (run with `cargo bench`):
| Operation | Time | Description |
|---|---|---|
| Router Creation | 39.4 ns | Create empty router instance |
| Add Deployment | 1.04 µs | Insert single deployment |
| Alias Resolution | 31.9 ns | Model name alias lookup |
| Record Success | 47.3 ns | Atomic counter update (lock-free) |
| Record Failure | 65.5 ns | Atomic failure counter update |

Routing strategies:

| Strategy | Time | Use Case |
|---|---|---|
| RoundRobin | 1.24 µs | Equal distribution |
| LatencyBased | 1.81 µs | Lowest latency first |
| SimpleShuffle | 1.85 µs | Random selection |
| LeastBusy | 2.04 µs | Fewest active requests |

Scaling with deployment count:

| Deployments | Time | Throughput |
|---|---|---|
| 1 | 130 ns | ~7.7M ops/s |
| 5 | 388 ns | ~2.6M ops/s |
| 10 | 694 ns | ~1.4M ops/s |
| 50 | 3.2 µs | ~312K ops/s |
| 100 | 6.3 µs | ~159K ops/s |

Concurrent routing throughput:

| Concurrent Tasks | Time | Throughput |
|---|---|---|
| 10 | 37.3 µs | ~268K ops/s |
| 50 | 97.7 µs | ~512K ops/s |
| 100 | 172 µs | ~581K ops/s |
| 500 | 721 µs | ~693K ops/s |
- Lock-free design: Uses `DashMap` and atomic operations for zero-lock concurrent access
- Static dispatch: Provider enum avoids vtable overhead
- Nanosecond-level atomic ops: Record success/failure in ~50ns
- Linear scaling: Concurrent throughput scales with task count
- Sub-microsecond routing: Most strategies complete under 2µs
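The success/failure counters behind those numbers can be sketched with standard-library atomics alone. This is a simplified illustration, not the crate's actual types (the real router keeps such stats per deployment inside a `DashMap`; the names below are hypothetical):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Per-deployment health stats, illustrating the lock-free pattern.
/// Type and field names here are hypothetical, not the crate's real API.
#[derive(Default)]
struct DeploymentStats {
    successes: AtomicU64,
    failures: AtomicU64,
}

impl DeploymentStats {
    /// Record a success with a single atomic add: no lock, no contention.
    fn record_success(&self) {
        self.successes.fetch_add(1, Ordering::Relaxed);
    }

    /// Record a failure the same way.
    fn record_failure(&self) {
        self.failures.fetch_add(1, Ordering::Relaxed);
    }

    /// Failure ratio that a health-based strategy could consult.
    fn failure_rate(&self) -> f64 {
        let ok = self.successes.load(Ordering::Relaxed);
        let err = self.failures.load(Ordering::Relaxed);
        let total = ok + err;
        if total == 0 { 0.0 } else { err as f64 / total as f64 }
    }
}

fn main() {
    let stats = DeploymentStats::default();
    for _ in 0..3 {
        stats.record_success();
    }
    stats.record_failure();
    println!("failure rate: {}", stats.failure_rate()); // 0.25
}
```

Because both counters are plain atomic adds, many Tokio tasks can record outcomes concurrently without serializing on a mutex, which is what keeps these operations in the tens-of-nanoseconds range.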
```bash
# Run all benchmarks
cargo bench

# Run specific benchmark groups
cargo bench -- unified_router    # Router operations
cargo bench -- concurrent_router # Concurrent performance
cargo bench -- cache_operations  # Cache benchmarks

# Skip plot generation for faster runs
cargo bench -- --noplot
```

Benchmark results are generated using Criterion.rs and saved to `target/criterion/`, including HTML reports.
LiteLLM-RS uses a trait-based provider system that ensures consistency across all AI providers while allowing for provider-specific optimizations.
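A minimal sketch of what such a trait-based design looks like. The types and names below are illustrative, not the crate's actual API, and the async plumbing is omitted for brevity:

```rust
/// Unified request/response types shared by all providers (simplified).
struct CompletionRequest {
    model: String,
    prompt: String,
}

struct CompletionResponse {
    content: String,
}

/// Provider-level failures, later mapped into the unified error system.
#[allow(dead_code)]
#[derive(Debug)]
enum ProviderError {
    RateLimited,
    Upstream(String),
}

/// Every provider implements the same contract; the router only sees this
/// trait, so adding a provider never touches the routing code.
trait Provider {
    fn name(&self) -> &'static str;
    fn complete(&self, req: &CompletionRequest) -> Result<CompletionResponse, ProviderError>;
}

/// Toy implementation: a provider-specific detail (echoing the model name)
/// stays hidden behind the shared interface.
struct EchoProvider;

impl Provider for EchoProvider {
    fn name(&self) -> &'static str {
        "echo"
    }

    fn complete(&self, req: &CompletionRequest) -> Result<CompletionResponse, ProviderError> {
        Ok(CompletionResponse {
            content: format!("[{}] {}", req.model, req.prompt),
        })
    }
}

fn main() {
    let provider = EchoProvider;
    let resp = provider
        .complete(&CompletionRequest {
            model: "gpt-4".into(),
            prompt: "hi".into(),
        })
        .unwrap();
    println!("{}: {}", provider.name(), resp.content); // echo: [gpt-4] hi
}
```

The payoff of this shape is that provider-specific optimizations live entirely inside each `impl`, while callers and the router stay generic over the trait.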
Sophisticated routing with multiple strategies:
- Round Robin
- Least Latency
- Cost Optimized
- Health-Based
- Custom Weighted
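As an illustration of the simplest strategy above, round robin reduces to a single atomic fetch-and-increment over the deployment list. This is a sketch under assumed names, not the router's actual implementation:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Lock-free round-robin picker over a fixed deployment list.
/// Names are illustrative; the real router supports several strategies.
struct RoundRobin {
    deployments: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    fn new(deployments: Vec<String>) -> Self {
        Self { deployments, next: AtomicUsize::new(0) }
    }

    /// Each call advances the shared counter atomically, so concurrent
    /// callers distribute requests evenly without taking a mutex.
    fn pick(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.deployments.len();
        &self.deployments[i]
    }
}

fn main() {
    let rr = RoundRobin::new(vec!["gpt-4/east".into(), "gpt-4/west".into()]);
    println!("{} {} {}", rr.pick(), rr.pick(), rr.pick());
    // gpt-4/east gpt-4/west gpt-4/east
}
```

The richer strategies (latency-, cost-, and health-based) replace the counter with a scoring pass over per-deployment stats, which is why they cost slightly more in the benchmarks above.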
All provider-specific errors are mapped to a unified error system for consistent error handling across the entire system.
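In Rust this mapping is naturally expressed as `From` conversions into one gateway-wide error enum. A hedged sketch with hypothetical type names (not the crate's real error types):

```rust
/// A provider-specific error, e.g. what an OpenAI-style client might surface.
#[derive(Debug)]
struct OpenAiError {
    status: u16,
    message: String,
}

/// The single error type the rest of the gateway handles.
#[derive(Debug)]
enum GatewayError {
    RateLimited { retry_after_s: Option<u64> },
    Auth(String),
    Upstream(String),
}

/// Provider errors convert into unified ones, so `?` works everywhere
/// and callers never match on provider-specific types.
impl From<OpenAiError> for GatewayError {
    fn from(e: OpenAiError) -> Self {
        match e.status {
            401 | 403 => GatewayError::Auth(e.message),
            429 => GatewayError::RateLimited { retry_after_s: None },
            _ => GatewayError::Upstream(e.message),
        }
    }
}

fn main() {
    let err: GatewayError = OpenAiError { status: 429, message: "slow down".into() }.into();
    println!("{:?}", err); // RateLimited { retry_after_s: None }
}
```

One `From` impl per provider keeps the mapping local to that provider's module while the router, retry logic, and HTTP layer all speak `GatewayError`.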
- Rust 1.70+
- PostgreSQL (optional)
- Redis (optional)
```bash
# Development
make dev                    # Start development server
cargo test --all-features   # Run tests
cargo clippy --all-features # Lint code

# Production
make build  # Build release binary
make docker # Build Docker image
```

- Read the Provider Implementation Guide
- Check existing issues
- Follow the development setup
- Submit PRs with tests and documentation
This project is licensed under the MIT License - see the LICENSE file for details.