# tgi

Here are 11 public repositories matching this topic...

Bench360 is a modular benchmarking suite for local LLM deployments. It offers a full-stack, extensible pipeline to evaluate the latency, throughput, quality, and cost of LLM inference on consumer and enterprise GPUs. Bench360 supports flexible backends, tasks, and scenarios, enabling fair and reproducible comparisons for researchers and practitioners.

  • Updated Feb 18, 2026
  • Python
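The core of any such benchmark is timing inference calls and normalizing by token count. The sketch below is not Bench360's actual API (which is not shown on this page); it is a minimal illustration, assuming a backend exposed as a callable that returns the number of tokens it generated:

```python
import time

def measure(generate, prompts):
    """Time each call and report mean latency (s) and throughput (tokens/s).

    `generate` is a stand-in for any backend's inference call and must
    return the number of tokens produced for a prompt. A real suite
    would also track quality and cost; this covers only the two core
    speed metrics.
    """
    latencies, total_tokens = [], 0
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        total_tokens += generate(prompt)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "throughput_tok_s": total_tokens / elapsed,
    }

# Stub backend: pretend every prompt yields 32 tokens.
stats = measure(lambda p: 32, ["hello", "world"])
```

Swapping the stub for a call into vLLM, TGI, or llama.cpp is what makes the comparison backend-agnostic: the harness only sees a callable and a token count.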

Self-hosted FastAPI gateway exposing the OpenAI and Anthropic Messages APIs in front of any open-source LLM runtime (vLLM, Ollama, llama.cpp, TGI, SGLang, LocalAI, LM Studio). Supports streaming, embeddings, metrics, authentication, and rate limiting.

  • Updated Apr 22, 2026
  • Python
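Because the gateway speaks the OpenAI wire format, any OpenAI-compatible client can talk to it by changing the base URL. The snippet below builds (but does not send) such a request with only the standard library; the gateway address, model name, and API key are hypothetical placeholders, not values documented by this project:

```python
import json
import urllib.request

# Hypothetical local gateway address and model name; substitute your own.
GATEWAY_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "llama-3-8b-instruct",  # whatever model the backing runtime serves
    "messages": [{"role": "user", "content": "Say hello."}],
    "stream": False,  # set True to receive server-sent-event chunks
}

# Construct an OpenAI-style chat-completion POST; calling
# urllib.request.urlopen(req) would dispatch it to the gateway.
req = urllib.request.Request(
    GATEWAY_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",  # if the gateway enforces auth
    },
    method="POST",
)
```

The same payload works unchanged against OpenAI itself, which is the point of such a gateway: client code stays portable while the runtime behind it is swapped freely.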
