localllama

Here are 43 public repositories matching this topic...

Auto-tuned launcher for GGUF models on llama.cpp / ik_llama.cpp — OpenAI-compatible server with multi-GPU tensor-split, MoE expert placement, measured flag tuning (AI Tune), hardware-matched HuggingFace downloads, and crash recovery. An Ollama alternative for multi-GPU rigs.

  • Updated Jun 21, 2026
  • Go
