EdgeMemory is a memory-enhanced AI agent built on LangGraph and designed to run entirely on local hardware. It combines two memory backends (a KuzuDB graph database and a LanceDB vector store) with on-device LLM inference through llama.cpp and LoRA fine-tuning, so the agent can be personalized without data leaving the machine.
- Fully Local: No cloud dependency. Your conversations, memories, and data stay on your machine.
- Deep Memory: Combines episodic events, semantic entities, preferences, procedures, and encrypted secrets across KuzuDB (graph) and LanceDB (vector).
- Hot-Swap LoRA: Load/unload fine-tuned LoRA adapters at runtime without restarting the agent.
- LangGraph Orchestration: Modular graph nodes for intent routing, memory read/write, RAG retrieval, planning, and answer merging.
- Web UI + CLI: Built-in web dashboard (FastAPI + static HTML) and CLI for local interaction.
- Extensible: Provider-agnostic LLM client; swap in any OpenAI-compatible API or local model.
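One way to picture the hot-swap feature is as a request to the inference server rather than a process restart. The sketch below assumes the agent talks to llama.cpp's `llama-server` and uses its `/lora-adapters` endpoint, which takes a scale per preloaded adapter (a scale of 0 disables it). The source does not say how EdgeMemory implements hot-swapping, so `build_lora_payload`, `apply_lora`, and the base URL are hypothetical names used only for illustration.

```python
import json
import urllib.request


def build_lora_payload(adapter_count: int, active: dict[int, float]) -> list[dict]:
    """Return one entry per adapter; adapters not listed in `active` get scale 0.0."""
    if any(i < 0 or i >= adapter_count for i in active):
        raise ValueError("adapter id out of range")
    return [{"id": i, "scale": float(active.get(i, 0.0))} for i in range(adapter_count)]


def apply_lora(base_url: str, payload: list[dict], timeout: float = 5.0) -> int:
    """POST the payload to the server (assumed endpoint) and return the HTTP status."""
    req = urllib.request.Request(
        f"{base_url}/lora-adapters",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

With two adapters preloaded, `apply_lora("http://localhost:8080", build_lora_payload(2, {1: 0.8}))` would enable only the second one, assuming a server is listening at that address.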

    User Input
        │
        ▼
    ┌───────────────────────────────┐
    │    LangGraph Orchestration    │
    │  ┌─────────┐    ┌──────────┐  │
    │  │ Router  │───▶│  Intent  │  │
    │  └─────────┘    └──────────┘  │
    │       │              │        │
    │  ┌─────────┐    ┌──────────┐  │
    │  │ Memory  │    │ Planner  │  │
    │  │  R/W    │    │          │  │
    │  └─────────┘    └──────────┘  │
    │       │              │        │
    │  ┌──────────────┐  ┌────────┐ │
    │  │ RAG Retrieve │  │ Answer │ │
    │  └──────────────┘  └────────┘ │
    └───────────────────────────────┘
            │              │
            ▼              ▼
       ┌────────┐    ┌──────────┐
       │ KuzuDB │    │ LanceDB  │
       │(Graph) │    │ (Vector) │
       └────────┘    └──────────┘
            │
            ▼
       ┌────────────┐
       │ llama.cpp  │
       │  + LoRA    │
       └────────────┘

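The flow in the diagram can be approximated in a few lines of plain Python. This is a toy sketch, not LangGraph's API and not the project's code: ordinary functions stand in for graph nodes, a dict stands in for the shared graph state, and keyword overlap stands in for vector retrieval. All function names and the two intents are assumptions made for illustration.

```python
from typing import Callable

State = dict  # shared state handed from node to node


def router(state: State) -> State:
    # Toy intent detection; a real router would likely consult the LLM.
    text = state["input"].lower()
    state["intent"] = "remember" if text.startswith("remember") else "ask"
    return state


def memory_write(state: State) -> State:
    # Store the raw utterance; a real node would extract entities and events.
    state.setdefault("memory", []).append(state["input"])
    state["answer"] = "Noted."
    return state


def rag_retrieve(state: State) -> State:
    # Naive keyword overlap stands in for embedding search.
    words = set(state["input"].lower().split())
    state["context"] = [
        m for m in state.get("memory", []) if words & set(m.lower().split())
    ]
    return state


def answer(state: State) -> State:
    state["answer"] = f"Found {len(state['context'])} related memories."
    return state


# Each intent maps to the ordered list of nodes that handle it.
ROUTES: dict[str, list[Callable[[State], State]]] = {
    "remember": [memory_write],
    "ask": [rag_retrieve, answer],
}


def run(state: State) -> State:
    state = router(state)
    for node in ROUTES[state["intent"]]:
        state = node(state)
    return state
```

Keeping each step as a function over a shared state is what makes nodes easy to add, remove, or reorder, which is the property the orchestration layer relies on.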
| Backend | Type | Purpose |
|---|---|---|
| KuzuDB | Graph Database | Entities, relationships, events, procedures, preferences |
| LanceDB | Vector Store | Semantic search, document RAG, embedding retrieval |
| SQLite | Relational | Chat history, fine-tune jobs, checkpoints |
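To make the relational row concrete, here is a minimal sketch of chat-history persistence using Python's built-in `sqlite3` module. The table name, columns, and helper functions are assumptions for illustration; the project's actual schema is not described in this document.

```python
import sqlite3
from datetime import datetime, timezone


def open_chat_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the history table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS chat_history (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               role TEXT NOT NULL,
               content TEXT NOT NULL,
               created_at TEXT NOT NULL
           )"""
    )
    return conn


def log_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    """Append one message with a UTC timestamp."""
    conn.execute(
        "INSERT INTO chat_history (session_id, role, content, created_at) "
        "VALUES (?, ?, ?, ?)",
        (session_id, role, content, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def recent_messages(conn: sqlite3.Connection, session_id: str, limit: int = 10) -> list[tuple]:
    """Return the last `limit` messages of a session in chronological order."""
    rows = conn.execute(
        "SELECT role, content FROM chat_history "
        "WHERE session_id = ? ORDER BY id DESC LIMIT ?",
        (session_id, limit),
    ).fetchall()
    return list(reversed(rows))
```

Passing a file path instead of the default `":memory:"` keeps the history across restarts.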
- Core Profile: User identity, basic facts (key-value)
- Episodic Events: Time-stamped events with participants and topics
- Semantic Entities: People, places, things with typed relationships
- Preferences: User likes/dislikes with confidence scoring
- Procedures: Multi-step process templates
- Encrypted Secrets: AES-GCM encrypted sensitive information with double-confirmation reveal flow
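Two of these memory types can be sketched as plain data classes. The field names and the confidence update rule below are assumptions chosen for illustration (a fixed step, clamped to the range 0 to 1); the document does not specify how EdgeMemory models or scores them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EpisodicEvent:
    """A time-stamped event with participants and topics."""
    summary: str
    participants: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Preference:
    """A like or dislike with a confidence score in [0.0, 1.0]."""
    subject: str
    likes: bool
    confidence: float = 0.5

    def reinforce(self, agrees: bool, step: float = 0.2) -> None:
        # Raise confidence when a new observation agrees, lower it otherwise,
        # and clamp the result so it stays a valid score.
        delta = step if agrees else -step
        self.confidence = min(1.0, max(0.0, self.confidence + delta))
```

A preference that keeps being confirmed saturates at 1.0, while contradicting observations pull it back toward 0.0, at which point an agent might choose to drop or flip it.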
- Python 3.10+
- CUDA-capable GPU (recommended) or CPU with 8GB+ RAM
- Windows, Linux, or macOS

    # Clone the repository
    git clone https://github.com/YOUR_USERNAME/EdgeMemory.git
    cd EdgeMemory

    # Create virtual environment
    python -m venv .venv

    # Windows
    .\.venv\Scripts\activate

    # Linux/macOS
    source .venv/bin/activate

    # Install dependencies
    pip install -r requirements.txt

- Download a GGUF model (e.g., Qwen2.5-7B-Instruct-Q4_K_M.gguf) and place it in models/
- (Optional) Download an embedding model for vector search
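After downloading, a quick sanity check can catch truncated or mislabeled files before the server tries to load them. GGUF files begin with the four ASCII bytes `GGUF`, so the sketch below simply reads that prefix. The helper names are hypothetical, and the check confirms only the file type, not that the model is complete or compatible.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def looks_like_gguf(path) -> bool:
    """Return True if `path` is a regular file that starts with the GGUF magic."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC


def find_models(models_dir="models") -> list[Path]:
    """List *.gguf files in `models_dir` that pass the magic-byte check."""
    return sorted(p for p in Path(models_dir).glob("*.gguf") if looks_like_gguf(p))
```

Calling `find_models()` from the repository root lists the candidate model files that could go into the configuration.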
Copy and edit the sample config:

    cp edge_agent.yaml.sample edge_agent.yaml

Edit edge_agent.yaml to set model paths, ports, and memory settings.
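Because several services listen on local ports, it can help to check the edited configuration for collisions before launching anything. The sketch below assumes the YAML has already been parsed into a dictionary (for example with a third-party parser such as PyYAML's `yaml.safe_load`) and that service sections carry `host` and `port` keys. Both function names are hypothetical.

```python
from collections import defaultdict


def collect_endpoints(config: dict) -> dict[str, tuple[str, int]]:
    """Map each section that has both `host` and `port` to a (host, port) pair."""
    return {
        name: (section["host"], int(section["port"]))
        for name, section in config.items()
        if isinstance(section, dict) and "host" in section and "port" in section
    }


def find_port_conflicts(endpoints: dict[str, tuple[str, int]]) -> list[list[str]]:
    """Return groups of section names that share the same host and port."""
    by_addr = defaultdict(list)
    for name, addr in endpoints.items():
        by_addr[addr].append(name)
    return [sorted(names) for names in by_addr.values() if len(names) > 1]
```

Any other listener can be added to the `endpoints` mapping by hand before calling `find_port_conflicts`, so that every bound port is checked in one pass.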

    # Full application (Web UI + Qt + API)
    python edge_agent_main.py

    # CLI-only mode
    python core/src/main.py

    # Build LanceDB index
    python core/src/build_index.py

Access the web UI at http://localhost:8080.

    EdgeMemory/
    ├── core/src/              # Main source code
    │   ├── nodes/             # LangGraph nodes (router, intent, planner, memory, answer...)
    │   ├── rag/               # LanceDB retriever
    │   ├── memory/            # KuzuDB ingest, snapshot, contract
    │   ├── llm/               # Local LLM client (llama.cpp wrapper)
    │   ├── vector_store/      # LanceDB store + index builder
    │   ├── db/                # Chat logger, SQLite persistence
    │   ├── workers/           # Async tasks (embedding, ingest, finetune, summarize)
    │   ├── tools/             # CLI utilities (clean, sync)
    │   └── utils/             # Config, paths, watchdog, LoRA state, resource guard
    ├── apps/                  # Qt desktop client
    ├── frontend/              # Web frontend assets
    ├── TTS/                   # Text-to-speech integration
    ├── tools/                 # Additional tooling
    ├── edge_agent_main.py     # Main entry point
    └── edge_agent.yaml        # Configuration

Key settings in edge_agent.yaml:

    # LLM server settings
    llama_server:
      host: localhost
      port: 8080
      model_path: models/your-model.gguf

    # Embedding server
    embedding:
      host: localhost
      port: 8081
      model_path: models/embedding-model.gguf

    # Memory backends
    memory:
      kuzu_path: artifacts/kuzu_db
      lancedb_path: artifacts/lancedb

    # LoRA
    lora:
      enabled: false
      adapter_path: models/lora-adapter.gguf

- LangGraph: Agent orchestration framework
- llama.cpp: Local LLM inference (via llama-server)
- KuzuDB: Embedded graph database for structured memory
- LanceDB: Embedded vector database for semantic search
- FastAPI: REST API + WebSocket server
- PySide6: Qt desktop client (optional)
- Docker Compose one-click deployment
- Multi-user session support
- Memory visualization dashboard
- Plugin system for custom memory types
- Remote LLM provider support (OpenAI-compatible API)
See CONTRIBUTING.md for guidelines.
Apache 2.0. See LICENSE for details.