Give your Anki Vector robot a local AI soul using Gemma 4 + Ollama. No cloud, no subscriptions, no token costs.
Vector sees through its camera → Gemma 4 thinks locally on your Mac → Vector speaks its response aloud.
┌─────────────┐ WiFi ┌──────────────────────┐
│ Anki Vector │◄────────────►│ Your Mac (M-series) │
│ │ SDK/gRPC │ │
│ - HD Camera │ │ Ollama │
│ - Speaker │ │ └─ gemma4:e4b (4B) │
│ - LCD face │ │ │
└─────────────┘ └──────────────────────┘
- Python 3.8+
- Ollama installed and running
- Anki Vector robot on the same WiFi
- (Optional) Escape Pod to run Vector without DDL cloud
# 1. Clone the repo
git clone https://github.com/chatura-dev/vector-gemma4
cd vector-gemma4
# 2. Install dependencies
pip install -r requirements.txt
# 3. Pull Gemma 4 vision model
ollama pull gemma4:e4b
# 4. Configure
cp .env.example .env
# Edit .env — set VECTOR_SERIAL if needed (leave blank for auto-detect)
# 5. Pre-warm the model (first run only)
bash scripts/warmup_ollama.sh
# NOTE: First run compiles Metal GPU shaders — takes 5-10 min on macOS.
# After that, startup is instant (shaders are cached).
# 6. Test your connection
python scripts/test_connection.py
# 7. Run!
python -m vector_gemma4macOS first-run note: Ollama 0.19+ compiles Metal GPU shaders the first time a model runs. This is a one-time ~5-10 minute wait. Run
bash scripts/warmup_ollama.shonce and all subsequent starts are instant. Requires Ollama 0.19.0+ (ollama --version). SetOLLAMA_LOAD_TIMEOUT=30mif the default timeout is too short.
Vector periodically looks around and speaks when it notices something new.
python -m vector_gemma4python examples/basic_chat.pypython examples/object_narrator.pypython examples/security_cam.pyEdit the flashcards in examples/study_buddy.py, then:
python examples/study_buddy.pyCopy .env.example to .env and edit:
| Variable | Default | Description |
|---|---|---|
VECTOR_SERIAL |
(auto-detect) | Your Vector's serial number |
OLLAMA_URL |
http://localhost:11434 |
Ollama API endpoint |
MODEL |
gemma4:e4b |
Vision model — e4b is fast, gemma4 is full 26B |
OBSERVE_INTERVAL |
20 |
Seconds between observations |
MAX_HISTORY |
10 |
Conversation turns to remember |
ENABLE_TOOLS |
false |
Enable Gemma 4 function calling |
SYSTEM_PROMPT_FILE |
config/system_prompts/default.txt |
Vector's personality |
| Model | RAM Needed | Speed |
|---|---|---|
gemma4:e4b |
~4GB | Fast (recommended) |
gemma4 (26B MoE) |
~20GB | Slower, smarter |
M-series Macs handle both well. M4 24GB runs the full 26B comfortably.
DDL's Escape Pod lets Vector run without their cloud servers — important for fully offline use. If you have Escape Pod, the SDK will automatically use your local server.
Gemma 4 supports native tool use. Set ENABLE_TOOLS=true in .env and define tools in config/tools_schema.json. Built-in tools: weather, timer, time.
To add your own tool:
- Add the function schema to
config/tools_schema.json - Implement the function in
tools/ - Register it in
vector_gemma4/bridge.py:_handle_tool_calls()
from vector_gemma4 import VectorGemmaBridge, Config
config = Config()
bridge = VectorGemmaBridge(config)
bridge.connect()
# Look and describe what Vector sees
response = bridge.look_and_describe()
bridge.speak(response)
# Text chat
response = bridge.chat("What time is it?")
bridge.speak(response)
# Run the companion loop forever
bridge.run_companion_loop()Post Vector's status to a Discord channel:
- Set
DISCORD_WEBHOOK_URLin.env - Set
DISCORD_ENABLED=true
- Fork the repo
- Create a feature branch (
git checkout -b feat/my-tool) - Commit your changes
- Open a PR
PRs welcome for: new tools, new example modes, improved TTS chunking, wake-word support.
Apache 2.0 — see LICENSE