Local voice-to-clipboard for developers. Tap a key, speak, paste.
Zerm is a native desktop app that turns speech into clean text without sending your voice to a cloud service. It records from your microphone, transcribes with Whisper on your machine, optionally reformats the transcript through your local Ollama model, and writes the result to your clipboard.
It is built for people who use voice as an input method for coding agents, Slack, email, notes, pull request reviews, and long-form writing.
- Features
- Install
- First-run Setup
- Usage
- Privacy And Security
- Build From Source
- Project Structure
- Verification
- Contributing
- Release Process
- Roadmap
- License
- On-device transcription with whisper.cpp through whisper-rs.
- Local rewrite modes through Ollama and Gemma 3: Off, Developer, Chat, and Pro.
- Clipboard-first workflow: record, process, copy, and optionally auto-paste on macOS with native key events.
- Hotkey recording: Right Option on macOS; Ctrl+Shift+Space on Windows/Linux.
- Voice activity detection to auto-stop after silence.
- Custom vocabulary for names, project terms, acronyms, and identifiers.
- Private-by-default history: history is off until you enable it explicitly.
- First-run setup UI for Whisper, Ollama, and the local model.
- Cross-platform bundles for macOS, Windows, and Linux through Tauri 2.
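The auto-stop feature listed above can be pictured as an energy threshold applied to fixed-size audio frames. The following is a minimal sketch of that general idea, not Zerm's detector: the function names, the threshold, and the silent-frame budget are all hypothetical values chosen for illustration.

```python
import math
from typing import Iterable, Optional, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def auto_stop_index(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.01,     # hypothetical silence level
    max_silent_frames: int = 3,  # hypothetical trailing-silence budget
) -> Optional[int]:
    """Return the index of the frame at which recording would stop.

    Recording stops once speech has been heard and is then followed by
    `max_silent_frames` consecutive frames below `threshold`.
    Returns None if that never happens.
    """
    heard_speech = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= max_silent_frames:
                return index
    return None
```

Leading silence is ignored on purpose, so a pause before the first word does not end the recording early.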
Download the latest build from the Releases page or the project website.
| Platform | Package | Current hotkey |
|---|---|---|
| macOS Apple Silicon | .dmg | Right Option |
| macOS Intel | .dmg | Right Option |
| Windows | .msi or .exe | Ctrl+Shift+Space |
| Linux | .deb or .AppImage | Ctrl+Shift+Space |
macOS releases, including prereleases, are Developer ID signed and notarized so macOS Accessibility and Automation grants stay attached across updates. Windows installers are Authenticode signed. Linux release artifacts are published with SHA-256 checksums instead of platform signing.
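Because Linux artifacts rely on SHA-256 checksums rather than platform signing, it is worth comparing a download against the published digest before installing it. The sketch below shows one way to do that; the helper names are hypothetical, and it assumes you have copied the published hex digest from the release page.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large bundles are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare the file's digest with a published hex digest.

    Surrounding whitespace and letter case in `published_hex` are ignored.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

On most Linux systems, `sha256sum <file>` prints the same digest if you prefer to compare it by hand.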
The dashboard walks through setup when something is missing:
- Whisper model: downloads the multilingual ggml-small.bin model into the app data directory.
- Ollama: installs the official local app when needed, or lets users keep an existing Homebrew/custom Ollama service with one clear confirmation. Linux treats existing local Ollama services as unverified unless the user explicitly opts in.
- Gemma 3 4B: pulls the default local rewrite model through Ollama.
macOS also requires Accessibility permission for global modifier-key recording
and auto-paste. Auto-paste is currently macOS-only and sends a native Cmd+V
keyboard sequence to the app that was focused when recording started. Windows
and Linux paste synthesis is not implemented yet. Microphone permission is
requested by the operating system on first use.
- Launch Zerm.
- Press the hotkey to start recording.
- Speak naturally.
- Press the hotkey again or stop talking and let silence detection finish.
- Paste the copied result wherever you were working.
Prompt modes:
| Mode | Output |
|---|---|
| Off | Raw transcript with conservative cleanup |
| Developer | A clear instruction for a coding agent |
| Chat | Short casual message |
| Pro | Polished long-form prose |
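To make the modes in the table concrete, here is a rough sketch of how a rewrite request to a local Ollama server could be assembled. It is not Zerm's code. The instruction strings only paraphrase the table, the `gemma3:4b` tag is an assumed model name, and skipping the model entirely in Off mode is an assumption. The `/api/generate` endpoint and its `model`, `system`, `prompt`, and `stream` fields come from Ollama's HTTP API.

```python
import json
import urllib.request
from typing import Optional

OLLAMA_GENERATE_URL = "http://127.0.0.1:11434/api/generate"
MODEL_TAG = "gemma3:4b"  # assumed tag for the default rewrite model

# Hypothetical instructions that paraphrase the prompt mode table.
MODE_INSTRUCTIONS = {
    "Developer": "Rewrite the transcript as a clear instruction for a coding agent.",
    "Chat": "Rewrite the transcript as a short casual message.",
    "Pro": "Rewrite the transcript as polished long-form prose.",
}


def build_rewrite_request(mode: str, transcript: str) -> Optional[urllib.request.Request]:
    """Build (but do not send) a rewrite request for the given mode.

    Returns None for "Off", where this sketch assumes no model call is made.
    """
    if mode == "Off":
        return None
    if mode not in MODE_INSTRUCTIONS:
        raise ValueError(f"unknown prompt mode: {mode!r}")
    payload = {
        "model": MODEL_TAG,
        "system": MODE_INSTRUCTIONS[mode],
        "prompt": transcript,
        "stream": False,  # ask for one JSON object instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; with `stream` set to false, Ollama replies with a single JSON object whose `response` field holds the generated text.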
Zerm is designed around local processing.
- No accounts.
- No telemetry.
- No hosted transcription service.
- No cloud LLM calls from Zerm.
- Dictation history is off by default.
- Clearing or disabling history also erases the backup state file.
- Local Ollama access is checked before transcripts are sent to 127.0.0.1:11434. macOS and Windows verify the official app/publisher where supported; Linux treats an existing local listener as unverified unless the user explicitly opts in. When a local service exists but cannot be fully verified, Zerm offers a simple choice: install the official app or keep using the existing local Ollama.
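One small part of keeping transcripts local is refusing any endpoint that is not a loopback address. The sketch below illustrates only that address check, under the assumption that the endpoint is given as a URL. It does not attempt the app or publisher verification described above, and the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_endpoint(url: str) -> bool:
    """Return True only for http URLs whose host is a literal loopback IP.

    Hostnames are rejected because their meaning depends on the resolver.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme != "http" or host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, for example "example.com".
        return False
```

A check like this accepts `http://127.0.0.1:11434` and `http://[::1]:11434`, and rejects LAN addresses and lookalike hostnames such as `127.0.0.1.example.com`.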
First-run setup does make network requests to download required model and installer assets:
| Destination | Purpose |
|---|---|
| huggingface.co | Whisper model download |
| api.github.com / github.com | Ollama release metadata and installer assets |
| Ollama model registry | Gemma model pull through the local Ollama service |
Downloaded setup assets are bounded and hash/signature checked where the app can verify them. Release builds are also checked by CI before publishing.
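As an illustration of what a bounded, hash-checked download can look like, the sketch below copies a stream to a sink while enforcing a byte limit and verifying a SHA-256 digest at the end. It assumes that "bounded" refers to a size cap; the names, exception types, and chunk size are hypothetical and are not taken from Zerm's source.

```python
import hashlib
import hmac
from typing import BinaryIO


class DownloadTooLarge(Exception):
    """Raised when a stream yields more bytes than the caller allows."""


class ChecksumMismatch(Exception):
    """Raised when the streamed bytes do not hash to the expected digest."""


def copy_bounded(
    source: BinaryIO,
    sink: BinaryIO,
    max_bytes: int,
    expected_sha256: str,
    chunk_size: int = 64 * 1024,
) -> int:
    """Copy `source` to `sink`, enforcing a size cap and a SHA-256 digest.

    Returns the number of bytes written. On either exception the caller
    should discard whatever was already written to `sink`.
    """
    digest = hashlib.sha256()
    written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        written += len(chunk)
        if written > max_bytes:
            raise DownloadTooLarge(f"stream exceeded {max_bytes} bytes")
        digest.update(chunk)
        sink.write(chunk)
    if not hmac.compare_digest(digest.hexdigest(), expected_sha256.strip().lower()):
        raise ChecksumMismatch("SHA-256 digest does not match the expected value")
    return written
```

Checking the size while streaming means an oversized response is rejected before it fills the disk, and the digest is computed in the same pass.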
Prerequisites:
- Node.js 22 or newer
- pnpm 10.33.0 through Corepack
- Rust stable
- Tauri system dependencies for your platform
- CMake
- Ollama, if you want local rewrite modes during development
macOS:
brew install cmake ollama
corepack enable
corepack prepare pnpm@10.33.0 --activate
pnpm install
pnpm tauri dev
Ubuntu 22.04+:
sudo apt-get update
sudo apt-get install -y \
libwebkit2gtk-4.1-dev \
libayatana-appindicator3-dev \
librsvg2-dev \
libgtk-3-dev \
libsoup-3.0-dev \
libjavascriptcoregtk-4.1-dev \
libasound2-dev \
libxdo-dev \
cmake \
build-essential
corepack enable
corepack prepare pnpm@10.33.0 --activate
pnpm install
pnpm tauri dev
Build a bundle:
pnpm tauri build

| Path | Purpose |
|---|---|
| src-tauri/src/lib.rs | Tauri commands, app lifecycle, setup, recording pipeline |
| src-tauri/src/audio.rs | Microphone capture and audio utilities |
| src-tauri/src/whisper.rs | Whisper model loading and transcription |
| src-tauri/src/ollama.rs | Local Ollama identity checks and rewrite requests |
| src-tauri/src/state.rs | Settings, history, stats, persistence |
| dashboard.html | Main dashboard markup |
| src/dashboard.ts | Dashboard behavior and setup flows |
| src/styles.css | App UI styling |
| docs/ | GitHub Pages landing page |
| assets/ | Repository-facing logo assets |
Run the same checks used by CI:
pnpm typecheck
pnpm build
pnpm audit --prod
cargo fmt --manifest-path src-tauri/Cargo.toml --check
cargo check --manifest-path src-tauri/Cargo.toml --all-targets
cargo test --manifest-path src-tauri/Cargo.toml --lib
cargo clippy --manifest-path src-tauri/Cargo.toml --all-targets --all-features -- -D warnings
cargo audit --file src-tauri/Cargo.lock --deny warnings \
--ignore RUSTSEC-2024-0370 \
--ignore RUSTSEC-2024-0411 \
--ignore RUSTSEC-2024-0412 \
--ignore RUSTSEC-2024-0413 \
--ignore RUSTSEC-2024-0414 \
--ignore RUSTSEC-2024-0415 \
--ignore RUSTSEC-2024-0416 \
--ignore RUSTSEC-2024-0417 \
--ignore RUSTSEC-2024-0418 \
--ignore RUSTSEC-2024-0419 \
--ignore RUSTSEC-2024-0420 \
--ignore RUSTSEC-2024-0429 \
--ignore RUSTSEC-2025-0057 \
--ignore RUSTSEC-2025-0075 \
--ignore RUSTSEC-2025-0080 \
--ignore RUSTSEC-2025-0081 \
--ignore RUSTSEC-2025-0098 \
--ignore RUSTSEC-2025-0100 \
--ignore RUSTSEC-2026-0097
cargo fmt, cargo clippy, and cargo audit require rustfmt, clippy, and cargo-audit to be installed for your Rust toolchain.
rustup component add rustfmt clippy
cargo install cargo-audit --locked
The RustSec audit line matches the release workflow's current ignore list for known advisories.
Native writing-layer checks for Accessibility, app signing, auto-paste, the full-screen pill, and platform paste support live in docs/native-writing-layer-verification.md. The helper script is read-only by default:
scripts/verify-native-writing-layer.sh
Use strict mode for release artifacts:
scripts/verify-native-writing-layer.sh --app /Applications/Zerm.app --strict-release
Issues and pull requests are welcome.
Before opening a PR:
- Keep changes focused and explain the user-facing behavior.
- Add or update tests for persistence, privacy, setup, or platform behavior.
- Run the verification commands above.
- Include screenshots or short recordings for UI changes.
- Note any platform you could not test.
Good first areas:
- Platform-specific hotkey improvements.
- Linux and Windows setup recovery.
- Accessibility and keyboard navigation.
- Additional local prompt modes.
- Documentation for distro-specific Linux dependencies.
Releases are driven by tags.
git tag v0.1.0-alpha.16
git push origin v0.1.0-alpha.16
The release workflow runs preflight checks, creates a draft GitHub Release, builds platform artifacts, uploads them, and publishes only after every matrix job succeeds.
Release tags, including prerelease tags such as v0.1.0-alpha.16, require
Apple and Windows signing secrets in the GitHub repository. Linux release
artifacts are published with SHA-256 checksums rather than platform signing.
The website deploys separately from docs/ on pushes to the Production
branch or through manual workflow dispatch.
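Since releases are driven by tags such as v0.1.0-alpha.16, a quick local check of the tag shape can catch typos before pushing. The sketch below is a hypothetical helper, not part of the release workflow. It assumes tags follow a `v<major>.<minor>.<patch>` pattern with an optional prerelease suffix, inferred from the example tag.

```python
import re
from typing import NamedTuple, Optional

# Assumed tag shape, inferred from the example tag v0.1.0-alpha.16.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?")


class ReleaseTag(NamedTuple):
    major: int
    minor: int
    patch: int
    prerelease: Optional[str]  # e.g. "alpha.16", or None for a stable tag


def parse_release_tag(tag: str) -> Optional[ReleaseTag]:
    """Parse a release tag, returning None if it does not fit the pattern."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    major, minor, patch, prerelease = match.groups()
    return ReleaseTag(int(major), int(minor), int(patch), prerelease)
```

For example, `parse_release_tag("v0.1.0-alpha.16")` yields a prerelease of `alpha.16`, while a tag without the leading `v` is rejected.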
- Push-to-talk style modifier hooks for Windows and Linux.
- Faster streaming transcription and rewrite feedback.
- Optional encrypted history storage.
- User-defined prompt mode templates.
- Richer release provenance and public checksums.
Zerm is released under the MIT License.
Built by Arcusis.
