dictate¶

Privacy-first macOS voice dictation. Local Whisper on-device, with optional LLM cleanup through Ollama (local) or OpenRouter (cloud). No telemetry, no account, no network calls beyond first-run model downloads.

Local-by-default MIT licensed No telemetry

Status

v0.1.0 initial public release. AI code review (sebastionAI) and Bandit run on every PR. OSSF Scorecard runs on push to main and on a weekly schedule. Branch protection requires code-owner review. See the Roadmap and Changelog.

Why dictate?¶

Local by default. On-device Whisper ASR. The default cleanup pipeline ships with the LLM disabled: you get raw transcription + smart punctuation, no network calls after the first-run model download. Flip the toggle in the WebUI any time.
Two opt-in backends. When you want LLM cleanup, choose:
- Ollama: runs entirely on your Mac. Auto-picks the best installed model (≥3B), falls back if your configured one isn't pulled.
- OpenRouter: single key, hundreds of models. Use only if you want to.
Hotkey driven. Hold, tap or double-tap your chosen key. Transcripts are pasted into the focused app via the system pasteboard.
Developer ergonomics. Per-app vocab presets, voice commands, code- grammar mode, automatic secret redaction (API keys, tokens, AWS keys).
Built-in WebUI at http://127.0.0.1:47843, dashboard with sparkline, stats with percentile latencies, full history with search/export, settings with one-click toggles. Loopback only, CSRF protected, dark-mode aware.
Open source, hardened. MIT licensed, pinned-SHA GitHub Actions, AI code review on every PR, OSSF Scorecard, no telemetry. Read the code.

Quick start¶

git clone https://github.com/lewiswigmore/macOS-dictate.git ~/dictate
cd ~/dictate
./install.sh
dictate

Open the WebUI dashboard at http://127.0.0.1:47843 once the menu-bar app is running. See Install for full setup, Permissions for the macOS perms you'll grant (Accessibility, Microphone, Input Monitoring) and First run for the onboarding wizard.

What's in the box¶

Surface	What it does
Menu-bar app	Hotkey state machine, HUD, recorder, ASR, paste insertion
WebUI	Dashboard, stats, history, settings, all loopback-only
Voice commands	In-utterance editing: `period`, `new line`, `delete that`, etc.
Presets	Per-app vocab (`code`, `chat`, `prose`) auto-applied by frontmost app
Doctor	`dictate doctor` health-checks models, permissions, backends
CLI	`dictate`, `dictate doctor`, `dictate restart`, `dictate-web`

What it isn't¶

Not a meeting transcriber. Built for short, hotkey-triggered insertion.
Not a replacement for Apple's Dictation if basic speech-to-text is enough and you trust their servers.
Not a marketplace app. dictate is self-hosted by design. Clone, install, run. If you want a packaged .app for personal use, see Build .app.

Architecture at a glance¶

flowchart LR
    Hotkey --> Recorder --> VAD --> Whisper
    Whisper --> Commands{"Voice cmd?"}
    Commands -- yes --> Typer
    Commands -- no --> Redact --> Cleanup["Cleanup (off by default)"]
    Cleanup --> Typer --> App["Focused app"]
    Whisper --> History
    History --> WebUI["WebUI dashboard / stats / history"]
    History --> Learn

See the full architecture diagram and threat model.

Security & supply chain¶

All GitHub Actions pinned to commit SHAs, hardened runner egress audit.
AI code review (sebastionAI), Bandit and pip-audit run on every PR. OSSF Scorecard runs on push to main and on a weekly schedule.
Secret-scanning + push protection enabled.
Branch protection: required code-owner review, dismiss stale, linear history, conversation resolution, enforce_admins: true.
Loopback-only WebUI with custom-header CSRF defence (X-Dictate-WebUI: 1 on mutating requests), strict CSP (frame-ancestors 'none'), per-request DICTATION prompt-fence nonce, symlink-aware chmod on history file.

Report security issues via GitHub Security Advisories, not public issues.