WhisperKey
Hold a key to speak. Release to transcribe.
Local voice input for macOS — offline, free, no subscriptions. Optional on-device or cloud post-processing to clean up your speech into polished writing.
📖 View the Simplified Chinese documentation
Most macOS dictation tools are either online-only or expensive:
| | WhisperKey | SuperWhisper | Wispr Flow | macOS Dictation |
|---|---|---|---|---|
| Free & open source | ✅ | ❌ ($250 lifetime) | ❌ ($15/mo) | ✅ |
| Fully offline STT | ✅ | ✅ | ❌ | ❌ |
| Chinese/English mixed | ✅ | ✅ | ✅ | |
| Voice cleanup (filler removal, re-writing) | ✅ | ✅ | ✅ | ❌ |
| Custom word replacements | ✅ | ❌ | | |
| Token usage dashboard | ✅ | ❌ | ❌ | ❌ |
| Customizable hotkeys | ✅ | ✅ | ❌ | ❌ |
| Direct `.app` download | ✅ | ✅ | ✅ | n/a |
WhisperKey keeps transcription on your Mac using faster-whisper. The core dictation flow stays local-first; optional OpenAI-powered cleanup and correction can be enabled with your own API key.
- Hold-to-talk — hold Right Option ⌥ to record, release to transcribe
- Hands-free mode — Right Option ⌥ + Right Command ⌘ toggles continuous recording
- 90+ languages with Chinese/English mixed handling
- Fully offline STT via faster-whisper — no internet after the first model download
- Auto-paste directly into the active app
- VoiceInput pill overlay — compact, unobtrusive visual feedback for recording, transcribing, and result states
- Voice Cleanup — removes "um", "uh", fillers, repetition, and rewrites rambling speech into clean prose
- ASR Correction — fixes homophones, punctuation, and obvious transcription errors on short texts
- Custom prompt — bring your own instruction for domain-specific processing
- Output Language — keep original, translate to English, or translate to Chinese after processing
- Uses your own OpenAI API key, stored in macOS Keychain (never committed)
- Settings GUI with 5 tabs: General, Voice, Word Fix, Usage, Advanced
- Menu bar app — at-a-glance status, pause/resume service, quick Settings access
- Word Replacements dictionary: map `cloude → Claude`, `gpt → GPT`, and similar corrections automatically
- Token usage dashboard: track OpenAI consumption (today / this week / all time) and disk footprint
- Microphone picker — select any connected input device
- Fully customizable hotkeys — hold key and hands-free combo
- Launch at Login toggle (managed via macOS LaunchAgent)
- Bilingual UI (zh / en) throughout setup, Settings, and menu bar
- Graceful fallback — if the cloud request fails or times out, raw transcript is pasted instead
- macOS 12 Monterey or later (Apple Silicon recommended)
- Python 3.10+ (if installing from source; not needed for the packaged `.app`)
- Microphone
- System permissions: Input Monitoring + Accessibility
- (Optional) OpenAI API key for post-processing
Grab WhisperKey-macOS-arm64-v3.0.1.zip from the Releases page, unzip, and move WhisperKey.app to /Applications.
On first launch, grant the two macOS permissions:
- Input Monitoring — lets WhisperKey detect the hotkey
- Accessibility — lets WhisperKey paste text into the active app
This build is locally signed but not notarized by Apple. If macOS blocks the first launch, right-click WhisperKey.app → Open → confirm.
The first transcription downloads the selected Whisper model from HuggingFace (internet required once). After that, transcription runs fully offline.
```bash
pip install git+https://github.com/Phat-Po/whisperkey-mac.git
```

Or clone for development:

```bash
git clone https://github.com/Phat-Po/whisperkey-mac.git
cd whisperkey-mac
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
```

macOS tip: if `python3 -V` is below 3.10, use a Homebrew Python explicitly: `python3.12 -m venv .venv`.

Cache behavior: reinstalling or rebuilding a venv does not re-download an already cached model. Model files under `~/.cache/huggingface/hub` are reused unless deleted manually.

```bash
whisperkey
```

An interactive setup wizard guides you through:
- UI language — English or 中文
- Transcription language — English / Chinese / Mixed / Other
- Whisper model — base / small / large-v3-turbo
- Hotkeys — defaults or your own
- System permissions — guided walkthrough
- AI post-processing (optional) — pick a mode and save your OpenAI API key to Keychain
WhisperKey runs in the background as a menu bar app — no window needed.
| Action | Hotkey |
|---|---|
| Start recording | Hold Right Option ⌥ |
| Stop and transcribe | Release Right Option ⌥ |
| Toggle hands-free mode | Right Option ⌥ + Right Command ⌘ |
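
The hotkey behaviour in the table can be modelled as a small state machine. The sketch below is an illustration only, not WhisperKey's actual `keyboard_listener.py`: the `HotkeyState` class, its method names, and the returned `"start"` / `"stop"` actions are hypothetical, and keys are plain pynput-style name strings so the logic runs without pynput installed.

```python
from typing import Optional


class HotkeyState:
    """Hypothetical hold-to-talk / hands-free state machine (illustrative)."""

    def __init__(self, hold_key="alt_r", handsfree_keys=("alt_r", "cmd_r")):
        self.hold_key = hold_key
        self.handsfree = frozenset(handsfree_keys)
        self.pressed = set()
        self.recording = False
        self.hands_free = False
        self._combo_latched = False  # prevents re-toggling while the combo is held

    def on_press(self, key: str) -> Optional[str]:
        self.pressed.add(key)
        if self.handsfree <= self.pressed:
            if self._combo_latched:
                return None  # key-repeat event for a combo already handled
            self._combo_latched = True
            self.hands_free = not self.hands_free
            if self.hands_free:
                started = not self.recording
                self.recording = True
                return "start" if started else None
            self.recording = False
            return "stop"
        if key == self.hold_key and not self.recording:
            self.recording = True
            return "start"
        return None

    def on_release(self, key: str) -> Optional[str]:
        self.pressed.discard(key)
        if not (self.handsfree <= self.pressed):
            self._combo_latched = False
        # In hands-free mode, releasing the hold key must not stop recording.
        if key == self.hold_key and self.recording and not self.hands_free:
            self.recording = False
            return "stop"
        return None
```

In a real listener, the `on_press` / `on_release` callbacks of a `pynput.keyboard.Listener` would translate each key event into its name and feed it to an object like this one.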
After local transcription, WhisperKey can optionally pipe the result through OpenAI for cleanup. Three modes are available in Settings → Voice → Processing Mode:
| Mode | What it does | Best for | Recommended timeout |
|---|---|---|---|
| Disabled | Pastes raw Whisper output | Fastest; no cloud calls | — |
| ASR Correction | Fixes homophones, missing punctuation, obvious transcription errors. Minimal rewriting. | Short phrases, command-style input, technical terms | 3 sec |
| Voice Cleanup ⭐ | Removes filler words (um / uh / 就是 / 那個), deduplicates hesitation, reorganizes rambling thoughts into clean prose. Preserves all specifics (numbers, names, constraints). | Longer messages, notes, drafting emails / docs | 8 sec |
| Custom | Runs your own system prompt | Domain-specific rewriting (formal, code, translation styles) | 8 sec |
All modes gracefully fall back to the raw transcript on timeout or API error.
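
The fallback rule can be sketched in a few lines. This is a minimal illustration rather than WhisperKey's actual pipeline: `postprocess_with_fallback` and the `cleanup` callable are hypothetical names, and the OpenAI request is stood in for by any function that takes and returns text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def postprocess_with_fallback(
    raw: str, cleanup: Callable[[str], str], timeout_s: float
) -> str:
    """Return cleanup(raw), or raw itself if cleanup fails or is too slow."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(cleanup, raw)
    try:
        # Raises a timeout error if the result is not ready in time, and
        # re-raises any exception thrown inside cleanup().
        result = future.result(timeout=timeout_s)
    except Exception:
        return raw  # timeout, network failure, API error: paste the raw text
    finally:
        pool.shutdown(wait=False)  # do not block the paste on a slow request
    return result or raw  # an empty response also falls back to the raw text
```

With the timeouts recommended in the table, a call might look like `postprocess_with_fallback(transcript, cleanup, timeout_s=8.0)` for Voice Cleanup or `timeout_s=3.0` for ASR Correction.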
WhisperKey lives in the macOS menu bar. Click the icon to access:
- Status line — running / paused / waiting for permissions
- Pause / Resume — temporarily stop hotkey listening without quitting (handy for games or screen recording)
- Settings… — opens the full Settings GUI
- Quit WhisperKey
The menu bar title updates live based on service state.
Open via Menu bar → Settings… — five tabs cover everything:
- Interface Language (zh / en)
- Transcription Language (Auto / zh / en / other ISO code)
- Output Language (match input / translate to English / translate to Chinese)
- Whisper Model (`base` / `small` / `large-v3-turbo`)
- Microphone: pick any connected input device (or system default)
- Launch at Login toggle
- Processing Mode (Disabled / ASR Correction / Voice Cleanup / Custom)
- Online Model (e.g. `gpt-5.4`, customizable)
- Timeout in seconds (recommended: 8 for Voice Cleanup, 3 for ASR Correction)
A personal dictionary that post-processes every transcript. Useful for brand names the STT model consistently mishears.
cloude → Claude
cloud ai → Claude AI
open ei eye → OpenAI
- One replacement per line
- Use `→` or `->`
- Case-insensitive, longest match wins
- Runs locally; no cloud call needed
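
The rules above (one per line, `→` or `->`, case-insensitive, longest match wins) could be implemented roughly as follows. This is a sketch under assumptions, not the project's code: `parse_rules` and `apply_rules` are hypothetical names, and matching on whole words via `\b` is a choice made here for illustration that may not suit unspaced Chinese text.

```python
import re


def parse_rules(text: str) -> dict[str, str]:
    """Parse 'wrong → right' or 'wrong -> right' lines, one rule per line."""
    rules = {}
    for line in text.splitlines():
        parts = re.split(r"\s*(?:→|->)\s*", line.strip(), maxsplit=1)
        if len(parts) == 2 and parts[0] and parts[1]:
            rules[parts[0]] = parts[1]
    return rules


def apply_rules(transcript: str, rules: dict[str, str]) -> str:
    """Apply rules case-insensitively, preferring the longest matching key."""
    if not rules:
        return transcript
    # Longest keys first, so the alternation tries "cloud ai" before "cloud".
    keys = sorted(rules, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in rules.items()}
    return pattern.sub(
        lambda m: lookup.get(m.group(0).lower(), m.group(0)), transcript
    )
```

For example, `apply_rules("I asked Cloude", parse_rules("cloude → Claude"))` returns `"I asked Claude"`.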
Live dashboard showing:
- OpenAI token consumption (input / output, today / this week / all time)
- Disk footprint — audio temp files + Whisper model cache paths
- Hold Key: any pynput key name (e.g. `alt_r`, `cmd_r`, `f13`)
- Handsfree Keys: comma-separated combo (e.g. `alt_r, cmd_r`)
- API Key: paste a new OpenAI key; stored in macOS Keychain
The Usage tab gives you transparent visibility into your OpenAI spend:
- Per-day, per-week, and lifetime input/output token counts
- Disk usage for audio temp files (`/tmp/whisperkey_mac/`)
- Disk usage for the Whisper model cache (`~/.cache/huggingface/hub/`)
- Refresh button for live updates
No analytics are sent anywhere — everything is read from local logs.
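
As a rough illustration of how such a dashboard can be computed from local data, the sketch below aggregates token counts and measures a directory's size. The record shape `(date, input_tokens, output_tokens)`, both function names, and the Monday-based definition of "this week" are assumptions made for the example; WhisperKey's actual log format is not documented here.

```python
from datetime import date, timedelta
from pathlib import Path


def summarize_tokens(records, today: date) -> dict:
    """Sum (date, input_tokens, output_tokens) records into three buckets."""
    week_start = today - timedelta(days=today.weekday())  # Monday of this week
    totals = {
        name: {"input": 0, "output": 0} for name in ("today", "week", "all_time")
    }
    for day, tokens_in, tokens_out in records:
        buckets = ["all_time"]
        if week_start <= day <= today:
            buckets.append("week")
        if day == today:
            buckets.append("today")
        for name in buckets:
            totals[name]["input"] += tokens_in
            totals[name]["output"] += tokens_out
    return totals


def dir_size_bytes(root) -> int:
    """Total size of regular files under root; 0 if the directory is missing."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return 0
    # Skip symlinks so that linked files are not counted twice.
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )
```

Calling `dir_size_bytes("~/.cache/huggingface/hub")` and `dir_size_bytes("/tmp/whisperkey_mac")` would give the two disk figures listed above.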
For advanced or scripted setups, config is stored at ~/.config/whisperkey/config.json:
```json
{
  "ui_language": "en",
  "transcribe_language": "auto",
  "output_language": "auto",
  "model_size": "small",
  "input_device": "",
  "hold_key": "alt_r",
  "handsfree_keys": ["alt_r", "cmd_r"],
  "auto_paste": true,
  "result_max_lines": 3,
  "online_prompt_mode": "disabled",
  "online_correct_enabled": false,
  "online_correct_provider": "openai",
  "online_correct_model": "gpt-5.4",
  "online_correct_timeout_s": 8.0,
  "online_prompt_custom_text": "",
  "word_replacements": {},
  "launch_at_login": false
}
```

Environment variables can override the stored settings, which is useful for LaunchAgents and CI:
| Variable | Overrides |
|---|---|
| `OPENAI_API_KEY` | Keychain-stored API key |
| `WHISPERKEY_MODEL` | `model_size` |
| `WHISPERKEY_COMPUTE_TYPE` | `compute_type` (default `int8`) |
| `WHISPERKEY_DEVICE` | `device` (default `cpu`) |
| `WHISPERKEY_LANGUAGE` | Whisper language hint |
| `WHISPERKEY_SAMPLE_RATE` | Recording sample rate |
| `WHISPERKEY_AUTO_PASTE` | `1` / `0` |
| `WHISPERKEY_RESULT_MAX_LINES` | HUD line cap |
| `WHISPERKEY_ONLINE_CORRECT` | `1` / `0` |
| `WHISPERKEY_ONLINE_CORRECT_MODEL` | OpenAI model name |
| `WHISPERKEY_ONLINE_PROMPT_MODE` | `disabled` / `asr_correction` / `voice_cleanup` / `custom` |
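
One plausible way to layer these sources is defaults first, then `config.json`, then environment variables. The sketch below assumes that precedence and covers only a handful of keys; `load_config`, `DEFAULTS`, and `ENV_OVERRIDES` are hypothetical names, not the contents of the project's `config.py`.

```python
import json
import os
from pathlib import Path

# A small subset of the documented keys, for illustration.
DEFAULTS = {
    "model_size": "small",
    "auto_paste": True,
    "result_max_lines": 3,
    "online_prompt_mode": "disabled",
}

# Environment variable -> (config key, converter from string).
ENV_OVERRIDES = {
    "WHISPERKEY_MODEL": ("model_size", str),
    "WHISPERKEY_AUTO_PASTE": ("auto_paste", lambda v: v == "1"),
    "WHISPERKEY_RESULT_MAX_LINES": ("result_max_lines", int),
    "WHISPERKEY_ONLINE_PROMPT_MODE": ("online_prompt_mode", str),
}


def load_config(path, env=None) -> dict:
    """Merge defaults, the JSON file (if present), and env overrides."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    path = Path(path).expanduser()
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    for var, (key, convert) in ENV_OVERRIDES.items():
        if var in env:
            config[key] = convert(env[var])  # environment wins over the file
    return config
```

Under these assumptions, `load_config("~/.config/whisperkey/config.json")` run with `WHISPERKEY_MODEL=base` set would report `model_size` as `base` regardless of the file's value.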
| Model | Size | Best for |
|---|---|---|
| `base` | ~141 MB | Low-end devices, speed priority |
| `small` | ~464 MB | Recommended ⭐ Balanced speed and accuracy |
| `large-v3-turbo` | ~1.5 GB | Highest accuracy |
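
For readers curious what local transcription with faster-whisper looks like, here is a minimal sketch. It is not the project's `transcriber.py`: `transcribe_file` and `join_segments` are hypothetical helpers, and the `cpu` / `int8` settings simply mirror the documented defaults for `WHISPERKEY_DEVICE` and `WHISPERKEY_COMPUTE_TYPE`.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate segment texts, keeping Whisper's own spacing."""
    return "".join(segment.text for segment in segments).strip()


def transcribe_file(path: str, model_size: str = "small", language=None) -> str:
    """Transcribe an audio file locally with faster-whisper."""
    # Imported lazily so join_segments() works without the dependency.
    from faster_whisper import WhisperModel

    # The model is downloaded on first use and served from the cache afterwards.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # language=None asks Whisper to auto-detect the spoken language.
    segments, _info = model.transcribe(path, language=language)
    return join_segments(segments)
```

Loading the model is the slow step, so a long-running app would normally create the `WhisperModel` once and reuse it for every recording.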
WhisperKey requires two macOS system permissions:
1. Input Monitoring — to detect your hotkeys → System Settings → Privacy & Security → Input Monitoring
2. Accessibility — to paste transcribed text into the active app → System Settings → Privacy & Security → Accessibility
Add the app printed by `whisperkey permissions` or `whisperkey help` to both lists and enable the toggle.
For source installs this is usually Python.app:
/opt/homebrew/Cellar/python@3.xx/x.x.x/Frameworks/Python.framework/Versions/3.xx/Resources/Python.app
For the packaged build, authorize WhisperKey.app.
Note: each packaged build has a different CDHash, so after upgrading the `.app` you must re-authorize both permissions.

```bash
whisperkey help
```

Automatically checks: process status · Accessibility · Input Monitoring · audio devices · model files · config
| Symptom | Fix |
|---|---|
| No response to hotkeys | Check Input Monitoring permission |
| Hands-free hotkey does not respond | Make sure only `/Applications/WhisperKey.app` is running, then check Input Monitoring + Accessibility |
| Transcription not pasting | Check Accessibility permission |
| Post-processing not applying | Re-run `whisperkey setup` or set `OPENAI_API_KEY`; check Settings → Voice → Processing Mode |
| `inject_path=applescript` in logs | Expected for Electron/web chat apps; it's the compatibility paste path |
| Upgraded `.app` stopped working | Re-authorize Input Monitoring + Accessibility (CDHash changed) |
```bash
tail -f /tmp/whisperkey.log                          # live logs
launchctl kickstart -k gui/$(id -u)/com.whisperkey   # restart service
```

🚀 Auto-start on Login (LaunchAgent setup)
The Settings GUI Launch at Login toggle manages this automatically. Manual setup (for source installs) is below:
```bash
# 1. Install locally (not on an external drive)
mkdir -p ~/Library/Application\ Support/whisperkey
python3 -m venv ~/Library/Application\ Support/whisperkey/venv
~/Library/Application\ Support/whisperkey/venv/bin/pip install git+https://github.com/Phat-Po/whisperkey-mac.git

# 2. Create LaunchAgent
cat > ~/Library/LaunchAgents/com.whisperkey.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.whisperkey</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/YOUR_USERNAME/Library/Application Support/whisperkey/venv/bin/python</string>
    <string>-m</string>
    <string>whisperkey_mac.supervisor</string>
  </array>
  <key>EnvironmentVariables</key>
  <dict>
    <key>WHISPERKEY_MODEL</key>
    <string>small</string>
    <key>PYTHONUNBUFFERED</key>
    <string>1</string>
  </dict>
  <key>KeepAlive</key>
  <false/>
  <key>RunAtLoad</key>
  <true/>
  <key>LimitLoadToSessionType</key>
  <string>Aqua</string>
  <key>WorkingDirectory</key>
  <string>/Users/YOUR_USERNAME/Library/Application Support/whisperkey</string>
  <key>StandardOutPath</key>
  <string>/tmp/whisperkey.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/whisperkey.log</string>
</dict>
</plist>
EOF
# Replace YOUR_USERNAME with your actual username

# 3. Register the service
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.whisperkey.plist
```

The LaunchAgent starts a crash supervisor, which launches the app, writes crash details to `/tmp/whisperkey-last-crash.log`, and sends a macOS notification on unexpected exit.
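
The supervisor pattern described here can be sketched as below. This is an illustration of the idea, not the code in `whisperkey_mac.supervisor`: `supervise` and `notify` are hypothetical names, the log line format is invented for the example, and the notification is sent through the standard `osascript` command.

```python
import subprocess
import sys
from datetime import datetime
from pathlib import Path


def notify(message: str) -> None:
    """Show a macOS notification via osascript; silently skipped elsewhere."""
    # The message is assumed to contain no double quotes.
    script = f'display notification "{message}" with title "WhisperKey"'
    try:
        subprocess.run(["osascript", "-e", script], check=False)
    except FileNotFoundError:
        pass  # osascript is only available on macOS


def supervise(cmd, crash_log, notifier=notify) -> int:
    """Run cmd once; on a non-zero exit, record details and notify."""
    proc = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    if proc.returncode != 0:
        Path(crash_log).write_text(
            f"{datetime.now().isoformat()} exit={proc.returncode}\n{proc.stderr}",
            encoding="utf-8",
        )
        notifier(f"WhisperKey exited unexpectedly (code {proc.returncode})")
    return proc.returncode


if __name__ == "__main__":
    # Hypothetical entry point: supervise the command given on the command line.
    sys.exit(supervise(sys.argv[1:], "/tmp/whisperkey-last-crash.log"))
```

Because the plist sets `KeepAlive` to false, launchd itself will not restart the process, which is why crash reporting is left to a wrapper of this kind.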
🛠️ Development
```bash
git clone https://github.com/Phat-Po/whisperkey-mac.git
cd whisperkey-mac
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
whisperkey          # run
whisperkey setup    # reconfigure
whisperkey help     # troubleshoot
```

```text
whisperkey_mac/
├── main.py               # Entry point, CLI routing
├── app_entry.py          # Menu bar app bootstrap
├── menu_bar.py           # Menu bar item + state sync
├── settings_window.py    # Settings GUI (5 tabs)
├── config.py             # Config loading/saving (JSON + env vars)
├── i18n.py               # zh/en string dictionary
├── keyboard_listener.py  # Hold-key + hands-free hotkey logic
├── audio.py              # Audio recording (sounddevice)
├── transcriber.py        # Whisper STT (faster-whisper)
├── online_correct.py     # Optional OpenAI post-processing pipeline
├── keychain.py           # macOS Keychain helpers for OpenAI API key
├── output.py             # Text injection (clipboard + focused-app paste)
├── overlay.py            # VoiceInput pill overlay
├── usage_log.py          # Token consumption tracking
├── launch_agent.py       # LaunchAgent install/uninstall helpers
├── setup_wizard.py       # Interactive terminal setup
└── help_cmd.py           # Troubleshooter
```
Packaging: packaging/macos/build_app.sh (PyInstaller + codesign) → packaging/macos/package_release.sh (zip for Releases).
Hotkey diagnostics: scripts/debug_raw_cgevent_tap.py logs raw macOS key events and helps verify whether a modifier+character combo reaches CGEventTap before pynput translates it.
MIT © 2026 Phat-Po
Built with faster-whisper · pynput · sounddevice · PyObjC
If this project helps you, consider giving it a ⭐ Star!