VoicePaste

Hold a hotkey, speak, release — your words are transcribed and pasted instantly.

Push-to-talk speech-to-text for your desktop. Powered by Groq Whisper API.

Download

Go to the latest release and download:

File                            Platform
VoicePaste-Setup-x.x.x.exe      Windows installer (recommended)
VoicePaste-x.x.x-portable.exe   Windows portable (no installation needed)
VoicePaste-x.x.x.dmg            macOS (Intel & Apple Silicon)
VoicePaste-x.x.x.AppImage       Linux
VoicePaste-x.x.x.deb            Debian / Ubuntu

After installing, you'll need a Groq API key from console.groq.com/keys.

Features

  • Push-to-talk — hold your hotkey, speak, release. Text is transcribed and pasted into any active application
  • Blazing fast — Groq's Whisper Large v3 Turbo processes speech in under a second
  • Multi-language — supports 50+ languages with automatic language detection and code-switching (mix languages naturally)
  • Smart audio processing — built-in noise filtering, compression, and gain boost for clearer transcription
  • Active window context (experimental) — detects your active app (VS Code, terminal, email) and hints the AI for better accuracy with domain-specific terms
  • Audio ducking — automatically lowers system volume while you speak
  • History — log of all transcriptions with full API metadata (processing time, detected language, audio duration)
  • Clipboard management — optionally restores your previous clipboard content after pasting
  • System tray — runs quietly in the background, always ready
  • Auto-start — launches with your operating system
  • 8 UI languages — English, Polish, German, Spanish, French, Chinese, Japanese, Portuguese
  • Settings export/import — backup and restore your configuration as JSON
  • Auto-updates — checks for new versions automatically via GitHub Releases
  • Cross-platform — Windows, macOS, and Linux
  • Privacy-first — your API key is encrypted locally. Audio is sent only to Groq's API, never stored remotely
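
To make the experimental active window context feature more concrete, here is a minimal sketch of how a window title could be turned into a short hint for the transcription model. The category names, keyword lists, and hint texts are all assumptions made for illustration; VoicePaste's actual detection logic may look quite different. A string like this could be sent in the optional `prompt` field that Whisper-style transcription APIs accept.

```typescript
// Hypothetical categories; the real app may distinguish other kinds of apps.
type AppCategory = "code-editor" | "terminal" | "email" | "unknown";

// Classify a window title with simple keyword matching.
// This is an assumed heuristic, not VoicePaste's actual implementation.
function classifyWindow(title: string): AppCategory {
  const t = title.toLowerCase();
  if (t.includes("visual studio code")) return "code-editor";
  if (/\b(terminal|powershell|bash|zsh)\b/.test(t)) return "terminal";
  if (/\b(outlook|gmail|thunderbird|mail)\b/.test(t)) return "email";
  return "unknown";
}

// Illustrative hint texts, one per category.
const HINTS: Record<AppCategory, string> = {
  "code-editor": "Dictation into a code editor; expect programming terms.",
  terminal: "Dictation into a terminal; expect shell commands and file paths.",
  email: "Dictation of an email; expect names and greetings.",
  unknown: "",
};

// Return a hint for the transcription request, or undefined when the
// active app is not recognized (so no prompt is sent at all).
function buildPromptHint(windowTitle: string): string | undefined {
  const hint = HINTS[classifyWindow(windowTitle)];
  return hint === "" ? undefined : hint;
}
```

Returning `undefined` for unrecognized apps keeps the request unchanged in the common case, which seems like a reasonable default for an experimental feature.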

How it works

Hold hotkey → Speak → Release hotkey → Text appears where your cursor is
  1. Press and hold your configured hotkey (default: Ctrl + Win on Windows, Ctrl + Cmd on macOS)
  2. Speak naturally — an overlay shows recording status
  3. Release the hotkey — the overlay switches to "Transcribing..."
  4. In under a second, the transcribed text is pasted at your cursor position
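
The four steps above can be read as a small state machine. The sketch below is one way to model it; the state and event names are hypothetical, and the "Recording..." label is an assumption (only "Transcribing..." is named in the steps).

```typescript
// Hypothetical state and event names for the push-to-talk flow.
type PttState = "idle" | "recording" | "transcribing";
type PttEvent = "hotkeyDown" | "hotkeyUp" | "transcriptReady" | "transcriptFailed";

// Pure transition function: events that do not apply in a state are ignored.
function nextState(state: PttState, event: PttEvent): PttState {
  switch (state) {
    case "idle":
      return event === "hotkeyDown" ? "recording" : "idle";
    case "recording":
      return event === "hotkeyUp" ? "transcribing" : "recording";
    case "transcribing":
      return event === "transcriptReady" || event === "transcriptFailed"
        ? "idle"
        : "transcribing";
  }
}

// Text shown in the overlay for each state; null means the overlay is hidden.
function overlayLabel(state: PttState): string | null {
  switch (state) {
    case "idle":
      return null;
    case "recording":
      return "Recording..."; // assumed wording
    case "transcribing":
      return "Transcribing...";
  }
}
```

Keeping the transition logic pure makes it easy to unit test without a GUI, for example by folding a list of events over `nextState`.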

VoicePaste captures audio from your microphone, runs it through a real-time Web Audio API pipeline (high-pass filter, compressor, gain boost), sends the result to Groq's Whisper API, and then simulates a paste to insert the transcribed text at your cursor.
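
For readers curious about the API step, here is a hedged sketch of how a transcription request to Groq could be assembled. The endpoint and model id follow Groq's OpenAI-compatible speech-to-text API as publicly documented; verify them against the current docs before relying on them. The function name, the options shape, and the `recording.webm` file name are assumptions for illustration and are not taken from VoicePaste's `groq.ts`.

```typescript
// Endpoint and model id per Groq's public docs; check before use.
const GROQ_TRANSCRIPTION_URL = "https://api.groq.com/openai/v1/audio/transcriptions";
const GROQ_MODEL = "whisper-large-v3-turbo";

// Hypothetical options shape for this sketch.
interface TranscriptionOptions {
  apiKey: string;
  language?: string; // ISO 639-1 code; omit to let the API detect the language
  prompt?: string;   // optional context hint for domain-specific terms
}

interface TranscriptionRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: FormData };
}

// Build (but do not send) a multipart request for a recorded audio blob.
function buildTranscriptionRequest(audio: Blob, opts: TranscriptionOptions): TranscriptionRequest {
  const form = new FormData();
  form.append("file", audio, "recording.webm"); // assumed file name and format
  form.append("model", GROQ_MODEL);
  // verbose_json includes metadata such as detected language and duration.
  form.append("response_format", "verbose_json");
  if (opts.language) form.append("language", opts.language);
  if (opts.prompt) form.append("prompt", opts.prompt);
  return {
    url: GROQ_TRANSCRIPTION_URL,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${opts.apiKey}` },
      body: form,
    },
  };
}
```

Sending it is then a matter of `const res = await fetch(req.url, req.init)` followed by `await res.json()`, with the transcript in the `text` field of the response. Separating request construction from sending keeps the builder testable without network access.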

Configuration

On first launch, VoicePaste guides you through setup:

Setting                                Description
Groq API Key                           Required. Get one at console.groq.com/keys
Hotkey                                 Key combination to hold while speaking (default: Ctrl + Win / Ctrl + Cmd)
Microphone                             Select your audio input device (with live level indicator)
Primary language                       Your main language for transcription
Spoken languages                       All languages you use; enables auto-detection when multiple are selected
Audio enhancement                      Toggle noise filtering and volume boost (on by default)
Keep clipboard                         Whether to leave transcribed text in clipboard after pasting
Active window context (Experimental)   Send active window title to improve transcription accuracy
Launch at startup                      Start VoicePaste with your OS
UI language                            Interface language (8 available)

All settings can be exported/imported as JSON for backup or sharing across machines.
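
As an illustration of what a JSON export/import round trip might involve, here is a minimal sketch. The `Settings` shape, its field names, and most default values are hypothetical (only the Windows hotkey default and audio enhancement being on by default come from the table above). The sketch also leaves the API key out of the exported object; whether VoicePaste does the same is not stated here.

```typescript
// Hypothetical settings shape; not VoicePaste's actual schema.
interface Settings {
  hotkey: string;
  primaryLanguage: string;
  spokenLanguages: string[];
  audioEnhancement: boolean;
  keepClipboard: boolean;
  activeWindowContext: boolean;
  launchAtStartup: boolean;
  uiLanguage: string;
}

const DEFAULT_SETTINGS: Settings = {
  hotkey: "Ctrl+Win",
  primaryLanguage: "en",       // assumed default
  spokenLanguages: ["en"],     // assumed default
  audioEnhancement: true,
  keepClipboard: false,        // assumed default
  activeWindowContext: false,  // assumed default
  launchAtStartup: false,      // assumed default
  uiLanguage: "en",            // assumed default
};

function exportSettings(settings: Settings): string {
  return JSON.stringify(settings, null, 2);
}

// Parse exported JSON, keep only known keys, and fall back to defaults
// for missing ones. Throws on malformed input or wrongly typed values.
function importSettings(json: string): Settings {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("Settings file must contain a JSON object");
  }
  const input = raw as Record<string, unknown>;
  const merged: Record<string, unknown> = { ...DEFAULT_SETTINGS };
  for (const key of Object.keys(DEFAULT_SETTINGS)) {
    const value = input[key];
    if (value === undefined) continue;
    const expected = DEFAULT_SETTINGS[key as keyof Settings];
    const ok = Array.isArray(expected)
      ? Array.isArray(value) && value.every((v) => typeof v === "string")
      : typeof value === typeof expected;
    if (!ok) throw new Error(`Invalid value for "${key}"`);
    merged[key] = value;
  }
  return merged as unknown as Settings;
}
```

Validating on import rather than trusting the file means a hand-edited or outdated export fails loudly instead of silently producing a broken configuration.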

Development

Prerequisites

  • Node.js >= 20
  • Windows, macOS, or Linux

Setup

git clone https://github.com/ksroga/voice-paste.git
cd voice-paste
npm install

Development mode

# Rebuild native modules for Electron, then run the app in dev mode (hot reload)
npm run rebuild:electron
npm run dev

# Linting, type checking, unit tests (no GUI needed)
npm run lint
npm run typecheck
npm run test:unit

Note: Native modules (uiohook-napi, better-sqlite3) are compiled per platform. Run npm run rebuild:electron after npm install to recompile them against the Node.js version bundled with Electron.

Project structure

src/
├── main/                  # Electron main process
│   ├── index.ts           # App lifecycle, IPC handlers, tray, hotkey flow
│   ├── hotkey.ts          # Global hotkey detection (uiohook-napi)
│   ├── groq.ts            # Groq Whisper API client
│   ├── paste.ts           # Clipboard + paste simulation (Windows/macOS/Linux)
│   ├── db.ts              # SQLite history storage
│   ├── config.ts          # Settings (JSON file + safeStorage encryption)
│   ├── overlay.ts         # Recording/transcribing overlay window
│   ├── sounds.ts          # Start/stop sound effects (cross-platform)
│   ├── audio-control.ts   # System volume ducking
│   ├── active-window.ts   # Active window detection (Windows/macOS/Linux)
│   └── updater.ts         # Auto-update via GitHub Releases
├── preload/
│   └── index.ts           # IPC bridge (contextBridge)
└── renderer/
    ├── src/
    │   ├── App.vue        # Root component + audio recording pipeline
    │   ├── components/    # HistoryView, SettingsView, OnboardingView, HotkeyRecorder
    │   ├── i18n/          # Translations (8 languages)
    │   └── languages.ts   # Whisper language options
    └── overlay.html       # Recording overlay (standalone)
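
To illustrate the kind of logic a module like paste.ts is responsible for, here is a minimal sketch of pasting text while optionally restoring the previous clipboard content. The function name, parameters, and the injected `simulatePaste` callback are hypothetical; the `TextClipboard` interface mirrors the `readText`/`writeText` methods of Electron's `clipboard` module so that a fake can be passed in tests.

```typescript
// Subset of Electron's clipboard API used by this sketch.
interface TextClipboard {
  readText(): string;
  writeText(text: string): void;
}

// Put `text` on the clipboard, trigger a paste, and restore the previous
// clipboard content unless the user chose to keep the transcription there.
function pasteText(
  text: string,
  clipboard: TextClipboard,
  simulatePaste: () => void, // hypothetical: sends the platform's paste shortcut
  keepInClipboard: boolean,
): void {
  const previous = clipboard.readText();
  clipboard.writeText(text);
  try {
    simulatePaste();
  } finally {
    // Restore even if the paste simulation throws.
    if (!keepInClipboard) clipboard.writeText(previous);
  }
}
```

In a real app the restore step would probably need a short delay so the target application has time to read the clipboard; that timing is left out here to keep the sketch synchronous and easy to test.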

Scripts

Command                 Description
npm run dev             Start in development mode (hot reload)
npm run build           Build for production
npm run package         Package for Windows (installer + portable)
npm run package:mac     Package for macOS (dmg)
npm run package:linux   Package for Linux (AppImage + deb)
npm run package:all     Package for all platforms
npm run lint            Run ESLint
npm run typecheck       Run TypeScript type checking
npm run test:unit       Run unit tests (Vitest)

Building

# Build, then run the package command for your target platform
npm run build
npm run package          # Windows
npm run package:mac      # macOS (must run on macOS)
npm run package:linux    # Linux

Output goes to dist/.

Releasing

Releases are automated via GitHub Actions. To publish a new version:

# 1. Update version in package.json
# 2. Update CHANGELOG.md
# 3. Commit and tag
git add -A
git commit -m "Release v0.1.0"
git tag v0.1.0
git push origin main --tags

GitHub Actions will build on Windows, macOS, and Linux, then publish a GitHub Release with all binaries.


Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Commit your changes
  4. Push to the branch (git push origin feature/my-feature)
  5. Open a Pull Request

For bugs or feature requests, please open an issue.

License

MIT © 2026 Konrad Sroga

About

Open-source voice-to-text for power users. Hold a hotkey, speak, text pastes instantly. Powered by Groq + Whisper. No subscription, bring your own API key.