autotalk — Hands-Free Voice Interface for Claude Code

Talk to your terminal. Claude Code hears you.

What It Does

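autotalk captures the microphone continuously, runs voice activity detection (VAD) to find speech, transcribes each utterance with a local speech-to-text (STT) model, and injects the text into Claude Code's terminal input. Everything runs locally; no cloud STT is involved.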
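The VAD step can be pictured as a small state machine that groups fixed-size audio frames into utterances. The sketch below is an illustration, not the code in autotalk.py: the function name `segment_utterances` and the thresholds `start_frames` and `end_frames` are assumptions. It takes any per-frame speech classifier, so it runs without audio hardware.

```python
from collections import deque
from typing import Callable, Iterable, Iterator

FRAME_MS = 30          # webrtcvad accepts 10, 20, or 30 ms frames
SAMPLE_RATE = 16000    # 16 kHz, 16-bit mono PCM
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per sample


def segment_utterances(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],
    start_frames: int = 3,   # assumed threshold: speech frames needed to start
    end_frames: int = 10,    # assumed threshold: silent frames needed to stop
) -> Iterator[bytes]:
    """Yield one bytes object per detected utterance."""
    pre_roll: deque = deque(maxlen=start_frames)  # most recent untriggered frames
    voiced: list = []
    silence_run = 0
    triggered = False

    for frame in frames:
        speech = is_speech(frame)
        if not triggered:
            pre_roll.append((frame, speech))
            # Start an utterance once the whole window is speech.
            if len(pre_roll) == start_frames and all(s for _, s in pre_roll):
                triggered = True
                voiced = [f for f, _ in pre_roll]
                pre_roll.clear()
                silence_run = 0
        else:
            voiced.append(frame)
            silence_run = 0 if speech else silence_run + 1
            # Close the utterance after enough consecutive silence.
            if silence_run >= end_frames:
                yield b"".join(voiced)
                voiced, triggered = [], False

    if triggered and voiced:
        yield b"".join(voiced)  # flush an utterance cut off by end of input
```

With webrtcvad installed, the classifier could be `vad = webrtcvad.Vad(2)` followed by `is_speech = lambda f: vad.is_speech(f, SAMPLE_RATE)`, where the aggressiveness level 2 is an arbitrary choice.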

Stack

  • Mic capture: sounddevice (PortAudio)
  • VAD: webrtcvad (Google WebRTC, C extension)
  • STT: faster-whisper (CTranslate2, Whisper base.en)
  • Injection: AppleScript keystroke/clipboard into active terminal
  • TTS (output): voxtral-mcp speak tool (Kokoro-82M) — Claude Code calls it directly
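For the STT entry in the stack above, a minimal sketch of calling faster-whisper might look like this. The helper names `load_model`, `transcribe`, and `join_segments` are hypothetical, and `device="cpu"` with `compute_type="int8"` is an assumption for machines without a GPU; the actual settings in autotalk.py may differ.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Join segment texts into a single line, skipping empty segments."""
    return " ".join(seg.text.strip() for seg in segments if seg.text.strip())


def load_model(name: str = "base.en"):
    """Load a Whisper model once; reuse it for every utterance."""
    # Imported lazily so the other helpers work without the package installed.
    from faster_whisper import WhisperModel

    # device and compute_type are assumptions, not settings from the project.
    return WhisperModel(name, device="cpu", compute_type="int8")


def transcribe(model, audio) -> str:
    """Transcribe a file path or a 16 kHz mono float32 NumPy array."""
    # model.transcribe returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(audio, language="en")
    return join_segments(segments)
```

Loading the model is the slow part, so a long-running listener would call `load_model()` once at startup and pass the result to `transcribe` for each utterance.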

Usage

# Start listening (Open Mic mode, paste delivery)
./run.sh

# Use specific mic
./run.sh --device 3

# Dry run — transcribe but don't inject
./run.sh --mode dry-run

# Better accuracy (slower)
./run.sh --model small.en

# Target specific app
./run.sh --target Terminal
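The flags above could be parsed with argparse roughly as follows. This is a guess at the shape of the CLI, not a copy of autotalk.py: the flag names come from the commands above, while the value types, the defaults, and the mode name `open-mic` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="autotalk")
    # Assumed to be an integer input-device index (as in `--device 3`).
    parser.add_argument("--device", type=int, default=None,
                        help="input device index (default: system default mic)")
    # "open-mic" is a hypothetical name for the default mode; only
    # "dry-run" is confirmed by the usage examples.
    parser.add_argument("--mode", default="open-mic",
                        help="e.g. dry-run to transcribe without injecting")
    parser.add_argument("--model", default="base.en",
                        help="Whisper model name, e.g. small.en")
    parser.add_argument("--target", default=None,
                        help="application to inject into, e.g. Terminal")
    return parser
```

For example, `build_parser().parse_args(["--mode", "dry-run"])` yields a namespace whose `mode` is `"dry-run"` and whose `model` falls back to `"base.en"`.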

Full Duplex Setup

  1. Terminal A: ./run.sh (autotalk listens)
  2. Terminal B: claude (Claude Code running)
  3. Talk → autotalk transcribes → pastes into Claude Code
  4. Claude Code responds → uses voxtral-mcp speak tool to read aloud
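The paste in step 3 could be done by copying the transcript to the clipboard and asking System Events to press Cmd+V in the target app. The sketch below shows one way to do that on macOS; the function names and the 0.2 second delay are assumptions, and the project's own injection code may work differently.

```python
import subprocess
from typing import List


def applescript_string(text: str) -> str:
    """Quote text as an AppleScript string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_osascript_command(target: str) -> List[str]:
    """Build an osascript command that focuses `target` and presses Cmd+V."""
    lines = [
        f"tell application {applescript_string(target)} to activate",
        "delay 0.2",  # assumed pause so the app is frontmost before pasting
        'tell application "System Events" to keystroke "v" using command down',
    ]
    command = ["osascript"]
    for line in lines:
        command += ["-e", line]  # each -e adds one line to the script
    return command


def paste_into(target: str, text: str) -> None:
    """Copy `text` to the clipboard, then paste it into `target` (macOS only)."""
    subprocess.run(["pbcopy"], input=text.encode("utf-8"), check=True)
    subprocess.run(build_osascript_command(target), check=True)
```

Going through the clipboard avoids typing the transcript one keystroke at a time, at the cost of overwriting whatever the user had copied.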

Files

  • autotalk.py — main script (mic → VAD → STT → inject)
  • run.sh — launcher (activates venv)
  • test_pipeline.py — component validation
  • .venv/ — Python 3.13 virtual environment

Requirements

  • macOS (AppleScript injection)
  • Python 3.11+
  • Microphone access (grant in System Settings > Privacy & Security > Microphone)
  • Accessibility permission for AppleScript keystroke injection (grant in System Settings > Privacy & Security > Accessibility)