diff --git a/README.md b/README.md
index a6cc5f5..d38968b 100644
--- a/README.md
+++ b/README.md
@@ -19,21 +19,32 @@ A lightweight, local push-to-talk dictation app for macOS using OpenAI's Whisper

 ## Installation

-### Prerequisites
+### Swift Version (Current - Recommended)
+
+The Swift version is faster, more native, and fully self-contained. No ffmpeg required!
+
+**Prerequisites:**
 - macOS 13.0+ (tested on macOS 15+)
+- Xcode Command Line Tools (for Swift compiler)
+- [uv](https://github.com/astral-sh/uv) package manager

-### Swift Version (Current)
+1. **Install prerequisites**:
+```bash
+# Install uv if you don't have it
+brew install uv

-The Swift version is faster, more native, and fully self-contained. No ffmpeg required!
+# Install Python 3.13 (required for bundling mlx-whisper)
+uv python install 3.13
+```

-1. **Clone and build**:
+2. **Clone and build**:
 ```bash
 git clone https://github.com/sayhar/dictation-app.git
 cd dictation-app
 ./build-swift.sh
 ```

-2. **Install the app**:
+3. **Install the app**:
 ```bash
 cp -R "dist/Swift Dictation.app" ~/Applications/
 open ~/Applications/"Swift Dictation.app"
@@ -41,9 +52,10 @@ open ~/Applications/"Swift Dictation.app"

 ### Python Version (Legacy)

-The original Python implementation.
+The original Python implementation. Use the Swift version above for better performance.

 **Prerequisites:**
+- macOS 13.0+
 - Python 3.12+
 - [uv](https://github.com/astral-sh/uv) package manager
 - ffmpeg
@@ -97,7 +109,7 @@ If the app doesn't request permissions automatically:
 | Medium | ~1.5GB | Slower | Very Good |
 | Large | ~3GB | Slowest | Best |

-Models are automatically downloaded to `~/.cache/whisper/` on first use.
+Models are automatically downloaded to `~/.cache/huggingface/` on first use.

 ## Technical Details

@@ -123,8 +135,21 @@ Standard Python keyboard libraries (like pynput) don't work properly in bundled

 ## Files

+### Swift Version
+- `Dictation/` - Swift source code directory
+  - `AppDelegate.swift` - Main app, menu bar UI, flow coordination
+  - `KeyboardMonitor.swift` - Right Command key detection
+  - `AudioRecorder.swift` - Audio recording via AVFoundation
+  - `TranscriptionService.swift` - Whisper transcription via Python subprocess
+  - `TextInjector.swift` - Clipboard and text injection handling
+- `build-swift.sh` - Build script for creating the app bundle
+- `Package.swift` - Swift Package Manager configuration
+
+### Python Version (Legacy)
 - `dictation.py` - Main application code
 - `setup.py` - py2app build configuration
+
+### Shared
 - `create_icon.py` - Icon generation script
 - `~/Library/Logs/Dictation.log` - Debug logs
 - `~/Library/Logs/Dictation_Transcripts.log` - Long transcriptions (>30s)
@@ -140,6 +165,11 @@ System Settings → General → Login Items → Add Dictation.app
 - Quit the app completely
 - Re-launch and grant permissions fresh

+**Permissions need to be re-granted after rebuild:**
+- This is expected behavior. Each rebuild changes the app's code signature
+- macOS requires re-granting Accessibility and Microphone permissions when the signature changes
+- Simply approve the permission prompts when they appear
+
 **Permissions show as "uv" or "Python":**
 - This is normal when running via `uv run`
 - Build with py2app for proper app attribution
@@ -149,6 +179,15 @@ System Settings → General → Login Items → Add Dictation.app
 - Try removing and re-adding the app to permissions
 - Check logs: `tail -f ~/Library/Logs/Dictation.log`

+**First transcription is slow:**
+- The Whisper model downloads on first use (~500MB for "small" model)
+- Subsequent transcriptions are much faster
+- You can monitor progress in the logs: `tail -f ~/Library/Logs/Dictation.log`
+
+**Build fails with "Python 3.13 not found":**
+- Install Python 3.13 via uv: `uv python install 3.13`
+- Ensure uv is installed: `brew install uv`
+
 ## License

 MIT
diff --git a/build-swift.sh b/build-swift.sh
index b21be7d..1addd1f 100755
--- a/build-swift.sh
+++ b/build-swift.sh
@@ -49,12 +49,16 @@ mkdir -p "${PYTHON_BUNDLE}/bin"
 mkdir -p "${PYTHON_BUNDLE}/lib"

 # Copy Python interpreter from uv's managed location
-UV_PYTHON="$HOME/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none"
-if [ -d "${UV_PYTHON}" ]; then
+# Find the latest Python 3.13.x installation
+UV_PYTHON=$(ls -d "$HOME/.local/share/uv/python"/cpython-3.13.*-macos-aarch64-none 2>/dev/null | sort -V | tail -1)
+if [ -n "${UV_PYTHON}" ] && [ -d "${UV_PYTHON}" ]; then
     # Copy just the bin and lib directories we need
     cp -R "${UV_PYTHON}/bin" "${PYTHON_BUNDLE}/"
     cp -R "${UV_PYTHON}/lib" "${PYTHON_BUNDLE}/"
-    echo "Copied Python 3.13 interpreter"
+    echo "Copied Python 3.13 interpreter from ${UV_PYTHON}"
+else
+    echo "ERROR: Python 3.13 not found. Install with: uv python install 3.13"
+    exit 1
 fi

 # Install ONLY mlx-whisper and its dependencies to a clean location
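
Note on the `build-swift.sh` hunk: the new lookup can be tried in isolation with a rough sketch like the one below. It assumes a throwaway directory instead of the real uv location; `find_latest_python` is a hypothetical helper name (the patch inlines the pipeline), and the `3.13.10` directory is made up purely to exercise the version sort.

```bash
#!/usr/bin/env bash
# Sketch of the interpreter lookup from the patched build-swift.sh.
# find_latest_python is a hypothetical name; the patch inlines this pipeline.

find_latest_python() {
    local root="$1"   # directory holding uv-managed interpreters
    # The glob stays outside the quotes so the shell expands it. ls errors are
    # silenced so a missing install produces empty output rather than noise.
    ls -d "$root"/cpython-3.13.*-macos-aarch64-none 2>/dev/null | sort -V | tail -1
}

# Demo against a temporary directory, not the real ~/.local/share/uv/python.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/cpython-3.13.5-macos-aarch64-none" \
         "$demo_root/cpython-3.13.10-macos-aarch64-none"

latest="$(find_latest_python "$demo_root")"
if [ -n "$latest" ] && [ -d "$latest" ]; then
    echo "Selected: $(basename "$latest")"
else
    echo "ERROR: Python 3.13 not found. Install with: uv python install 3.13" >&2
fi

# Remove the demo directory again.
rm -rf "$demo_root"
```

In this demo the version sort ranks the `3.13.10` directory above `3.13.5`, whereas a plain lexical sort would put `3.13.5` last. That appears to be the reason for `sort -V` in the patch.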