53 changes: 46 additions & 7 deletions README.md
@@ -19,31 +19,43 @@ A lightweight, local push-to-talk dictation app for macOS using OpenAI's Whisper

 ## Installation

-### Prerequisites
+### Swift Version (Current - Recommended)

+The Swift version is faster, more native, and fully self-contained. No ffmpeg required!
+
+**Prerequisites:**
 - macOS 13.0+ (tested on macOS 15+)
 - Xcode Command Line Tools (for Swift compiler)
+- [uv](https://github.com/astral-sh/uv) package manager

-### Swift Version (Current)
+1. **Install prerequisites**:
+   ```bash
+   # Install uv if you don't have it
+   brew install uv

-The Swift version is faster, more native, and fully self-contained. No ffmpeg required!
+   # Install Python 3.13 (required for bundling mlx-whisper)
+   uv python install 3.13
+   ```

-1. **Clone and build**:
+2. **Clone and build**:
    ```bash
    git clone https://github.com/sayhar/dictation-app.git
    cd dictation-app
    ./build-swift.sh
    ```

-2. **Install the app**:
+3. **Install the app**:
    ```bash
    cp -R "dist/Swift Dictation.app" ~/Applications/
    open ~/Applications/"Swift Dictation.app"
    ```

 ### Python Version (Legacy)

-The original Python implementation.
+The original Python implementation. Use the Swift version above for better performance.
+
+**Prerequisites:**
+- macOS 13.0+
 - Python 3.12+
 - [uv](https://github.com/astral-sh/uv) package manager
 - ffmpeg
@@ -97,7 +109,7 @@ If the app doesn't request permissions automatically:
 | Medium | ~1.5GB | Slower | Very Good |
 | Large | ~3GB | Slowest | Best |

-Models are automatically downloaded to `~/.cache/whisper/` on first use.
+Models are automatically downloaded to `~/.cache/huggingface/` on first use.

 ## Technical Details

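The cache location introduced in this hunk can be checked from a terminal before the first transcription. The following is a minimal sketch, assuming the `~/.cache/huggingface/` path stated in the README; `model_cache_status` is a hypothetical helper name and is not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): report whether the model cache
# directory exists and contains anything. The default path is the one named
# in the README; pass another directory as the first argument to override it.
model_cache_status() {
    local cache_dir="${1:-$HOME/.cache/huggingface}"
    if [ ! -d "$cache_dir" ]; then
        echo "missing"
        return 1
    fi
    # A directory with no entries means nothing has been downloaded yet.
    if [ -z "$(ls -A "$cache_dir")" ]; then
        echo "empty"
        return 1
    fi
    echo "populated"
}
```

Calling `model_cache_status` with no arguments checks the default location; `du -sh ~/.cache/huggingface` then shows how much disk space the downloaded models occupy.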
@@ -123,8 +135,21 @@ Standard Python keyboard libraries (like pynput) don't work properly in bundled

 ## Files

+### Swift Version
+- `Dictation/` - Swift source code directory
+  - `AppDelegate.swift` - Main app, menu bar UI, flow coordination
+  - `KeyboardMonitor.swift` - Right Command key detection
+  - `AudioRecorder.swift` - Audio recording via AVFoundation
+  - `TranscriptionService.swift` - Whisper transcription via Python subprocess
+  - `TextInjector.swift` - Clipboard and text injection handling
+- `build-swift.sh` - Build script for creating the app bundle
+- `Package.swift` - Swift Package Manager configuration
+
+### Python Version (Legacy)
 - `dictation.py` - Main application code
 - `setup.py` - py2app build configuration
+
+### Shared
 - `create_icon.py` - Icon generation script
 - `~/Library/Logs/Dictation.log` - Debug logs
 - `~/Library/Logs/Dictation_Transcripts.log` - Long transcriptions (>30s)
@@ -140,6 +165,11 @@ System Settings → General → Login Items → Add Dictation.app
 - Quit the app completely
 - Re-launch and grant permissions fresh

+**Permissions need to be re-granted after rebuild:**
+- This is expected behavior. Each rebuild changes the app's code signature
+- macOS requires re-granting Accessibility and Microphone permissions when the signature changes
+- Simply approve the permission prompts when they appear
+
 **Permissions show as "uv" or "Python":**
 - This is normal when running via `uv run`
 - Build with py2app for proper app attribution
@@ -149,6 +179,15 @@ System Settings → General → Login Items → Add Dictation.app
 - Try removing and re-adding the app to permissions
 - Check logs: `tail -f ~/Library/Logs/Dictation.log`

+**First transcription is slow:**
+- The Whisper model downloads on first use (~500MB for "small" model)
+- Subsequent transcriptions are much faster
+- You can monitor progress in the logs: `tail -f ~/Library/Logs/Dictation.log`
+
+**Build fails with "Python 3.13 not found":**
+- Install Python 3.13 via uv: `uv python install 3.13`
+- Ensure uv is installed: `brew install uv`
+
 ## License

 MIT
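The prerequisites and the "Python 3.13 not found" troubleshooting entry added in this README diff suggest a quick preflight check before running the build. A minimal sketch follows; `missing_tools` is a hypothetical helper, not something the repository provides, and the tool names in the usage line are only the ones the README lists.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of the repo): print every command
# from the argument list that is not on PATH, and succeed only if none are
# missing.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_tools uv swift git || echo "Install the tools listed above before running ./build-swift.sh"` prints the name of each absent tool followed by the hint.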
10 changes: 7 additions & 3 deletions build-swift.sh
@@ -49,12 +49,16 @@ mkdir -p "${PYTHON_BUNDLE}/bin"
 mkdir -p "${PYTHON_BUNDLE}/lib"

 # Copy Python interpreter from uv's managed location
-UV_PYTHON="$HOME/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none"
-if [ -d "${UV_PYTHON}" ]; then
+# Find the latest Python 3.13.x installation
+UV_PYTHON=$(ls -d "$HOME/.local/share/uv/python"/cpython-3.13.*-macos-aarch64-none 2>/dev/null | sort -V | tail -1)
+if [ -n "${UV_PYTHON}" ] && [ -d "${UV_PYTHON}" ]; then
     # Copy just the bin and lib directories we need
     cp -R "${UV_PYTHON}/bin" "${PYTHON_BUNDLE}/"
     cp -R "${UV_PYTHON}/lib" "${PYTHON_BUNDLE}/"
-    echo "Copied Python 3.13 interpreter"
+    echo "Copied Python 3.13 interpreter from ${UV_PYTHON}"
+else
+    echo "ERROR: Python 3.13 not found. Install with: uv python install 3.13"
+    exit 1
 fi

 # Install ONLY mlx-whisper and its dependencies to a clean location
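The version-selection step in this hunk can be exercised in isolation. Below is a minimal sketch that mirrors the glob plus `sort -V` approach without parsing `ls` output. `find_latest_uv_python` is a hypothetical helper, not code from the PR, and the optional base-directory argument is an assumption added so the function can be tried against a scratch directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): print the newest cpython-3.13.x
# directory under a base directory, or fail with the PR's error message.
find_latest_uv_python() {
    local base="${1:-$HOME/.local/share/uv/python}"
    local candidate latest
    latest=$(
        for candidate in "$base"/cpython-3.13.*-macos-aarch64-none; do
            # An unmatched glob stays literal, so keep only real directories.
            if [ -d "$candidate" ]; then
                printf '%s\n' "$candidate"
            fi
        done | sort -V | tail -n 1
    )
    if [ -z "$latest" ]; then
        echo "ERROR: Python 3.13 not found. Install with: uv python install 3.13" >&2
        return 1
    fi
    printf '%s\n' "$latest"
}
```

Version sort matters here because a plain lexical sort would rank `3.13.9` above `3.13.10`. In a build script the result could be captured with `UV_PYTHON=$(find_latest_uv_python) || exit 1`.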