AudioScribe

A simple command-line tool that transcribes audio files to Markdown and plain text using either the OpenAI Whisper API or a local Whisper model.

Features

Two transcription modes — cloud via OpenAI Whisper API, or fully offline with a local model
Automatic chunking — files over 25MB are split with ffmpeg and reassembled seamlessly
Dual output — generates both .txt (raw text) and .md (with metadata header) for every transcription
Broad format support — .m4a, .mp3, .mp4, .wav, .ogg, .flac
.env support — reads OPENAI_API_KEY from a .env file in the script directory
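
As an illustration of the .env feature, here is a minimal dependency-free loader. It is a sketch under assumptions, not the code in transcribe.py: the names `load_env_file` and `apply_env` are hypothetical, and only simple `KEY=VALUE` lines are handled.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blanks, # comments, and malformed lines."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(values: dict[str, str]) -> None:
    """Copy parsed values into os.environ without overriding existing variables."""
    for key, value in values.items():
        os.environ.setdefault(key, value)
```

A script could call `apply_env(load_env_file(Path(__file__).parent / ".env"))` at startup, so a key already exported in the shell takes precedence over the file.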

Requirements

Python 3.10+
ffmpeg (required for chunking large files and for local Whisper)

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg
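
To show why ffmpeg is needed for chunking, the sketch below outlines one possible way to split a large file with ffmpeg's segment muxer. It is an assumption about the approach, not necessarily what transcribe.py does: the helper names, the 24 MB safety margin, and the output naming pattern are hypothetical, and the duration is assumed to be known already (for example from ffprobe).

```python
import math
from pathlib import Path

# Hypothetical safety margin just below the 25MB threshold.
MAX_CHUNK_BYTES = 24 * 1024 * 1024


def segment_seconds(size_bytes: int, duration_s: float,
                    max_bytes: int = MAX_CHUNK_BYTES) -> int:
    """Seconds per chunk so that each chunk stays under max_bytes.

    Assumes a roughly constant bitrate across the file.
    """
    chunks = math.ceil(size_bytes / max_bytes)
    return max(1, math.floor(duration_s / chunks))


def ffmpeg_split_command(src: Path, out_dir: Path, seconds: int) -> list[str]:
    """Build (but do not run) an ffmpeg command that splits src into segments."""
    pattern = out_dir / f"{src.stem}_%03d{src.suffix}"
    return [
        "ffmpeg", "-i", str(src),
        "-f", "segment", "-segment_time", str(seconds),
        "-c", "copy",  # stream copy: split without re-encoding
        str(pattern),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. With stream copy the cut points are approximate, so actual chunk sizes may vary slightly.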

OpenAI API mode (default)

pip install openai

Local mode

pip install openai-whisper
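
For orientation, the two modes map onto two different library calls. The sketch below assumes the `openai` Python SDK (v1 or later) and the `openai-whisper` package; the wrapper functions are hypothetical, and transcribe.py may be organized differently. `whisper-1` is the hosted Whisper model name in the OpenAI API.

```python
from pathlib import Path

SUPPORTED = {".m4a", ".mp3", ".mp4", ".wav", ".ogg", ".flac"}


def transcribe_api(path: Path) -> str:
    """Cloud mode: needs `pip install openai` and OPENAI_API_KEY set."""
    from openai import OpenAI  # imported lazily so local mode works without it

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with path.open("rb") as audio:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio)
    return result.text


def transcribe_local(path: Path, model_size: str = "base") -> str:
    """Offline mode: needs `pip install openai-whisper` and ffmpeg on PATH."""
    import whisper  # imported lazily so API mode works without it

    model = whisper.load_model(model_size)
    return model.transcribe(str(path))["text"]


def transcribe(path: Path, local: bool = False, model_size: str = "base") -> str:
    """Validate the file extension, then dispatch to the chosen mode."""
    if path.suffix.lower() not in SUPPORTED:
        raise ValueError(f"Unsupported audio format: {path.suffix}")
    return transcribe_local(path, model_size) if local else transcribe_api(path)
```

Importing each library inside its own function means only the package for the mode in use has to be installed.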

Quick Start

# Clone the repo
git clone https://github.com/youruser/audioscribe.git
cd audioscribe

# (Optional) Create a .env file with your API key
echo "OPENAI_API_KEY=sk-..." > .env

# Transcribe with the OpenAI API
python transcribe.py recording.m4a

# Transcribe offline with a local model
python transcribe.py recording.m4a --local

# Use a larger local model for better accuracy
python transcribe.py recording.m4a --local --model medium

# Specify an output directory
python transcribe.py recording.m4a -o ./transcripts/

Usage

python transcribe.py <audio_file> [options]

| Option | Description |
| --- | --- |
| `--local` | Use a local Whisper model instead of the OpenAI API |
| `--model {tiny,base,small,medium,large}` | Local model size (default: `base`) |
| `--output-dir`, `-o` | Output directory (default: same as input file) |
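
The options above could be declared with `argparse` roughly as follows. This is a sketch that mirrors the documented flags and defaults; the function name `build_parser` is hypothetical, and the actual parser in transcribe.py may differ.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented CLI: one positional file plus three options."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribe an audio file to .txt and .md",
    )
    parser.add_argument("audio_file", type=Path, help="Path to the audio file")
    parser.add_argument("--local", action="store_true",
                        help="Use a local Whisper model instead of the OpenAI API")
    parser.add_argument("--model", default="base",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Local model size (default: base)")
    parser.add_argument("--output-dir", "-o", type=Path, default=None,
                        help="Output directory (default: same as input file)")
    return parser
```

Using `choices` makes argparse reject an unknown model size before any model is loaded.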

Model Size Reference

When using --local, larger models are more accurate but slower and use more memory:

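| Model | Parameters | Relative speed (vs. `large`) | English accuracy |
| --- | --- | --- | --- |
| `tiny` | 39M | ~32x | Good |
| `base` | 74M | ~16x | Better |
| `small` | 244M | ~6x | Great |
| `medium` | 769M | ~2x | Excellent |
| `large` | 1550M | 1x | Best |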

Output

Each transcription produces two files alongside the source audio (or in the directory specified with -o):

recording.m4a
recording.txt          # Raw transcription text
recording.md           # Markdown with metadata header

The Markdown file includes a header with the source filename and timestamp:

# Transcription: recording

**Source:** `recording.m4a`
**Transcribed:** 2026-04-14 10:30:00

---

The transcribed text appears here...
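
The layout shown above can be produced with a small helper. The following is a sketch that reproduces the documented header format; the function names `render_markdown` and `write_outputs` are hypothetical and not taken from transcribe.py.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def render_markdown(source: Path, text: str, when: datetime) -> str:
    """Build the Markdown document: title, metadata header, rule, body."""
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"# Transcription: {source.stem}\n\n"
        f"**Source:** `{source.name}`\n"
        f"**Transcribed:** {stamp}\n\n"
        "---\n\n"
        f"{text.strip()}\n"
    )


def write_outputs(source: Path, text: str,
                  out_dir: Optional[Path] = None) -> tuple[Path, Path]:
    """Write <stem>.txt and <stem>.md next to the source, or into out_dir."""
    target = out_dir if out_dir is not None else source.parent
    target.mkdir(parents=True, exist_ok=True)
    txt_path = target / f"{source.stem}.txt"
    md_path = target / f"{source.stem}.md"
    txt_path.write_text(text.strip() + "\n", encoding="utf-8")
    md_path.write_text(render_markdown(source, text, datetime.now()), encoding="utf-8")
    return txt_path, md_path
```

Keeping the rendering separate from the file writing makes the header format easy to check without touching the filesystem.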

License

MIT
