feat: multi-file upload with parallel transcription by shaulams · Pull Request #39 · shaulams/FieldCut

shaulams · 2026-04-11T10:42:58Z

Summary

Multi-file upload: Upload multiple interview recordings at once (different speakers/interviews) into a single project
Parallel transcription: Files are transcribed simultaneously via ThreadPoolExecutor (up to 4 at a time) for faster processing
Unified transcript: All transcripts merge into one view with time offsets, so clip selection and assembly work across all files
Backward compatible: Old projects with single source_file auto-migrate to source_files array on load
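
The auto-migration on load could look roughly like the sketch below. The PR text does not show the loader, so the function name `migrate_project` and the dict-based project shape are assumptions for illustration; only the `source_file` and `source_files` field names come from the description.

```python
from typing import Any


def migrate_project(project: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the project that always has a source_files list.

    Hypothetical helper: the real loader is not shown in this PR.
    The legacy source_file key is left in place for older code paths.
    """
    migrated = dict(project)
    if "source_files" not in migrated:
        legacy = migrated.get("source_file")
        # A project with no recorded source gets an empty list.
        migrated["source_files"] = [legacy] if legacy else []
    return migrated
```

Keeping the legacy key instead of deleting it means routes that still read `source_file` keep working while they are moved over.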

How it works

Upload zone now accepts multiple files (<input multiple>)
Backend saves all files, spawns parallel Whisper transcription threads
After all complete, transcripts merge with cumulative time offsets (file B starts where file A ends)
Each segment/word gets a source_index pointing to its source file
Clip cutting resolves the correct source file and real timestamps via resolve_source_for_clip()
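
A minimal sketch of the merge step described above, assuming each per-file transcript is a dict with a `duration` in seconds and a list of `segments` carrying `start`, `end`, and optional `words`. That shape and the name `merge_transcripts` are assumptions; only `source_index` and the cumulative-offset rule come from this PR.

```python
from typing import Any


def merge_transcripts(transcripts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge per-file transcripts onto one timeline (illustrative only)."""
    merged: list[dict[str, Any]] = []
    offset = 0.0  # where the current file starts on the unified timeline
    for source_index, transcript in enumerate(transcripts):
        for seg in transcript["segments"]:
            shifted = dict(seg)  # copy so the input is not mutated
            shifted["start"] = seg["start"] + offset
            shifted["end"] = seg["end"] + offset
            shifted["source_index"] = source_index
            shifted["words"] = [
                {
                    **word,
                    "start": word["start"] + offset,
                    "end": word["end"] + offset,
                    "source_index": source_index,
                }
                for word in seg.get("words", [])
            ]
            merged.append(shifted)
        # File B starts where file A ends.
        offset += transcript["duration"]
    return merged
```

With a 10-second file A and a 5-second file B, a segment at 1.0 to 3.0 in B lands at 11.0 to 13.0 in the merged view with `source_index` 1.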

Test plan

Upload a single file — verify transcription works exactly as before
Upload 2+ files — verify all get transcribed and transcript merges correctly
Mark clips across file boundaries — verify clip cutting produces correct audio
Load an old saved project — verify it loads without errors (backward compat)
Test with files that need compression (>25MB) — verify dynamic bitrate works
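
For the compression check, one plausible way to pick a bitrate is to divide the size budget by the recording length. The PR does not show the actual formula, so everything below (the function name, the 5% headroom, and the 16 to 128 kbps clamp) is an assumption meant only to make the test case concrete.

```python
def pick_bitrate_kbps(
    duration_s: float,
    size_limit_bytes: int = 25 * 1024 * 1024,  # the 25MB threshold from the test plan
    headroom: float = 0.95,                    # assumed margin for container overhead
    floor_kbps: int = 16,                      # assumed lower bound for intelligible speech
    ceil_kbps: int = 128,                      # assumed upper bound; more is wasted on speech
) -> int:
    """Pick an audio bitrate so the encoded file should fit the size limit."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    budget_bits = size_limit_bytes * 8 * headroom
    kbps = int(budget_bits / duration_s / 1000)
    # Note: when the floor wins, the output can still exceed the limit.
    return max(floor_kbps, min(kbps, ceil_kbps))
```

Under these assumptions a one-hour recording gets 55 kbps, while a one-minute recording is capped at 128 kbps.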

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Support multiple source files in project state alongside the existing single source_file field for backward compatibility. The resolver maps clip timestamps back to the correct source file for ffmpeg cutting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
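
The resolver idea can be sketched as follows. The real `resolve_source_for_clip()` signature is not shown in this PR, so `resolve_clip_sketch`, its arguments, and the choice to clamp a clip to the file that contains its start are all assumptions.

```python
from bisect import bisect_right
from itertools import accumulate


def resolve_clip_sketch(
    durations: list[float], clip_start: float, clip_end: float
) -> tuple[int, float, float]:
    """Map a clip on the unified timeline to (source_index, local_start, local_end).

    Illustrative only: a clip that runs past the end of its file is
    clamped to that file's duration.
    """
    if not durations:
        raise ValueError("at least one source file is required")
    # offsets[i] is where file i begins on the unified timeline.
    offsets = [0.0] + list(accumulate(durations))[:-1]
    index = bisect_right(offsets, clip_start) - 1
    index = max(0, min(index, len(durations) - 1))
    local_start = clip_start - offsets[index]
    local_end = min(clip_end - offsets[index], durations[index])
    return index, local_start, local_end
```

The returned local timestamps are what would be handed to ffmpeg together with the path of the file at `source_index`.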

Extract single-file transcription into reusable transcribe_single_file() and rewrite /transcribe to accept multiple files via getlist("file"). Files are transcribed in parallel using ThreadPoolExecutor (up to 4 workers), then merged with cumulative time offsets. Each segment and word gets a source_index field. Partial failures produce warnings without failing the whole job.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
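
A rough sketch of the parallel step with partial-failure handling. The worker is passed in as a callable standing in for `transcribe_single_file()`, whose real signature is not shown here; `transcribe_all` and the warning message format are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Optional


def transcribe_all(
    paths: list[str],
    transcribe_one: Callable[[str], Any],
    max_workers: int = 4,
) -> tuple[list[Optional[Any]], list[str]]:
    """Transcribe files in parallel, collecting warnings instead of failing.

    Results keep the upload order; a failed file leaves None in its slot.
    """
    results: list[Optional[Any]] = [None] * len(paths)
    warnings: list[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transcribe_one, p): i for i, p in enumerate(paths)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception as exc:  # one bad file must not abort the job
                warnings.append(f"{paths[index]}: {exc}")
    return results, warnings
```

Storing results by index rather than completion order matters here, because the merge step relies on file order to compute cumulative offsets.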

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Shaul Amsterdamski and others added 8 commits April 11, 2026 12:39

docs: multi-file upload design for parallel transcription

47787ca

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs: multi-file upload implementation plan

792cfe3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat: multi-file upload UI

6ce1f63

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat: backward compat for source_files in all routes

869589a

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat: unique speaker IDs across files and visual file dividers

74bf499

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix: combined waveform visualization for multi-file projects

62d10f3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

shaulams merged commit c1372e0 into main Apr 11, 2026
1 check passed

shaulams merged 8 commits into main from feature/multi-file-upload
