Add YouTube design concept extractor tool #432
Merged
wshobson merged 6 commits into wshobson:main on Feb 7, 2026
Extracts transcript, metadata, and keyframes from YouTube videos into a structured markdown reference document for agent consumption. Supports interval-based frame capture, scene-change detection, and chapter-aware transcript grouping.
https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
- Add --ocr flag with Tesseract (fast) or EasyOCR (stylized text) engines
- Add --colors flag for dominant color palette extraction via ColorThief
- Add --full convenience flag to enable all extraction features
- Include OCR text alongside each frame in markdown output
- Add Visual Text Index section for searchable on-screen text
- Export ocr-results.json and color-palette.json for reuse
- Run OCR in parallel with ThreadPoolExecutor for performance
https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
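The parallel OCR step named in this commit can be illustrated with a minimal sketch. This is not the tool's actual code: the function name `ocr_frames_parallel`, the `ocr_one` callback, and the `fake_ocr` stand-in are assumptions made so the example runs without Tesseract installed.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def ocr_frames_parallel(
    frames: list[Path],
    ocr_one: Callable[[Path], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run ocr_one on every frame concurrently; results are keyed by file name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input frames.
        texts = list(pool.map(ocr_one, frames))
    return {frame.name: text.strip() for frame, text in zip(frames, texts)}


def fake_ocr(frame: Path) -> str:
    """Hypothetical stand-in for a real OCR call, used only for this demo."""
    return f"text from {frame.stem}\n"


if __name__ == "__main__":
    demo_frames = [Path("frame_0001.png"), Path("frame_0002.png")]
    print(ocr_frames_parallel(demo_frames, fake_ocr))
```

With pytesseract and Pillow installed, `ocr_one` could plausibly be `lambda p: pytesseract.image_to_string(Image.open(p))`. Threads are a reasonable fit here because Tesseract runs as an external process, so the workers spend most of their time waiting.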
- requirements.txt with core and optional dependencies
- Makefile with install, deps check, and run targets
- Support for make run-full, run-ocr, run-transcript variants
- Cross-platform install-ocr target (apt/brew/dnf)
https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
Now `make install-full` works from anywhere in the project.
https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
- Remove easyocr from install-full (requires PyTorch, causes conflicts)
- Add separate install-easyocr target with CPU PyTorch from official index
- Update requirements.txt with clear instructions for optional easyocr
- Improve make deps output with clearer status messages
https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
…ctor
- Check ffmpeg return codes instead of silently producing 0 frames
- Add upfront shutil.which() checks for yt-dlp and ffmpeg
- Narrow broad except Exception catches (transcript, OCR, color)
- Log OCR errors instead of embedding error strings in output data
- Handle subprocess.TimeoutExpired on all subprocess calls
- Wrap video processing in try/finally for reliable cleanup
- Error on missing easyocr when explicitly requested (no silent fallback)
- Fix docstrings: 720p fallback, parallel OCR, chunk duration, deps
- Split pytesseract/Pillow imports for clearer missing-dep messages
- Add run-transcript to Makefile .PHONY and help target
- Fix variable shadowing in round_color (step -> bucket_size)
- Handle json.JSONDecodeError from yt-dlp metadata
- Format with ruff
wshobson (Owner) approved these changes on Feb 7, 2026 and left a comment:
Looks great! I ran a comprehensive review and pushed fixes for the error handling and documentation issues I found. Here's what changed:
Error handling hardening:
- ffmpeg return codes are now checked (previously silently produced 0 frames on failure)
- Upfront `shutil.which()` checks for yt-dlp and ffmpeg with actionable install messages
- `subprocess.TimeoutExpired` handled on all 4 subprocess calls with user-friendly errors
- Video file cleanup wrapped in `try/finally` so large downloads don't leak on errors
- `json.JSONDecodeError` caught for yt-dlp metadata parsing
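A minimal sketch of what checks like these can look like, using only the standard library. The names `ToolError`, `require_tool`, `run_checked`, and `parse_metadata` are hypothetical and chosen for illustration; the actual functions in `yt-design-extractor.py` may be structured differently.

```python
import json
import shutil
import subprocess
import sys


class ToolError(RuntimeError):
    """Hypothetical error type for failures the user can act on."""


def require_tool(name: str, install_hint: str) -> str:
    """Return the tool's path, or fail early with an install hint."""
    path = shutil.which(name)
    if path is None:
        raise ToolError(f"{name} not found on PATH. {install_hint}")
    return path


def run_checked(cmd: list[str], timeout: float) -> str:
    """Run cmd; raise ToolError on timeout or non-zero exit instead of continuing."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise ToolError(f"{cmd[0]} timed out after {timeout}s") from exc
    if proc.returncode != 0:
        raise ToolError(
            f"{cmd[0]} exited with code {proc.returncode}: {proc.stderr.strip()}"
        )
    return proc.stdout


def parse_metadata(raw: str) -> dict:
    """Parse JSON metadata, turning malformed output into a readable error."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ToolError(f"could not parse metadata JSON: {exc}") from exc


if __name__ == "__main__":
    # Demo with the Python interpreter itself, since it is guaranteed to exist.
    print(run_checked([sys.executable, "-c", "print('ok')"], timeout=30))
```

Raising one error type from every check lets a top-level handler print a single friendly message and exit, rather than letting a missing binary surface later as an empty frame list.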
Silent failure fixes:
- Narrowed broad `except Exception` in transcript fetch to specific youtube-transcript-api errors
- OCR errors now logged to console instead of embedded as fake text in output markdown
- Color extraction failures now logged (was completely silent `except Exception: return []`)
- Explicit `--ocr-engine easyocr` now errors if easyocr is missing (no silent Tesseract fallback)
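The "no silent fallback" and "log instead of embed" behaviours can be sketched as follows. This is an illustration under assumptions, not the tool's code: `resolve_ocr_engine`, `ocr_or_empty`, and the choice of `OSError` as the narrowed exception are all hypothetical.

```python
import importlib.util
import logging
from typing import Callable

log = logging.getLogger("yt-design-extractor-sketch")

# Assumed mapping from CLI engine name to the Python module it needs.
ENGINE_MODULES = {"tesseract": "pytesseract", "easyocr": "easyocr"}


def module_available(name: str) -> bool:
    """True if the top-level module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


def resolve_ocr_engine(
    requested: str,
    is_available: Callable[[str], bool] = module_available,
) -> str:
    """Return the requested engine, or raise; never substitute a different one."""
    if requested not in ENGINE_MODULES:
        raise ValueError(f"unknown OCR engine: {requested}")
    module = ENGINE_MODULES[requested]
    if not is_available(module):
        raise RuntimeError(
            f"--ocr-engine {requested} was requested but '{module}' is not installed"
        )
    return requested


def ocr_or_empty(frame_name: str, ocr_one: Callable[[str], str]) -> str:
    """Run OCR on one frame; on a read failure, log it and return no text."""
    try:
        return ocr_one(frame_name)
    except OSError as exc:  # narrow catch: unexpected bugs still propagate
        log.warning("OCR failed for %s: %s", frame_name, exc)
        return ""
```

Returning an empty string keeps error messages out of the generated markdown, while the log line still tells the user which frame failed.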
Documentation/correctness:
- Module docstring now correctly classifies required vs optional deps (Pillow/pytesseract are optional at runtime)
- Fixed "720p max" docstring: format string falls back to best available
- Split pytesseract/Pillow imports so missing-dep messages are specific
- Fixed `run_ocr_on_frames` docstring (Tesseract is parallel, EasyOCR is sequential)
- Fixed `group_transcript` docstring ("at least" not "roughly")
- Added `run-transcript` to Makefile `.PHONY` and `help` output
- Fixed variable shadowing (`step` → `bucket_size` in `round_color`)
- Formatted with ruff
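For readers unfamiliar with color bucketing, here is a hedged sketch of what a helper like `round_color` might do. The body, the default of 32, and the `dominant_buckets` helper are assumptions for illustration only; the source confirms just the function name and the `bucket_size` parameter name.

```python
from collections import Counter


def round_color(rgb: tuple[int, int, int], bucket_size: int = 32) -> tuple[int, ...]:
    """Snap each channel down to a multiple of bucket_size so near-identical colors merge."""
    return tuple((channel // bucket_size) * bucket_size for channel in rgb)


def dominant_buckets(
    pixels: list[tuple[int, int, int]], top_n: int = 3
) -> list[tuple[int, ...]]:
    """Hypothetical helper: the most frequent color buckets, most common first."""
    counts = Counter(round_color(pixel) for pixel in pixels)
    return [color for color, _ in counts.most_common(top_n)]
```

A descriptive parameter name such as `bucket_size` also avoids clashing with short loop variables, which is the kind of shadowing the fix above addresses.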
Nice tool — thanks for building it!
Summary
Introduces a new command-line tool that extracts design concepts and visual references from YouTube videos into structured markdown documents. This tool automates the process of downloading video metadata, transcripts, and keyframes to create agent-ready reference materials.
Key Changes
- `tools/yt-design-extractor.py`: a comprehensive YouTube video analysis utility
- Transcript extraction via `youtube-transcript-api` with automatic chunking and timestamping
- Metadata retrieval via `yt-dlp` to gather title, description, chapters, duration, and tags
Notable Implementation Details
Dependencies
Requires: `yt-dlp`, `youtube-transcript-api`, `Pillow`, and `ffmpeg`
https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
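The "automatic chunking and timestamping" listed under Key Changes, together with the "at least" chunk duration noted in the review, could be sketched roughly like this. The function names and the 60-second default are assumptions; the entry shape (`text`, `start`, `duration`) follows the dictionaries that youtube-transcript-api has traditionally returned.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS once past an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def group_by_min_duration(entries: list[dict], chunk_seconds: float = 60.0) -> list[dict]:
    """Group transcript entries into chunks that each span at least chunk_seconds.

    A chunk closes only after its span reaches the threshold, so chunks can run
    longer than chunk_seconds but never shorter (except the final leftover).
    """
    chunks: list[dict] = []
    current: list[str] = []
    chunk_start = 0.0
    for entry in entries:
        if not current:
            chunk_start = entry["start"]
        current.append(entry["text"])
        entry_end = entry["start"] + entry["duration"]
        if entry_end - chunk_start >= chunk_seconds:
            chunks.append(
                {"timestamp": format_timestamp(chunk_start), "text": " ".join(current)}
            )
            current = []
    if current:  # flush whatever is left at the end of the transcript
        chunks.append(
            {"timestamp": format_timestamp(chunk_start), "text": " ".join(current)}
        )
    return chunks
```

Closing a chunk only on an entry boundary keeps sentences intact, at the cost of uneven chunk lengths.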