Add YouTube design concept extractor tool #432

Merged
wshobson merged 6 commits into wshobson:main from bentheautomator:claude/extract-design-concepts-pEcDg on Feb 7, 2026

@bentheautomator
Contributor

Summary

Introduces a new command-line tool that extracts design concepts and visual references from YouTube videos into structured markdown documents. This tool automates the process of downloading video metadata, transcripts, and keyframes to create agent-ready reference materials.

Key Changes

  • New tool: tools/yt-design-extractor.py — a comprehensive YouTube video analysis utility
  • Transcript extraction via youtube-transcript-api with automatic chunking and timestamping
  • Keyframe extraction with two strategies:
    • Regular interval-based capture (default: every 30 seconds)
    • Scene-change detection for visually dynamic content
  • Metadata collection using yt-dlp to gather title, description, chapters, duration, and tags
  • Markdown generation that produces a structured reference document with:
    • Video metadata and source attribution
    • Full transcript with timestamps (collapsible)
    • Condensed transcript segments for quick reference
    • Embedded keyframes with visual context
    • Frame index for easy navigation
  • Flexible CLI with options for output directory, frame interval, scene detection sensitivity, and transcript-only mode
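
The two keyframe strategies listed above map naturally onto two ffmpeg filter expressions. The following is a minimal sketch, not the tool's actual code: the function name `build_frame_command`, the output file pattern, and the JPEG quality setting are all assumptions made for illustration. It only builds the argument list, so it can be inspected without ffmpeg installed.

```python
from pathlib import Path


def build_frame_command(video, out_dir, interval=30, scene_threshold=None):
    """Return an ffmpeg argv list for interval or scene-change capture.

    Hypothetical helper: names and defaults are illustrative only.
    """
    if scene_threshold is None:
        # Interval mode: emit one frame every `interval` seconds.
        video_filter = f"fps=1/{interval}"
        extra = []
    else:
        # Scene mode: keep only frames whose scene-change score exceeds
        # the threshold (0.0 to 1.0; lower values are more sensitive).
        video_filter = f"select='gt(scene,{scene_threshold})'"
        # Variable frame rate output so skipped frames are not duplicated.
        # Newer ffmpeg builds prefer "-fps_mode vfr" over "-vsync vfr".
        extra = ["-vsync", "vfr"]
    pattern = str(Path(out_dir) / "frame_%04d.jpg")
    return ["ffmpeg", "-i", str(video), "-vf", video_filter, *extra, "-q:v", "2", pattern]


if __name__ == "__main__":
    print(build_frame_command("video.mp4", "frames"))
    print(build_frame_command("video.mp4", "frames", scene_threshold=0.3))
```

Passing the command as a list (rather than a shell string) avoids shell quoting issues; the single quotes inside the `select` expression are interpreted by ffmpeg's own filter parser.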

Notable Implementation Details

  • Supports multiple YouTube URL formats (standard, short, embed, shorts)
  • Downloads video at 720p max to balance quality and file size
  • Automatically cleans up downloaded video after frame extraction
  • Groups transcript entries into logical chunks (default: 60-second segments)
  • Generates both human-readable markdown and raw metadata JSON
  • Includes comprehensive help text and usage examples
  • Gracefully handles missing transcripts and continues processing
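
Two of the details above, URL format handling and transcript chunking, can be sketched in a few lines. This is an illustration under stated assumptions, not the code from tools/yt-design-extractor.py: `extract_video_id` is a hypothetical name, the 11-character ID pattern reflects how YouTube IDs commonly look rather than a documented guarantee, and transcript entries are assumed to be dicts with `start` (seconds) and `text` keys. The function name `group_transcript` appears in the review, but the body here is a guess at its behaviour.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Pull the video ID out of standard, short, embed, and shorts URLs."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.")
    parts = [p for p in parsed.path.split("/") if p]
    candidate = ""
    if host == "youtu.be" and parts:
        candidate = parts[0]  # https://youtu.be/<id>
    elif host in {"youtube.com", "m.youtube.com"}:
        if parts[:1] == ["watch"]:
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif len(parts) >= 2 and parts[0] in {"embed", "shorts"}:
            candidate = parts[1]
    # Assumption: IDs are 11 characters from the URL-safe base64 alphabet.
    if not re.fullmatch(r"[A-Za-z0-9_-]{11}", candidate):
        raise ValueError(f"could not find a video ID in {url!r}")
    return candidate


def group_transcript(entries, chunk_seconds=60):
    """Group entries so each chunk (except the last) spans at least chunk_seconds."""
    chunks = []
    current = None
    for entry in entries:
        # Start a new chunk once the current one covers chunk_seconds or more.
        if current is None or entry["start"] - current["start"] >= chunk_seconds:
            current = {"start": entry["start"], "texts": []}
            chunks.append(current)
        current["texts"].append(entry["text"])
    return [{"start": c["start"], "text": " ".join(c["texts"])} for c in chunks]


if __name__ == "__main__":
    print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))
    sample = [{"start": 0, "text": "a"}, {"start": 25, "text": "b"}, {"start": 61, "text": "c"}]
    print(group_transcript(sample))
```

Because a chunk only closes when the next entry arrives at or after the boundary, chunks run "at least" the configured duration, not exactly that long.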

Dependencies

Requires: yt-dlp, youtube-transcript-api, Pillow, and ffmpeg

https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV

claude and others added 6 commits February 5, 2026 03:13
Extracts transcript, metadata, and keyframes from YouTube videos
into a structured markdown reference document for agent consumption.

Supports interval-based frame capture, scene-change detection, and
chapter-aware transcript grouping.

https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
- Add --ocr flag with Tesseract (fast) or EasyOCR (stylized text) engines
- Add --colors flag for dominant color palette extraction via ColorThief
- Add --full convenience flag to enable all extraction features
- Include OCR text alongside each frame in markdown output
- Add Visual Text Index section for searchable on-screen text
- Export ocr-results.json and color-palette.json for reuse
- Run OCR in parallel with ThreadPoolExecutor for performance

https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
- requirements.txt with core and optional dependencies
- Makefile with install, deps check, and run targets
- Support for make run-full, run-ocr, run-transcript variants
- Cross-platform install-ocr target (apt/brew/dnf)

https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
- Remove easyocr from install-full (requires PyTorch, causes conflicts)
- Add separate install-easyocr target with CPU PyTorch from official index
- Update requirements.txt with clear instructions for optional easyocr
- Improve make deps output with clearer status messages

https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV
…ctor

- Check ffmpeg return codes instead of silently producing 0 frames
- Add upfront shutil.which() checks for yt-dlp and ffmpeg
- Narrow broad except Exception catches (transcript, OCR, color)
- Log OCR errors instead of embedding error strings in output data
- Handle subprocess.TimeoutExpired on all subprocess calls
- Wrap video processing in try/finally for reliable cleanup
- Error on missing easyocr when explicitly requested (no silent fallback)
- Fix docstrings: 720p fallback, parallel OCR, chunk duration, deps
- Split pytesseract/Pillow imports for clearer missing-dep messages
- Add run-transcript to Makefile .PHONY and help target
- Fix variable shadowing in round_color (step -> bucket_size)
- Handle json.JSONDecodeError from yt-dlp metadata
- Format with ruff
Owner

@wshobson left a comment

Looks great! I ran a comprehensive review and pushed fixes for the error handling and documentation issues I found. Here's what changed:

Error handling hardening:

  • ffmpeg return codes are now checked (previously silently produced 0 frames on failure)
  • Upfront shutil.which() checks for yt-dlp and ffmpeg with actionable install messages
  • subprocess.TimeoutExpired handled on all 4 subprocess calls with user-friendly errors
  • Video file cleanup wrapped in try/finally so large downloads don't leak on errors
  • json.JSONDecodeError caught for yt-dlp metadata parsing
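
The hardening steps above follow a common pattern for wrapping external tools. The sketch below shows one way to combine them using only the standard library; it is not the merged code. The helper names `run_tool`, `parse_metadata`, and `process_video`, the default timeout, and the error wording are all assumptions.

```python
import json
import shutil
import subprocess
from pathlib import Path


def run_tool(argv, timeout=60):
    """Run an external command with an upfront check, a timeout, and a return-code check."""
    if shutil.which(argv[0]) is None:
        raise RuntimeError(f"{argv[0]} not found on PATH; install it and retry")
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"{argv[0]} timed out after {timeout}s") from exc
    if result.returncode != 0:
        # Surface the failure instead of silently continuing with no output.
        tail = result.stderr.strip()[-500:]
        raise RuntimeError(f"{argv[0]} exited with code {result.returncode}: {tail}")
    return result.stdout


def parse_metadata(raw):
    """Parse tool output as JSON, converting decode errors into a readable failure."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise RuntimeError("metadata output was not valid JSON") from exc


def process_video(video_path, extract):
    """Call extract(video_path) and always delete the downloaded file afterwards."""
    try:
        return extract(video_path)
    finally:
        # Runs on success and on error, so large downloads do not leak.
        Path(video_path).unlink(missing_ok=True)
```

A caller could then write something like `parse_metadata(run_tool(["yt-dlp", "--dump-json", url]))`, with every failure mode arriving as a single exception type that carries an actionable message.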

Silent failure fixes:

  • Narrowed broad except Exception in transcript fetch to specific youtube-transcript-api errors
  • OCR errors now logged to console instead of embedded as fake text in output markdown
  • Color extraction failures now logged (previously swallowed silently by except Exception: return [])
  • Explicit --ocr-engine easyocr now errors if easyocr is missing (no silent Tesseract fallback)
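
Two of the fixes above can be illustrated with the standard library alone: failing loudly when an explicitly requested engine is unavailable, and logging per-frame OCR errors instead of writing them into the output. This is a hedged sketch, not the tool's implementation. The names `require_engine`, `ocr_frame`, and `ENGINE_MODULES`, the logger name, and the choice of caught exception types are assumptions.

```python
import importlib.util
import logging

logger = logging.getLogger("yt-design-extractor")

# Assumed mapping from CLI engine name to the Python module it needs.
ENGINE_MODULES = {"tesseract": "pytesseract", "easyocr": "easyocr"}


def require_engine(engine, modules=ENGINE_MODULES):
    """Return the module name for an engine, or raise if it is not installed."""
    module_name = modules.get(engine)
    if module_name is None:
        raise ValueError(f"unknown OCR engine: {engine}")
    if importlib.util.find_spec(module_name) is None:
        # No silent fallback: the user asked for this engine explicitly.
        raise RuntimeError(
            f"--ocr-engine {engine} was requested but '{module_name}' is not installed"
        )
    return module_name


def ocr_frame(path, ocr_fn):
    """Run ocr_fn on one frame; log failures and return empty text."""
    try:
        return ocr_fn(path)
    except (OSError, ValueError) as exc:
        # The error goes to the log, never into the generated markdown.
        logger.warning("OCR failed for %s: %s", path, exc)
        return ""
```

Catching a narrow tuple of exception types keeps genuine programming errors (such as a `TypeError`) visible instead of hiding them behind an empty result.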

Documentation/correctness:

  • Module docstring now correctly classifies required vs optional deps (Pillow/pytesseract are optional at runtime)
  • Fixed "720p max" docstring — format string falls back to best available
  • Split pytesseract/Pillow imports so missing-dep messages are specific
  • Fixed run_ocr_on_frames docstring (Tesseract is parallel, EasyOCR is sequential)
  • Fixed group_transcript docstring ("at least" not "roughly")
  • Added run-transcript to Makefile .PHONY and help output
  • Fixed variable shadowing (step -> bucket_size in round_color)
  • Formatted with ruff
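
For readers unfamiliar with the `round_color` rename, colour bucketing of this kind usually snaps each RGB channel to a coarse grid so that near-identical shades count as one colour. The sketch below is a guess at the general idea only: the function body, the default of 32, and the `dominant_buckets` helper are assumptions, and only the names `round_color` and `bucket_size` come from the review.

```python
from collections import Counter


def round_color(rgb, bucket_size=32):
    """Snap each channel down to the nearest multiple of bucket_size."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    return tuple((channel // bucket_size) * bucket_size for channel in rgb)


def dominant_buckets(pixels, bucket_size=32, top=3):
    """Hypothetical helper: return the most common colour buckets, most frequent first."""
    counts = Counter(round_color(pixel, bucket_size) for pixel in pixels)
    return [color for color, _ in counts.most_common(top)]


if __name__ == "__main__":
    print(round_color((123, 200, 7)))
    print(dominant_buckets([(10, 10, 10), (12, 14, 9), (200, 0, 0)], top=1))
```

A descriptive parameter name such as `bucket_size` also avoids colliding with loop variables or outer names like `step`, which is the kind of shadowing the fix addresses.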

Nice tool — thanks for building it!

@wshobson merged commit 5d65aa1 into wshobson:main on Feb 7, 2026