feat: add end-to-end Video Studio automation #55

Open: yuga-hashimoto wants to merge 3 commits into main from feature/video-studio-e2e

@yuga-hashimoto (Owner) commented Jun 25, 2026

Summary

  • OSS research: documented in docs/video-studio/oss-research.md
  • architecture: local-first Video Studio project model with manifest, storyboard, assets, audio, captions, render plan, run log, metadata, and review output
  • tools: added the requested Video Studio MCP tools plus browser upload planning/execution support
  • renderer: builtin FFmpeg renderer, with generated scene PNGs and local mp4 output
  • TTS/audio: free, local macOS say command path with FFmpeg padding/muxing, plus an FFmpeg silence fallback when needed
  • captions: generates .srt, .ass, and words.json; captions are baked into generated scene visuals for FFmpeg builds without libass/drawtext
  • dashboard: added Video Studio tab and Dashboard API routes using the same MCP tool pipeline, including Browser Upload Assist and explicit Publish controls
  • publishers: dry-run metadata, BrowserPublisherProvider-style upload automation with persistent browser profiles, UploadPost readiness path
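The caption step above emits .srt files. As a rough illustration of what that serialization involves, here is a minimal sketch that turns timed cues into SRT text. The `Cue` shape and the function names `toSrtTimestamp` and `toSrt` are hypothetical and are not taken from this PR; the actual caption schema (including words.json) is not shown here.

```typescript
// Hypothetical cue shape for illustration; the PR's real caption schema may differ.
interface Cue {
  startMs: number;
  endMs: number;
  text: string;
}

// Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}

// Serialize cues as numbered SRT blocks separated by a blank line.
function toSrt(cues: Cue[]): string {
  return cues
    .map(
      (cue, index) =>
        `${index + 1}\n${toSrtTimestamp(cue.startMs)} --> ${toSrtTimestamp(cue.endMs)}\n${cue.text}\n`,
    )
    .join("\n");
}
```

SRT uses a comma before the millisecond field, which is the detail most often gotten wrong when converting from other timestamp formats.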

OSS Research

  • Remotion
  • MoviePy
  • WhisperX
  • FFmpeg
  • Aegisub
  • OpenShorts / short-video-maker style OSS found during search

Validation

  • pnpm build
  • pnpm test
  • pnpm lint
  • pnpm video-studio:e2e

E2E Output

  • output.mp4 path: /Volumes/MOVESPEED/Documents/GitHub/chatgpt-local-mcp/.tmp-video-studio-e2e/localant-home/video-studio/projects/mqtel1u0-understand-localant-video-st/output/output.mp4
  • duration: 12 s
  • resolution: 1080x1920
  • hasAudio: true
  • hasVideo: true
  • thumbnail path: /Volumes/MOVESPEED/Documents/GitHub/chatgpt-local-mcp/.tmp-video-studio-e2e/localant-home/video-studio/projects/mqtel1u0-understand-localant-video-st/output/thumbnail.jpg
  • captions path: /Volumes/MOVESPEED/Documents/GitHub/chatgpt-local-mcp/.tmp-video-studio-e2e/localant-home/video-studio/projects/mqtel1u0-understand-localant-video-st/captions/output.srt
  • ASS path: /Volumes/MOVESPEED/Documents/GitHub/chatgpt-local-mcp/.tmp-video-studio-e2e/localant-home/video-studio/projects/mqtel1u0-understand-localant-video-st/captions/output.ass
  • storyboard path: /Volumes/MOVESPEED/Documents/GitHub/chatgpt-local-mcp/.tmp-video-studio-e2e/localant-home/video-studio/projects/mqtel1u0-understand-localant-video-st/storyboard.json
  • render plan path: /Volumes/MOVESPEED/Documents/GitHub/chatgpt-local-mcp/.tmp-video-studio-e2e/localant-home/video-studio/projects/mqtel1u0-understand-localant-video-st/render/render-plan.json
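Fields such as duration, resolution, hasAudio, and hasVideo can be derived from ffprobe's JSON output. Whether the E2E script actually uses ffprobe is an assumption; the sketch below only shows one way to summarize the JSON printed by `ffprobe -v error -print_format json -show_format -show_streams output.mp4`. The names `summarizeProbe` and `MediaSummary` are hypothetical.

```typescript
// Subset of ffprobe's JSON output that this sketch reads.
interface ProbeStream {
  codec_type?: string;
  width?: number;
  height?: number;
}

interface ProbeResult {
  streams?: ProbeStream[];
  format?: { duration?: string }; // ffprobe reports duration as a string of seconds
}

// Hypothetical summary shape mirroring the fields listed in the E2E output.
interface MediaSummary {
  durationSeconds: number | null;
  resolution: string | null;
  hasAudio: boolean;
  hasVideo: boolean;
}

function summarizeProbe(probe: ProbeResult): MediaSummary {
  const streams = probe.streams ?? [];
  const video = streams.find((s) => s.codec_type === "video");
  const hasAudio = streams.some((s) => s.codec_type === "audio");
  const duration = Number.parseFloat(probe.format?.duration ?? "");
  const resolution =
    video && video.width && video.height ? `${video.width}x${video.height}` : null;
  return {
    durationSeconds: Number.isFinite(duration) ? duration : null,
    resolution,
    hasAudio,
    hasVideo: video !== undefined,
  };
}
```

A check like this keeps the validation independent of the renderer: it inspects the finished file rather than trusting the render plan.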

Known Limitations

  • Browser publishing works through Playwright once the optional browser dependencies are installed (localant deps install browser). It opens the platform upload page, sets the MP4 file input, fills in metadata using configurable selectors, and stops before submitting unless confirmBrowserPublish=true.
  • Official YouTube/TikTok/Instagram API publishing still depends on OAuth, app review, account eligibility, and platform policy constraints.
  • Login, CAPTCHA, 2FA, and bot protection are never bypassed. The persistent browser profile exists so the user can log in normally once and reuse that state.
  • Video generation is intentionally free/local by default. Paid external video generation APIs are not used for the initial implementation.
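The "stop before submit unless confirmBrowserPublish=true" behavior described above can be pictured as a gate on the final step of an upload plan. This is a minimal sketch under assumptions: only the confirmBrowserPublish flag name comes from this PR; the step names, `UploadOptions`, and `planUploadSteps` are hypothetical, and no Playwright calls are shown.

```typescript
// Hypothetical step names for illustration only.
type UploadStep = "open-upload-page" | "set-file-input" | "fill-metadata" | "submit";

interface UploadOptions {
  // Flag name taken from the PR description; everything else here is assumed.
  confirmBrowserPublish?: boolean;
}

// Build the ordered steps; "submit" is included only on explicit confirmation.
function planUploadSteps(options: UploadOptions = {}): UploadStep[] {
  const steps: UploadStep[] = ["open-upload-page", "set-file-input", "fill-metadata"];
  if (options.confirmBrowserPublish === true) {
    steps.push("submit");
  }
  return steps;
}
```

Comparing strictly against `true` means a missing or falsy flag can never publish, so the default path stays a review-only run.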
