Enhance NotebookLM integration and frontend video ingestion pipeline #51
groupthinking merged 71 commits into v0/groupthinking-86bedbf8
…deo_with_notebooklm` MCP tool.
…al context documentation and video asset.
…automation, archive numerous scripts and documentation, and update local browser profile data.
…nd service worker data.
…ate associated profile data.
…safe browsing data.
…updates Bumps the npm_and_yarn group with 3 updates in the / directory: [ajv](https://github.com/ajv-validator/ajv), [hono](https://github.com/honojs/hono) and [qs](https://github.com/ljharb/qs). Bumps the npm_and_yarn group with 3 updates in the /docs/knowledge_prototypes/mcp-servers/fetch-mcp directory: [@modelcontextprotocol/sdk](https://github.com/modelcontextprotocol/typescript-sdk), [ajv](https://github.com/ajv-validator/ajv) and [hono](https://github.com/honojs/hono). Bumps the npm_and_yarn group with 1 update in the /scripts/archive/software-on-demand directory: [ajv](https://github.com/ajv-validator/ajv). Bumps the npm_and_yarn group with 2 updates in the /scripts/archive/supabase_cleanup directory: [next](https://github.com/vercel/next.js) and [qs](https://github.com/ljharb/qs). Updates `ajv` from 8.17.1 to 8.18.0 - [Release notes](https://github.com/ajv-validator/ajv/releases) - [Commits](ajv-validator/ajv@v8.17.1...v8.18.0) Updates `hono` from 4.11.7 to 4.12.1 - [Release notes](https://github.com/honojs/hono/releases) - [Commits](honojs/hono@v4.11.7...v4.12.1) Updates `qs` from 6.14.1 to 6.15.0 - [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md) - [Commits](ljharb/qs@v6.14.1...v6.15.0) Updates `@modelcontextprotocol/sdk` from 1.25.2 to 1.26.0 - [Release notes](https://github.com/modelcontextprotocol/typescript-sdk/releases) - [Commits](modelcontextprotocol/typescript-sdk@v1.25.2...v1.26.0) Updates `ajv` from 8.17.1 to 8.18.0 - [Release notes](https://github.com/ajv-validator/ajv/releases) - [Commits](ajv-validator/ajv@v8.17.1...v8.18.0) Updates `hono` from 4.11.5 to 4.12.1 - [Release notes](https://github.com/honojs/hono/releases) - [Commits](honojs/hono@v4.11.7...v4.12.1) Updates `qs` from 6.14.1 to 6.15.0 - [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md) - [Commits](ljharb/qs@v6.14.1...v6.15.0) Updates `ajv` from 8.17.1 to 8.18.0 - [Release notes](https://github.com/ajv-validator/ajv/releases) - 
[Commits](ajv-validator/ajv@v8.17.1...v8.18.0) Updates `next` from 15.4.10 to 15.5.10 - [Release notes](https://github.com/vercel/next.js/releases) - [Changelog](https://github.com/vercel/next.js/blob/canary/release.js) - [Commits](vercel/next.js@v15.4.10...v15.5.10) Updates `qs` from 6.14.1 to 6.15.0 - [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md) - [Commits](ljharb/qs@v6.14.1...v6.15.0) --- updated-dependencies: - dependency-name: ajv dependency-version: 8.18.0 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: hono dependency-version: 4.12.1 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: qs dependency-version: 6.15.0 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: "@modelcontextprotocol/sdk" dependency-version: 1.26.0 dependency-type: direct:production dependency-group: npm_and_yarn - dependency-name: ajv dependency-version: 8.18.0 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: hono dependency-version: 4.12.1 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: qs dependency-version: 6.15.0 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: ajv dependency-version: 8.18.0 dependency-type: direct:production dependency-group: npm_and_yarn - dependency-name: next dependency-version: 15.5.10 dependency-type: direct:production dependency-group: npm_and_yarn - dependency-name: qs dependency-version: 6.15.0 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com>
…_and_yarn-c7af963e99 chore(deps): bump the npm_and_yarn group across 4 directories with 5 updates
Bumps the npm_and_yarn group with 1 update in the /scripts/archive/supabase_cleanup directory: [minimatch](https://github.com/isaacs/minimatch). Updates `minimatch` from 3.1.2 to 3.1.4 - [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md) - [Commits](isaacs/minimatch@v3.1.2...v3.1.4) --- updated-dependencies: - dependency-name: minimatch dependency-version: 3.1.4 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com>
…ipts/archive/supabase_cleanup/npm_and_yarn-c7796958eb chore(deps): bump minimatch from 3.1.2 to 3.1.4 in /scripts/archive/supabase_cleanup in the npm_and_yarn group across 1 directory
…update Bumps the npm_and_yarn group with 1 update in the / directory: [hono](https://github.com/honojs/hono). Bumps the npm_and_yarn group with 1 update in the /docs/knowledge_prototypes/mcp-servers/fetch-mcp directory: [hono](https://github.com/honojs/hono). Updates `hono` from 4.12.1 to 4.12.2 - [Release notes](https://github.com/honojs/hono/releases) - [Commits](honojs/hono@v4.12.1...v4.12.2) Updates `hono` from 4.12.1 to 4.12.2 - [Release notes](https://github.com/honojs/hono/releases) - [Commits](honojs/hono@v4.12.1...v4.12.2) --- updated-dependencies: - dependency-name: hono dependency-version: 4.12.2 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: hono dependency-version: 4.12.2 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com>
…ment

The core pipeline previously required the Python backend to be running. When deployed to Vercel (https://v0-uvai.vercel.app/), the backend is unavailable, causing all video analysis to fail immediately.

Changes:
- /api/video: Falls back to frontend-only pipeline (transcribe + extract) when the Python backend is unreachable, with 15s timeout
- /api/transcribe: Adds Gemini fallback when OpenAI is unavailable, plus 8s timeout on backend probe to avoid hanging on Vercel
- layout.tsx: Loads Google Fonts via <link> instead of next/font/google to avoid build failures in offline/sandboxed CI environments
- page.tsx: Replace example URLs with technical content (3Blue1Brown neural networks, Karpathy LLM intro) instead of rick roll / zoo videos
- gemini_service.py: Gate Vertex AI import behind GOOGLE_CLOUD_PROJECT env var to prevent 30s+ hangs on the GCE metadata probe
- agent_gap_analyzer.py: Fix f-string backslash syntax errors (Python 3.11)

https://claude.ai/code/session_015Pd3a6hinTenCNrPRGiZqE
…orgery Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…orgery Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…LE_VERTEX_AI boolean parsing Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
…E_VERTEX_AI boolean parsing Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Fix clearTimeout timer leaks in AbortController fetch patterns
Fix clearTimeout leaks, transcript_segments shape mismatch, and ENABLE_VERTEX_AI boolean parsing
Fix timeout leak, transcript_segments shape mismatch, and ENABLE_VERTEX_AI boolean guard
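The timeout and timer-leak fixes named in the commits above follow a common AbortController pattern. The sketch below is a minimal illustration, not the code from this PR: `runWithTimeout`, `probeBackend`, and the `/health` path are hypothetical names chosen for the example.

```typescript
// Hypothetical helper: runs an abortable task with a deadline and always
// clears the timer, whether the task resolves, rejects, or is aborted.
async function runWithTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await task(controller.signal);
  } finally {
    // Without this call the timer stays pending after the task settles.
    clearTimeout(timer);
  }
}

// Usage sketch: probe a backend with a 15s budget. The "/health" path is
// an assumption for illustration only.
function probeBackend(baseUrl: string): Promise<boolean> {
  return runWithTimeout(
    (signal) => fetch(`${baseUrl}/health`, { signal }).then((r) => r.ok),
    15_000,
  ).catch(() => false); // unreachable or timed out: caller can fall back
}
```

Putting `clearTimeout` in `finally` covers the success path, which is where a leak of this kind usually hides: the request finishes early and the timer is left to fire later against an already-settled controller.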
Skip backend calls entirely when BACKEND_URL is not configured or
contains an invalid value (like a literal ${...} template string).
This prevents URL parse errors on Vercel where the env var may not
be set.
https://claude.ai/code/session_015Pd3a6hinTenCNrPRGiZqE
…pdate-R47Ph fix: validate BACKEND_URL before using it
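A check along the lines this commit describes might look like the following. This is a sketch under assumptions: `resolveBackendUrl` is a hypothetical name, and treating any value containing `${` as an unexpanded template is one possible reading of "a literal ${...} template string".

```typescript
// Returns a usable base URL, or null when the value is missing, looks like
// an unexpanded template such as "${BACKEND_URL}", or is not an absolute
// http(s) URL. The function name is hypothetical.
function resolveBackendUrl(raw: string | undefined): string | null {
  if (!raw) return null;
  const value = raw.trim();
  if (value === "" || value.indexOf("${") !== -1) return null;
  try {
    const parsed = new URL(value);
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null;
    // Drop trailing slashes so callers can append paths consistently.
    return value.replace(/\/+$/, "");
  } catch {
    return null; // new URL() throws on values that are not absolute URLs
  }
}
```

A caller would skip the backend strategy entirely when this returns `null`, instead of letting `fetch` throw a URL parse error.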
- Create stub types for Firebase Data Connect SDK in src/dataconnect-generated/ - Fix import path from ../dataconnect-generated to ./dataconnect-generated (rootDir constraint) - Add explicit type assertions for JSON responses (predictions, access_token) - All 6 TypeScript errors resolved, clean build verified Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* chore: Update generated Chrome profile cache and session data for notebooklm. * chore: refresh notebooklm Chrome profile data, including Safe Browsing lists, caches, and session files. * Update local application cache and database files within the NotebookLM Chrome profile. * chore: update Chrome profile cache and Safe Browsing data files. * feat: upgrade Gemini to @google/genai SDK with structured output, search grounding, video URL processing, and extend VideoPack schema - Upgrade extract-events/route.ts from @google/generative-ai to @google/genai - Add Gemini responseSchema with Type system for structured output enforcement - Add Google Search grounding (googleSearch tool) to Gemini calls - Upgrade transcribe/route.ts to @google/genai with direct YouTube URL processing via fileData - Add Gemini video URL fallback chain: direct video → text+search → other strategies - Extend VideoPackV0 schema with Chapter, CodeCue, Task models - Update versioning shim for new fields - Export new types from videopack __init__ Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add TypeScript CloudEvents publisher (apps/web/src/lib/cloudevents.ts) emitting standardized events at each video processing stage - Wire CloudEvents into /api/video route (both backend + frontend strategies) - Wire CloudEvents into FastAPI backend router (process_video_v1 endpoint) - Add Chrome Built-in AI service (Prompt API + Summarizer API) for on-device client-side transcript analysis when API keys are unavailable - Add useBuiltInAI React hook for component integration - Add .next/ to .gitignore Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add A2AContextMessage dataclass to AgentOrchestrator for lightweight inter-agent context sharing during parallel task execution - Auto-broadcast agent results to peer agents after parallel execution - Add send_a2a_message() and get_a2a_log() methods to orchestrator - Add POST /api/v1/agents/a2a/send endpoint for frontend-to-agent messaging - Add GET /api/v1/agents/a2a/log endpoint to query message history - Extend frontend agentService with sendA2AMessage() and getA2ALog() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add setup.sh to download lit CLI binary and .litertlm model - Support macOS arm64 and x86_64 architectures - Auto-generate .env with LIT_BINARY_PATH and LIT_MODEL_PATH - Add .gitignore for bin/, models/, .env - Update README with Quick Setup section Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nding (#47) - Create gemini-video-analyzer.ts: single Gemini call with googleSearch tool for transcript extraction AND event analysis (PK=998 pattern) - Add youtube-metadata.ts: scrapes title, description, chapters from YouTube without API key - Update /api/video: Gemini agentic analysis as primary strategy, transcribe→extract chain as fallback - Fix /api/transcribe: remove broken fileData.fileUri, use Gemini Google Search grounding as primary, add metadata context, filter garbage OpenAI results - Fix /api/extract-events: accept videoUrl without requiring transcript, direct Gemini analysis via Google Search when no transcript available Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create shared gemini-client.ts that resolves API key from: GEMINI_API_KEY → GOOGLE_API_KEY → Vertex_AI_API_KEY All API routes now use the shared client instead of hardcoding process.env.GEMINI_API_KEY. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When only Vertex_AI_API_KEY is set (no GEMINI_API_KEY), the client now initializes in Vertex AI mode with vertexai: true + apiKey. Uses project uvai-730bb and us-central1 as defaults. Also added GOOGLE_CLOUD_PROJECT env var to Vercel. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
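The resolution order described in the two commits above can be sketched as a pure function. The names `KeyEnv`, `GeminiKeyChoice`, and `resolveGeminiKey` are hypothetical, and the sketch takes the environment as a parameter rather than reading `process.env`; the actual gemini-client.ts may be structured differently.

```typescript
type KeyEnv = Record<string, string | undefined>;

interface GeminiKeyChoice {
  apiKey: string;
  source: "GEMINI_API_KEY" | "GOOGLE_API_KEY" | "Vertex_AI_API_KEY";
  vertexai: boolean; // true only when the Vertex key is the one selected
}

// Walks the variables in the order stated in the commit message and
// returns the first non-empty one, or null when none is set.
function resolveGeminiKey(env: KeyEnv): GeminiKeyChoice | null {
  const order = ["GEMINI_API_KEY", "GOOGLE_API_KEY", "Vertex_AI_API_KEY"] as const;
  for (const source of order) {
    const apiKey = env[source]?.trim();
    if (apiKey) {
      return { apiKey, source, vertexai: source === "Vertex_AI_API_KEY" };
    }
  }
  return null;
}
```

Keeping the order in one array makes a later priority change a one-line edit.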
…gleSearch conflict (#48) Vertex AI does not support controlled generation (responseSchema) combined with the googleSearch tool. This caused 400 errors on every Gemini call. Changes: - gemini-client.ts: Prioritize Vertex_AI_API_KEY, support GOOGLE_GENAI_USE_VERTEXAI env var - gemini-video-analyzer.ts: Remove responseSchema, enforce JSON via prompt instructions - extract-events/route.ts: Same fix for extractWithGemini and inline Gemini calls - Strip markdown code fences from responses before JSON parsing Tested end-to-end with Vertex AI Express Mode key against multiple YouTube videos. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
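Stripping Markdown code fences before `JSON.parse`, as this commit describes, could be done roughly as follows. `stripCodeFences` is a hypothetical helper written for illustration; it only unwraps a response that is entirely one fenced block and leaves anything else untouched.

```typescript
// Three backticks, built indirectly so this example does not contain a
// literal fence marker of its own.
const FENCE = "`" + "`" + "`";

function stripCodeFences(text: string): string {
  const trimmed = text.trim();
  const wrapped =
    trimmed.slice(0, FENCE.length) === FENCE &&
    trimmed.slice(-FENCE.length) === FENCE;
  if (!wrapped) return trimmed;
  const firstNewline = trimmed.indexOf("\n");
  if (firstNewline === -1) return trimmed; // single line, nothing to unwrap
  // Drop the opening fence line (with its optional language tag) and the
  // closing fence, then trim what is left.
  return trimmed.slice(firstNewline + 1, trimmed.length - FENCE.length).trim();
}
```

The result is still untrusted model output, so the subsequent `JSON.parse` call needs its own error handling.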
…mini-3-pro-preview (#49) The previous fix (PR #48) was a shortcut — it removed responseSchema when the real issue was using gemini-2.5-flash which doesn't support responseSchema + googleSearch together on Vertex AI. gemini-3-pro-preview DOES support the combination. This commit restores the exact PK=998 pattern: - gemini-video-analyzer.ts: Restored responseSchema with Type system, responseMimeType, e22Snippets field, model → gemini-3-pro-preview - extract-events/route.ts: Restored geminiResponseSchema, Type import, responseMimeType, model → gemini-3-pro-preview - transcribe/route.ts: model → gemini-3-pro-preview Tested with Vertex AI Express Mode key on two YouTube videos. Both return structured JSON with events, transcript, actions, codeMapping, cloudService, e22Snippets, architectureCode, ingestScript. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add /api/pipeline route for full end-to-end pipeline (video analysis → code generation → GitHub repo → Vercel deploy) - Add deployPipeline() action to dashboard store with stage tracking - Add 🚀 Deploy button to dashboard alongside Analyze - Show pipeline results (live URL, GitHub repo, framework) in video cards - Fix deployment_manager import path in video_processing_service - Wire pipeline to backend /api/v1/video-to-software endpoint - Fallback to Gemini-only analysis when no backend available Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create /app/generated_projects, /app/youtube_processed_videos, and /tmp/uvai_data directories in Dockerfile to fix permission denied errors in the deployment and video processing pipeline on Railway. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- CORS: replace wildcard/glob with explicit allowed origins in both entry points - Rate limiting: enable 60 req/min with 15 burst on backend - API auth: add optional X-API-Key middleware for pipeline endpoints - Codegen: generate video-specific HTML/CSS/JS from analysis output - API: accept both 'url' and 'video_url' via Pydantic alias - Deploy: fix Vercel REST API payload format (gitSource instead of gitRepository) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Root causes fixed:
- Case mismatch in _poll_deployment_status: compared lowercased status against uppercase success_statuses list, so READY was never matched
- Vercel API returns bare domain URLs without https:// prefix; added _ensure_https() to normalize them
- Poll requests were missing auth headers, causing 401 failures
- _deploy_files_directly fallback returned fake simulated URLs that masked real failures; removed in favor of proper error reporting
- _generate_deployment_urls only returned URLs from 'success' status deployments, discarding useful fallback URLs from failed deployments

Improvements:
- On API failure (permissions, plan limits), return a Vercel import URL the user can click to deploy manually instead of an empty string
- Support VERCEL_ORG_ID team scoping on deploy and poll endpoints
- Use readyState field (Vercel v13 API) for initial status check
- Add 'canceled' to failure status list in poll loop
- Poll failures are now non-fatal; initial URL is used as fallback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…aders - Add uvaiio.vercel.app to CORS allowed origins - Add slowapi rate limiting (60 req/min) - Add API key auth middleware (optional via EVENTRELAY_API_KEY) - Add security headers (X-Content-Type-Options, X-Frame-Options, X-XSS-Protection) - Fixes production gap where slim main.py had none of the backend/main.py protections Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eploy The VideoToSoftwareRequest model had both 'model_config = ConfigDict(...)' and 'class Config:' which Pydantic v2 rejects. Merged into single model_config. This was causing the v1 router to fail loading, making /api/v1/health return 404. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Merged commit 99df2ac into v0/groupthinking-86bedbf8
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly upgrades the application's video analysis and software generation capabilities. It establishes a comprehensive end-to-end pipeline that transforms YouTube videos into deployed software, integrating advanced AI models like Gemini and OpenAI for intelligent content extraction and analysis. The changes focus on enhancing reliability through multi-provider strategies, improving transcript accuracy, and providing a more robust and observable workflow for video-driven development.
Pull request overview
Improves the frontend/serverless video ingestion pipeline with multi-strategy transcript/event extraction, adds an end-to-end “video → deploy” pipeline endpoint, and introduces Gemini/Chrome Built-in AI integrations plus supporting config/deps.
Changes:
- Added `/api/pipeline` endpoint and frontend store/UI support for end-to-end deployment results
- Implemented Gemini-based agentic video analysis (googleSearch grounding) and multi-strategy transcript/extraction fallbacks
- Added CloudEvents publishing, Gemini client utilities, and updated dependencies/configuration
Reviewed changes
Copilot reviewed 21 out of 12310 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| dataconnect/.dataconnect/pgliteData/pg17/PG_VERSION | Adds a local pglite Postgres version file under dataconnect state |
| apps/web/src/store/dashboard-store.ts | Adds pipeline result types and a deployPipeline workflow in the dashboard store |
| apps/web/src/lib/youtube-metadata.ts | Adds YouTube metadata scraping + chapter parsing utilities |
| apps/web/src/lib/services/builtin-ai.ts | Adds Chrome Built-in AI client-side fallback helpers |
| apps/web/src/lib/services/agent-service.ts | Adds A2A send/log API calls to the agent service |
| apps/web/src/lib/gemini-video-analyzer.ts | Adds Gemini agentic video analyzer with responseSchema + googleSearch |
| apps/web/src/lib/gemini-client.ts | Adds shared Gemini client factory with key/mode resolution |
| apps/web/src/lib/cloudevents.ts | Adds CloudEvents event creation + publishing helper |
| apps/web/src/hooks/use-builtin-ai.ts | Adds React hook for Built-in AI capability checks and helpers |
| apps/web/src/app/page.tsx | Updates example YouTube URLs shown on the homepage |
| apps/web/src/app/layout.tsx | Switches from next/font to Google Fonts to avoid build failures |
| apps/web/src/app/dashboard/page.tsx | Updates dashboard UI to support pipeline deployments + show pipeline results |
| apps/web/src/app/api/video/route.ts | Adds multi-strategy backend/Gemini/frontend-chain analysis with CloudEvents |
| apps/web/src/app/api/transcribe/route.ts | Refactors transcription into multi-strategy backend/Gemini/OpenAI/STT flow with metadata |
| apps/web/src/app/api/pipeline/route.ts | Adds end-to-end pipeline endpoint with backend-first + Gemini analysis fallback |
| apps/web/src/app/api/extract-events/route.ts | Adds Gemini structured extraction via @google/genai schema + direct videoUrl extraction |
| apps/web/package.json | Adds @google/genai dependency |
| Dockerfile | Creates/chowns additional data directories used by the pipeline |
| .vscode/settings.json | Enables Gemini Code Assist automatic outline generation |
| .firebaserc | Adds Firebase project configuration |
| .firebase/.graphqlrc | Adds GraphQL schema/document configuration for Data Connect |
| @@ -0,0 +1 @@ | |||
| 17 | |||
This file appears to be generated local state for pglite/dataconnect (embedded DB data directory). Committing it will likely create noisy diffs and environment-specific state in the repo. Consider removing it from version control and adding the broader .dataconnect/pgliteData/ path to .gitignore (or storing DB state outside the repo).
| 17 | |
| # Placeholder file; do not commit pglite/dataconnect DB state. | |
| # The real PG_VERSION file should be generated locally by the embedded DB. |
| export function extractVideoId(url: string): string | null { | ||
| const patterns = [ | ||
| /(?:youtube\.com\/watch\?v=)([a-zA-Z0-9_-]{11})/, | ||
| /(?:youtu\.be\/)([a-zA-Z0-9_-]{11})/, | ||
| /(?:youtube\.com\/embed\/)([a-zA-Z0-9_-]{11})/, | ||
| /(?:youtube\.com\/shorts\/)([a-zA-Z0-9_-]{11})/, | ||
| ]; |
The watch URL regex only matches youtube.com/watch?v=... when v is the first query param. Common URLs like .../watch?si=...&v=VIDEOID or .../watch?time_continue=1&v=VIDEOID won’t match, causing metadata fetch to fail for valid YouTube links. Consider broadening the pattern to allow other params in any order (e.g., match [?&]v=).
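One way to apply the reviewer's suggestion is to let the watch pattern skip any parameters that precede `v`. The sketch below keeps the function name and the other three patterns from the quoted snippet; only the first regex is changed, and it has not been checked against the rest of youtube-metadata.ts.

```typescript
// Variant of the quoted extractVideoId: the watch pattern now accepts
// "v" anywhere in the query string, not only as the first parameter.
function extractVideoId(url: string): string | null {
  const patterns = [
    // "?v=ID" directly, or "?other=1&v=ID" with earlier parameters.
    /youtube\.com\/watch\?(?:[^#\s]*&)?v=([a-zA-Z0-9_-]{11})/,
    /youtu\.be\/([a-zA-Z0-9_-]{11})/,
    /youtube\.com\/embed\/([a-zA-Z0-9_-]{11})/,
    /youtube\.com\/shorts\/([a-zA-Z0-9_-]{11})/,
  ];
  for (const pattern of patterns) {
    const match = url.match(pattern);
    if (match) return match[1];
  }
  return null;
}
```

Because the optional group must end in `&`, a parameter whose name merely ends in "v" (for example `rev=`) is not mistaken for the video ID.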
| * When no backend is configured the events are written to a local | ||
| * JSONL file (`/tmp/cloudevents.jsonl`) for observability. |
The module docstring claims events are appended to /tmp/cloudevents.jsonl when no backend is configured, but publishEvent() currently only logs to the console when CLOUDEVENTS_WEBHOOK_URL is unset. Either implement the JSONL append (server-side only), or update the documentation to match the current behavior.
| export async function publishEvent( | ||
| type: string, | ||
| data: Record<string, unknown>, | ||
| subject?: string, | ||
| ): Promise<void> { |
| updateVideo(id, { | ||
| status: result.status === 'success' || result.status === 'complete' ? 'complete' : 'failed', | ||
| progress: 100, | ||
| title: `Deployed: ${url.length > 40 ? url.substring(0, 37) + '…' : url}`, |
When the pipeline returns an OK response but indicates a non-success status, the video is marked failed but still shows progress: 100 and a Deployed: title. This produces misleading UI for failed runs. Consider setting the title/progress based on the computed status (e.g., only use Deployed:/100% when status is complete, otherwise use a failure title and a non-100 progress).
| updateVideo(id, { | |
| status: result.status === 'success' || result.status === 'complete' ? 'complete' : 'failed', | |
| progress: 100, | |
| title: `Deployed: ${url.length > 40 ? url.substring(0, 37) + '…' : url}`, | |
| const isSuccess = result.status === 'success' || result.status === 'complete'; | |
| const computedStatus = isSuccess ? 'complete' : 'failed'; | |
| const progress = isSuccess ? 100 : 90; | |
| const truncatedUrl = url.length > 40 ? url.substring(0, 37) + '…' : url; | |
| const titlePrefix = isSuccess ? 'Deployed' : 'Deployment failed'; | |
| updateVideo(id, { | |
| status: computedStatus, | |
| progress, | |
| title: `${titlePrefix}: ${truncatedUrl}`, |
| * 2. OpenAI Responses API with web_search (finds transcripts online) | ||
| * 3. Gemini fallback (if OpenAI unavailable) | ||
| * 4. Direct audio STT via OpenAI Whisper |
The documented strategy order doesn’t match the implementation: the code attempts Gemini w/ googleSearch before OpenAI web_search. Also the final strategy uses gpt-4o-mini-transcribe (not Whisper). Updating the comment to reflect the actual order/tools would prevent confusion for future maintainers.
| * 2. OpenAI Responses API with web_search (finds transcripts online) | |
| * 3. Gemini fallback (if OpenAI unavailable) | |
| * 4. Direct audio STT via OpenAI Whisper | |
| * 2. Gemini with Google Search grounding (primary for YouTube transcripts) | |
| * 3. OpenAI Responses API with web_search (fallback when Gemini or Google Search are unavailable) | |
| * 4. Direct audio STT via OpenAI gpt-4o-mini-transcribe |
| return NextResponse.json({ | ||
| success: false, | ||
| error: 'No AI API key configured. Set OPENAI_API_KEY or GEMINI_API_KEY.', | ||
| error: 'No AI API key configured or all extraction attempts failed. Set GEMINI_API_KEY.', |
This error message is misleading because extraction can succeed with OPENAI_API_KEY alone, and failure here may also reflect provider/runtime errors (not only missing Gemini). Consider revising the message to mention both keys (OPENAI_API_KEY/GEMINI_API_KEY) and/or distinguish between “no provider configured” vs “all providers failed” based on the environment and attempted paths.
| error: 'No AI API key configured or all extraction attempts failed. Set GEMINI_API_KEY.', | |
| error: | |
| 'No AI providers succeeded. Either no API keys are configured (missing OPENAI_API_KEY and/or GEMINI_API_KEY) or all extraction attempts failed at runtime.', |
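The distinction the reviewer asks for, "no provider configured" versus "all providers failed", can be made explicit by checking the environment before composing the message. This is only a sketch: `describeExtractionFailure` and the `attempted` list are hypothetical, and the route would need to track which strategies it actually tried.

```typescript
type ProviderEnv = Record<string, string | undefined>;

// Builds an error message that separates a configuration problem from a
// runtime failure. "attempted" is a hypothetical list of strategy labels.
function describeExtractionFailure(env: ProviderEnv, attempted: string[]): string {
  const configured = ["OPENAI_API_KEY", "GEMINI_API_KEY"].filter(
    (key) => Boolean(env[key]?.trim()),
  );
  if (configured.length === 0) {
    return "No AI provider configured. Set OPENAI_API_KEY or GEMINI_API_KEY.";
  }
  const tried = attempted.length > 0 ? attempted.join(", ") : "none";
  return `All extraction attempts failed at runtime (configured: ${configured.join(", ")}; attempted: ${tried}).`;
}
```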
Code Review
This pull request introduces a significant set of features, enhancing the video analysis pipeline with multi-provider support (OpenAI and Gemini) and robust fallback strategies, establishing a new end-to-end pipeline from video to deployed software. However, it introduces critical security concerns, including a Server-Side Request Forgery (SSRF) vulnerability in the transcription API and multiple Prompt Injection points in the AI-driven analysis logic. Additionally, a critical issue with unsafe JSON parsing that could lead to crashes, a high-severity bug due to a typo in a model name, and several medium-severity suggestions for improving ID generation and type safety have been identified. These issues should be addressed to ensure the application's robustness and security.
| const text = response.text ?? ''; | ||
| return JSON.parse(text); |
This direct call to JSON.parse() is unsafe. If response.text from the Gemini API is an empty string (which is possible if the model returns no content), this will throw an unhandled exception and cause the API route to crash with a 500 error. You should gracefully handle the case of an empty or invalid JSON string before parsing.
| const text = response.text ?? ''; | |
| return JSON.parse(text); | |
| const text = response.text ?? ''; | |
| return text ? JSON.parse(text) : {}; |
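Note that the suggested `text ? JSON.parse(text) : {}` still throws when the text is non-empty but not valid JSON. A slightly more defensive variant returns a result object instead of throwing; `ParseResult` and `parseModelJson` are hypothetical names used only for this sketch.

```typescript
type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Parses model output without throwing: empty and malformed responses
// both come back as { ok: false } with a short reason.
function parseModelJson<T>(text: string | undefined | null): ParseResult<T> {
  const trimmed = (text ?? "").trim();
  if (trimmed === "") return { ok: false, error: "empty response" };
  try {
    // The cast is unchecked: the caller still has to validate the shape.
    return { ok: true, value: JSON.parse(trimmed) as T };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}
```

The route handler can then map `ok: false` to a 502-style error response rather than an unhandled 500.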
| }); | ||
|
|
||
| const resultText = response.text || '{}'; | ||
| return JSON.parse(resultText) as VideoAnalysisResult; |
The JSON.parse() call is unsafe. If the resultText from the AI is an empty string or not valid JSON, this will throw an unhandled exception. Given that this is a core analysis function, it should be more robust. Please wrap the parsing in a try...catch block to handle potential errors gracefully.
try {
return JSON.parse(resultText) as VideoAnalysisResult;
} catch (e) {
console.error('Failed to parse VideoAnalysisResult from Gemini:', resultText, e);
throw new Error('Failed to parse JSON response from Gemini for video analysis.');
}
| if (audioUrl) { | ||
| // Strategy 4: Direct audio file transcription via OpenAI Whisper | ||
| if (audioUrl && process.env.OPENAI_API_KEY) { | ||
| const audioResponse = await fetch(audioUrl); |
The audioUrl parameter is taken directly from the user request and used in a fetch call without any validation or sanitization. This allows an attacker to make the server perform arbitrary network requests, potentially reaching internal services or metadata endpoints (SSRF).
| const audioResponse = await fetch(audioUrl); | |
| // Validate audioUrl before fetching | |
| try { | |
| const parsedUrl = new URL(audioUrl); | |
| if (!['http:', 'https:'].includes(parsedUrl.protocol)) { | |
| throw new Error('Invalid protocol'); | |
| } | |
| // Add additional checks for internal IP ranges if necessary | |
| } catch (e) { | |
| return NextResponse.json({ error: 'Invalid audioUrl' }, { status: 400 }); | |
| } | |
| const audioResponse = await fetch(audioUrl); |
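The "additional checks for internal IP ranges" left open in the suggestion could start from something like the sketch below. `isBlockedIpv4` and `isAllowedAudioUrl` are hypothetical helpers, and this is deliberately incomplete: it inspects only the literal hostname, so a public name that resolves to a private address, or a redirect to one, would still get through without DNS-level and redirect checks.

```typescript
// True for loopback, RFC 1918 private, link-local, and "this network" IPv4 literals.
function isBlockedIpv4(host: string): boolean {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) || // link-local, includes cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

function isAllowedAudioUrl(raw: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return false; // not an absolute URL
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return false;
  const host = parsed.hostname.toLowerCase();
  if (/(^|\.)localhost$/.test(host)) return false;
  if (host.charAt(0) === "[") return false; // IPv6 literals rejected outright in this sketch
  return !isBlockedIpv4(host);
}
```

An allow-list of known audio hosts would be stricter than this deny-list and is worth considering if the set of sources is small.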
| const systemInstruction = buildSystemInstruction(videoUrl); | ||
|
|
||
| const response = await ai.models.generateContent({ | ||
| model: 'gemini-3-pro-preview', |
The model name gemini-3-pro-preview appears to be a typo and is not a valid Google Gemini model name. This will cause all API calls using this function to fail. You likely intended to use a model from the Gemini 1.5 family, such as gemini-1.5-pro-latest or a specific preview version. Please correct the model name to a valid one.
model: 'gemini-1.5-pro-latest',
| * Implements the Think → Act → Observe → Map loop from PK=998. | ||
| */ | ||
| function buildSystemInstruction(videoUrl: string): string { | ||
| const videoId = videoUrl.match(/[?&]v=([^&]+)/)?.[1] || videoUrl; |
The videoUrl parameter is directly embedded into the systemInstruction for the Gemini model. An attacker can provide a malicious URL containing instructions to override the system prompt and manipulate the LLM's behavior (Prompt Injection). This is particularly dangerous as the googleSearch tool is enabled, which could be abused to perform arbitrary searches.
const videoIdMatch = videoUrl.match(/[?&]v=([^&]+)/);
const videoId = videoIdMatch ? videoIdMatch[1] : 'Unknown';
if (!videoIdMatch) {
// Handle invalid URL or log warning
}
| }, url); | ||
|
|
||
| return NextResponse.json({ | ||
| id: `pipeline_${Date.now().toString(36)}`, |
Using Date.now().toString(36) to generate an ID is not guaranteed to be unique, especially if multiple requests are processed concurrently. This could lead to collisions and unexpected behavior. It's better to use a cryptographically secure random UUID to ensure uniqueness.
- id: `pipeline_${Date.now().toString(36)}`,
+ id: `pipeline_${crypto.randomUUID()}`,
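The collision risk described above is easy to reproduce. The following sketch compares the two ID schemes; `timestampId`, `uuidId`, and `countDistinct` are hypothetical names used only for this illustration, and `randomUUID` is imported from `node:crypto` on the assumption that the route runs in a Node.js runtime.

```typescript
import { randomUUID } from "node:crypto";

// Timestamp-based ID: two calls within the same millisecond yield the same string.
function timestampId(): string {
  return `pipeline_${Date.now().toString(36)}`;
}

// UUID-based ID: a fresh random version 4 UUID on every call.
function uuidId(): string {
  return `pipeline_${randomUUID()}`;
}

// Counts how many distinct IDs a generator produces over n back-to-back calls.
function countDistinct(make: () => string, n: number): number {
  const seen = new Set<string>();
  for (let i = 0; i < n; i++) {
    seen.add(make());
  }
  return seen.size;
}
```

Calling `countDistinct(timestampId, 1000)` in a tight loop will usually report only a handful of distinct values, because most calls share a millisecond, whereas `countDistinct(uuidId, 1000)` reports 1000.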
  console.error('Transcript extraction failed:', e);
}

let extraction: { events?: Array<{ type: string; title: string; description?: string; timestamp?: string; priority?: string }>; actions?: Array<{ title: string }>; summary?: string; topics?: string[] } = {};
The type for the extraction variable is defined inline. According to the repository style guide, functions should have strict type hinting, which is better achieved with clear, reusable, and named types. Defining complex types inline can reduce readability and reusability.
Consider extracting this into a named interface or type alias, for example:
interface ExtractionResult {
  events?: Array<{ type: string; title: string; description?: string; timestamp?: string; priority?: string }>;
  actions?: Array<{ title: string }>;
  summary?: string;
  topics?: string[];
}
// ... later in the function
let extraction: ExtractionResult = {};
References
- All functions must have strict type hinting. Using a named type alias or interface for complex objects improves clarity and reusability, which aligns with the principle of strict typing.
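A named type also gives runtime validation something to target. The sketch below is one possible way to turn untrusted model output into an `ExtractionResult`; `ExtractedEvent`, `parseExtraction`, and the guard functions are hypothetical names, and the interface is repeated here only so the block is self-contained.

```typescript
interface ExtractedEvent {
  type: string;
  title: string;
  description?: string;
  timestamp?: string;
  priority?: string;
}

interface ExtractionResult {
  events?: ExtractedEvent[];
  actions?: Array<{ title: string }>;
  summary?: string;
  topics?: string[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// An optional string field is valid when it is absent or a string.
function isOptionalString(v: unknown): boolean {
  return v === undefined || typeof v === "string";
}

function isEvent(v: unknown): v is ExtractedEvent {
  return (
    isRecord(v) &&
    typeof v.type === "string" &&
    typeof v.title === "string" &&
    isOptionalString(v.description) &&
    isOptionalString(v.timestamp) &&
    isOptionalString(v.priority)
  );
}

function isAction(v: unknown): v is { title: string } {
  return isRecord(v) && typeof v.title === "string";
}

// Parses a JSON string and keeps only the fields that match the named type.
// Malformed JSON or a non-object payload yields an empty result.
function parseExtraction(raw: string): ExtractionResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return {};
  }
  if (!isRecord(data)) return {};

  const result: ExtractionResult = {};
  if (typeof data.summary === "string") result.summary = data.summary;
  if (Array.isArray(data.topics)) {
    result.topics = data.topics.filter((t): t is string => typeof t === "string");
  }
  if (Array.isArray(data.events)) result.events = data.events.filter(isEvent);
  if (Array.isArray(data.actions)) result.actions = data.actions.filter(isAction);
  return result;
}
```

With this in place, `let extraction: ExtractionResult = parseExtraction(responseText)` keeps the variable strictly typed even when the model returns unexpected fields; `responseText` stands in for whatever string the route receives.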
// ── Full end-to-end pipeline: YouTube URL → deployed software ──
deployPipeline: async (url) => {
  const { addVideo, updateVideo, addActivity } = get();
  const id = Date.now().toString();
Using Date.now().toString() for the id is not guaranteed to be unique. This can lead to issues in React when this ID is used as a key for list items and for state updates, especially if multiple actions are dispatched in quick succession. It's safer to use crypto.randomUUID() to ensure uniqueness.
- const id = Date.now().toString();
+ const id = crypto.randomUUID();
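One caveat for client-side state: browsers expose `crypto.randomUUID()` only in secure contexts, so a store that may run on a plain-HTTP origin might want a fallback. The sketch below is one way to do that; `newClientId` is a hypothetical helper, and its fallback branch is not cryptographically random. It only aims to be unique within a session by combining a timestamp, a counter, and a random suffix.

```typescript
// Session-local counter so that fallback IDs differ even within one millisecond.
let fallbackCounter = 0;

// Hypothetical helper: prefers crypto.randomUUID() and falls back to a
// timestamp + counter + random suffix when the API is unavailable.
function newClientId(): string {
  const c = (globalThis as { crypto?: { randomUUID?: () => string } }).crypto;
  if (c && typeof c.randomUUID === "function") {
    return c.randomUUID();
  }
  fallbackCounter += 1;
  const time = Date.now().toString(36);
  const count = fallbackCounter.toString(36);
  const rand = Math.random().toString(36).slice(2, 10);
  return `${time}-${count}-${rand}`;
}
```

In the store, `const id = newClientId();` would replace the timestamp-based ID, which keeps React keys and state updates distinct even when several actions are dispatched in quick succession.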
This pull request improves the application's AI-driven video analysis pipeline, enhances transcript extraction, and updates dependencies and configuration files to support the new features. The main focus is robust, multi-strategy extraction of structured data from YouTube videos, with graceful fallback between the OpenAI and Gemini providers, plus a full end-to-end pipeline for video-to-software generation.
Key changes:
1. End-to-End Pipeline and Video Analysis Enhancements
- /api/pipeline endpoint implementing an end-to-end pipeline: YouTube URL → Video Analysis → Code Generation → Deployment → Live URL, with fallback between a backend FastAPI pipeline and Gemini-based analysis if the backend is unavailable.
- /api/extract-events/route.ts now supports both OpenAI and Gemini for structured event extraction, including a new Gemini schema using the @google/genai Type system and direct video analysis via Google Search when only a video URL is provided. [1] [2] [3]
2. Transcript Extraction Improvements
- /api/transcribe/route.ts now uses a multi-strategy approach for transcript extraction: it tries the backend YouTube transcript API, then Gemini with Google Search, then OpenAI web search, and finally direct audio transcription, with YouTube metadata enrichment for better context. [1] [2] [3]
3. Dependency and Configuration Updates
- @google/genai as a new dependency in apps/web/package.json to enable advanced Gemini features.
- .firebase/.graphqlrc and .firebaserc configuration files for improved project setup and GraphQL schema management. [1] [2]
4. Developer Tooling
- .vscode/settings.json now enables automatic outline generation for Gemini Code Assist, enhancing the developer experience.
These changes collectively improve the reliability, flexibility, and capabilities of the application's AI-powered video analysis and code generation workflows.