
feat: Gemini agentic video analysis with Google Search grounding #47

Merged
groupthinking merged 1 commit into main from fix/video-workflow-transcript
Feb 28, 2026

Conversation

@groupthinking (Owner)

What

Implements the UVAI PK=998 pattern: uses Gemini with the googleSearch tool as the primary video analysis strategy. A single API call handles both transcription and event extraction.

Changes

  • gemini-video-analyzer.ts (NEW): Agentic engine using systemInstruction with Think→Act→Observe→Map loop, responseSchema for structured output, googleSearch grounding to retrieve transcripts/descriptions/chapters from YouTube
  • youtube-metadata.ts (NEW): Scrapes YouTube page for title, description, chapters, channel — no API key needed
  • /api/video: Strategy 2 = Gemini agentic (primary), Strategy 3 = transcribe→extract chain (fallback)
  • /api/transcribe: Removed the broken fileData.fileUri approach; Gemini Google Search grounding is now primary, with OpenAI web search as a fallback guarded by garbage detection
  • /api/extract-events: Now accepts videoUrl without requiring transcript — direct Gemini analysis via Google Search
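For orientation, the two metadata helpers that youtube-metadata.ts is described as providing might look roughly like this. This is a sketch under assumptions: the function names match the bullet above, but the URL patterns and the Chapter shape are illustrative, not the PR's actual code.

```typescript
// Illustrative sketches of the youtube-metadata.ts helpers described above.
// The regexes and the Chapter shape are assumptions, not the PR's code.

export function extractVideoId(url: string): string | null {
  const patterns = [
    /[?&]v=([A-Za-z0-9_-]{11})/,      // youtube.com/watch?v=ID
    /youtu\.be\/([A-Za-z0-9_-]{11})/, // youtu.be/ID
    /\/embed\/([A-Za-z0-9_-]{11})/,   // youtube.com/embed/ID
    /\/shorts\/([A-Za-z0-9_-]{11})/,  // youtube.com/shorts/ID
  ];
  for (const pattern of patterns) {
    const match = url.match(pattern);
    if (match) return match[1];
  }
  return null;
}

export interface Chapter {
  timestamp: string; // "MM:SS" or "HH:MM:SS"
  title: string;
}

export function parseChapters(description: string): Chapter[] {
  // Chapter lines in descriptions typically look like "02:15 Project setup".
  const chapters: Chapter[] = [];
  for (const line of description.split('\n')) {
    const match = line.match(/^\s*((?:\d{1,2}:)?\d{1,2}:\d{2})\s+(.+)$/);
    if (match) chapters.push({ timestamp: match[1], title: match[2].trim() });
  }
  return chapters;
}
```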

Why

The old pipeline was broken:

  1. fileData.fileUri doesn't work with YouTube URLs (expects gs:// URIs)
  2. OpenAI web search returned garbage like "click Show Transcript" instead of actual content
  3. extract-events required a transcript string, failing when none was available

Testing

  • TypeScript compiles clean
  • Tested with real YouTube URLs — OpenAI fallback returns actual content (not garbage)
  • Garbage detection filter rejects results containing "click Show Transcript" etc.
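The garbage-detection filter exercised above can be sketched as follows. The marker substrings come from this PR's description; the short-text heuristic and the function name isGarbageTranscript are illustrative, not necessarily the code as merged.

```typescript
// Sketch of the garbage-detection filter described above.
// Marker list from the PR description; the 300-char cutoff is an assumption.
const GARBAGE_MARKERS = [
  'click show transcript',
  'click on the three dots',
  'steps to find',
];

export function isGarbageTranscript(text: string): boolean {
  const lower = text.toLowerCase();
  if (GARBAGE_MARKERS.some((marker) => lower.includes(marker))) return true;
  // Very short responses that merely mention "transcript" are usually
  // instructions for finding one, not actual content.
  return text.length < 300 && lower.includes('transcript');
}
```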

⚠️ Note

Both GEMINI_API_KEY and GOOGLE_API_KEY are currently expired. The Gemini agentic strategy will activate once keys are renewed at https://aistudio.google.com/apikey. The fallback chain (OpenAI) works correctly in the meantime.

…nding

- Create gemini-video-analyzer.ts: single Gemini call with googleSearch
  tool for transcript extraction AND event analysis (PK=998 pattern)
- Add youtube-metadata.ts: scrapes title, description, chapters from
  YouTube without API key
- Update /api/video: Gemini agentic analysis as primary strategy,
  transcribe→extract chain as fallback
- Fix /api/transcribe: remove broken fileData.fileUri, use Gemini
  Google Search grounding as primary, add metadata context, filter
  garbage OpenAI results
- Fix /api/extract-events: accept videoUrl without requiring transcript,
  direct Gemini analysis via Google Search when no transcript available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 28, 2026 21:20
vercel bot commented Feb 28, 2026

The latest updates on your projects.

Project: v0-uvai
Deployment: Building
Actions: Preview, Comment, Open in v0
Updated (UTC): Feb 28, 2026 9:20pm

@groupthinking groupthinking merged commit d0cacd1 into main Feb 28, 2026
12 of 15 checks passed
@groupthinking groupthinking deleted the fix/video-workflow-transcript branch February 28, 2026 21:20
@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the video analysis capabilities by integrating a new Gemini-powered agentic approach. The core change shifts the primary video processing strategy to leverage Gemini's Google Search tool for direct content extraction, bypassing previous limitations with transcript availability and unreliable web search results. This update aims to provide more robust and accurate video insights by directly grounding the AI analysis with comprehensive YouTube metadata.

Highlights

  • Gemini Agentic Engine: A new agentic engine (gemini-video-analyzer.ts) was introduced, utilizing Gemini with systemInstruction, a Think→Act→Observe→Map loop, responseSchema for structured output, and googleSearch grounding to retrieve YouTube transcripts, descriptions, and chapters.
  • YouTube Metadata Scraping: A new utility (youtube-metadata.ts) was added to scrape YouTube pages for video title, description, chapters, and channel information without requiring an API key.
  • Primary Video Analysis Strategy: The /api/video endpoint now uses the Gemini agentic analysis as its primary strategy, with the existing transcribe→extract chain serving as a fallback.
  • Improved Transcription Endpoint: The /api/transcribe endpoint was updated to remove the broken fileData.fileUri approach, making Gemini Google Search grounding the primary method and improving the OpenAI web search fallback with garbage detection.
  • Flexible Event Extraction: The /api/extract-events endpoint now accepts a videoUrl directly, enabling Gemini to perform analysis without a pre-existing transcript.
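The single-call pattern these highlights describe can be sketched as a request builder. The field names and model string follow the @google/genai snippets quoted elsewhere in this review; the buildAnalysisRequest helper itself is hypothetical, and the actual network call is omitted.

```typescript
// Illustrative request builder for the single grounded Gemini call.
// Field names mirror @google/genai usage shown in this PR; the helper
// itself is hypothetical.
export function buildAnalysisRequest(videoUrl: string, systemInstruction: string) {
  return {
    model: 'gemini-2.5-flash',
    contents: `Perform Agentic Grounding for Video: ${videoUrl}`,
    config: {
      systemInstruction,             // the Think→Act→Observe→Map loop prompt
      tools: [{ googleSearch: {} }], // grounding via Google Search
    },
  };
}

// Usage (network call omitted):
// const response = await ai.models.generateContent(buildAnalysisRequest(url, sys));
```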
Changelog
  • apps/web/src/app/api/extract-events/route.ts
    • Modified the endpoint to accept either a transcript string or a video URL, enabling direct Gemini analysis.
    • Implemented a new logic path for Gemini to perform direct video analysis using its googleSearch tool when a video URL is provided without a transcript.
    • Updated the error message for missing input to reflect the new videoUrl option.
    • Adjusted the final error message to emphasize the requirement for GEMINI_API_KEY.
  • apps/web/src/app/api/transcribe/route.ts
    • Imported fetchYouTubeMetadata and formatMetadataAsContext for enhanced video information retrieval.
    • Introduced a step to fetch YouTube metadata (title, description, chapters) for a given URL.
    • Replaced the previous Gemini fileData.fileUri strategy with a new Gemini Google Search grounding approach, making it the primary method for YouTube transcription.
    • Enhanced the OpenAI web search fallback by incorporating fetched YouTube metadata into the prompt and adding logic to detect and reject 'garbage' responses (e.g., instructions on how to find a transcript).
  • apps/web/src/app/api/video/route.ts
    • Imported analyzeVideoWithGemini to enable the new agentic analysis.
    • Introduced a new 'Strategy 2: Gemini Agentic Analysis' as the primary method for processing videos, which performs a single API call for both transcription and extraction.
    • Updated the previous 'Frontend-only pipeline' to 'Strategy 3' and designated it as a fallback.
    • Adjusted event publishing and error messages to reflect the new strategy prioritization and improved error handling.
  • apps/web/src/lib/gemini-video-analyzer.ts
    • Added a new file implementing the agentic video intelligence engine using Gemini and Google Search.
    • Defined VideoAnalysisResult interface for structured output.
    • Created a responseSchema using @google/genai Type system for consistent data extraction.
    • Implemented buildSystemInstruction to guide the Gemini model through a Think→Act→Observe→Map loop for video analysis.
    • Provided analyzeVideoWithGemini function to execute the deep agentic analysis, handling transcription and extraction in one call.
  • apps/web/src/lib/youtube-metadata.ts
    • Added a new file containing utilities for scraping YouTube video metadata.
    • Implemented extractVideoId to parse video IDs from various YouTube URL formats.
    • Developed parseChapters to extract chapter timestamps and titles from video descriptions.
    • Created fetchYouTubeMetadata to scrape YouTube watch pages for title, channel, description, and chapters without an API key.
    • Provided formatMetadataAsContext to format the fetched metadata into a text block suitable for AI model input.
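The formatMetadataAsContext utility mentioned in this changelog could plausibly look like the sketch below. The YouTubeMetadata field names and the exact text layout are assumptions, not the merged code.

```typescript
// Plausible sketch of formatMetadataAsContext from youtube-metadata.ts.
// The metadata shape and output layout are assumptions.
export interface YouTubeMetadata {
  title: string;
  channel: string;
  description: string;
  chapters: { timestamp: string; title: string }[];
}

export function formatMetadataAsContext(meta: YouTubeMetadata): string {
  const lines = [`Title: ${meta.title}`, `Channel: ${meta.channel}`];
  if (meta.chapters.length > 0) {
    lines.push('Chapters:');
    for (const chapter of meta.chapters) {
      lines.push(`  [${chapter.timestamp}] ${chapter.title}`);
    }
  }
  if (meta.description) {
    lines.push('Description:', meta.description);
  }
  return lines.join('\n');
}
```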
Activity
  • No human activity has been recorded on this pull request yet.

@gemini-code-assist bot left a comment


Code Review

This pull request significantly improves the video analysis pipeline by implementing an agentic approach with Gemini and Google Search grounding, introducing gemini-video-analyzer.ts and youtube-metadata.ts, and refactoring API routes for this new strategy. However, a security audit identified critical vulnerabilities, including a Server-Side Request Forgery (SSRF) in the /api/transcribe route due to an unvalidated audioUrl, multiple Prompt Injection instances in the new Gemini-based analysis logic, and a risk of Host Header Injection leading to SSRF in the /api/video route. Beyond security, a critical code issue involves the use of an incorrect Gemini model name across multiple files, which will cause the primary analysis path to fail at runtime. Additional suggestions for robustness and maintainability are also included.

const systemInstruction = buildSystemInstruction(videoUrl);

const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',

critical

The model name gemini-2.5-flash is incorrect. This will cause all calls to analyzeVideoWithGemini to fail, making this new agentic feature non-functional. Please correct this to a valid model name, such as gemini-1.5-flash-latest.

Suggested change
- model: 'gemini-2.5-flash',
+ model: 'gemini-1.5-flash-latest',

try {
const ai = getGemini();
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',

high

The model name gemini-2.5-flash appears to be incorrect and will likely cause the API call to fail. The current flash model is named gemini-1.5-flash-latest. It's advisable to use a constant for model names to ensure consistency and avoid such errors, as this typo is present in multiple files.

Suggested change
- model: 'gemini-2.5-flash',
+ model: 'gemini-1.5-flash-latest',

],
},
],
model: 'gemini-2.5-flash',

high

The model name gemini-2.5-flash is not a valid model identifier and will cause this API call to fail. Please update it to a correct model name, such as gemini-1.5-flash-latest, to ensure the primary transcription strategy functions correctly.

Suggested change
- model: 'gemini-2.5-flash',
+ model: 'gemini-1.5-flash-latest',

Comment on lines +193 to +206
contents: `${SYSTEM_PROMPT}\n\nAnalyze this YouTube video and extract structured data.
Use your Google Search tool to find the video's transcript, description, and chapter content.

Video URL: ${videoUrl}
${videoTitle ? `Video Title: ${videoTitle}` : ''}

Extract events, actions, summary, and topics from the actual video content found via search.
Respond with ONLY valid JSON matching this structure:
{
"events": [{"type": "action|topic|insight|tool|resource", "title": "...", "description": "...", "timestamp": "02:15" or null, "priority": "high|medium|low"}],
"actions": [{"title": "...", "description": "...", "category": "setup|build|deploy|learn|research|configure", "estimatedMinutes": number or null}],
"summary": "2-3 sentence summary",
"topics": ["topic1", "topic2"]
}`,

security-medium medium

Untrusted user input (videoUrl and videoTitle) is directly embedded into the prompt for direct video analysis. This allows for prompt injection attacks that could manipulate the LLM's behavior or its use of the googleSearch tool.

Comment on lines +116 to +132
contents: `You are a video transcription assistant with access to Google Search.

For the following YouTube video, use your googleSearch tool to find the ACTUAL transcript,
description, and chapter content. The video creator often provides detailed descriptions
with chapter breakdowns — USE that metadata as high-quality structured content.

${metadataContext ? `KNOWN VIDEO METADATA:\n${metadataContext}\n` : ''}
Video URL: ${url}

INSTRUCTIONS:
1. Search for the video's transcript using Google Search.
2. If a spoken transcript is available, return it verbatim.
3. If not, reconstruct detailed content from the description, chapters, comments,
and related articles found via search.
4. Be thorough — capture ALL key points, technical details, quotes, and actionable insights.
5. Include timestamps in [MM:SS] format where possible.
6. Do NOT return generic advice like "click Show Transcript" — return actual content.`,

security-medium medium

The url parameter and metadata fetched from YouTube (which can be attacker-controlled) are directly concatenated into LLM prompts. This poses a risk of prompt injection, allowing an attacker to manipulate the transcription process or the LLM's behavior.
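One common mitigation for this class of issue is to canonicalize the URL before it ever reaches a prompt, so only a validated video ID survives. The sketch below is illustrative and not a fix present in this PR; the host allowlist and helper name are assumptions.

```typescript
// Illustrative mitigation: rebuild the URL from a validated video ID so no
// attacker-controlled text can ride along into the prompt.
export function canonicalizeYouTubeUrl(raw: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return null; // not a URL at all
  }
  const host = parsed.hostname.toLowerCase();
  let id: string | null = null;
  if (host === 'www.youtube.com' || host === 'youtube.com' || host === 'm.youtube.com') {
    id = parsed.searchParams.get('v');
  } else if (host === 'youtu.be') {
    id = parsed.pathname.slice(1);
  }
  // YouTube video IDs are exactly 11 URL-safe characters.
  if (!id || !/^[A-Za-z0-9_-]{11}$/.test(id)) return null;
  return `https://www.youtube.com/watch?v=${id}`;
}
```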

You are the Agentic Video Intelligence Engine.

MISSION:
1. WATCH the video at ${videoUrl} by searching for its transcript, technical documentation,

security-medium medium

The videoUrl parameter is directly embedded into the system instruction for the Gemini model. This allows for prompt injection attacks that could redefine the model's mission or rules.


const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: `Perform Agentic Grounding for Video: ${videoUrl}`,

security-medium medium

Untrusted user input (videoUrl) is directly concatenated into the user prompt, posing a prompt injection risk.

- await publishEvent(EventTypes.TRANSCRIPT_STARTED, { url, strategy: 'frontend' }, url);
+ await publishEvent(EventTypes.TRANSCRIPT_STARTED, { url, strategy: 'frontend-chain' }, url);
const baseUrl = getBaseUrl(request);
const transcribeRes = await fetch(`${baseUrl}/api/transcribe`, {

security-medium medium

The getBaseUrl function derives the base URL for internal API calls from the request.url, which is influenced by the user-controlled Host header. An attacker can manipulate the Host header to redirect internal fetch calls to an arbitrary external server, potentially leading to SSRF or the exfiltration of sensitive data (like the url or transcript).

Comment on lines +179 to +182
const isGarbage = text.toLowerCase().includes('click show transcript') ||
text.toLowerCase().includes('click on the three dots') ||
text.toLowerCase().includes('steps to find') ||
(text.length < 300 && text.includes('transcript'));

medium

The logic to detect 'garbage' responses is a great addition for robustness. However, the current implementation is a bit difficult to read and maintain as a single long boolean expression. Refactoring this to use an array of substrings would make it cleaner and easier to update in the future.

        const garbageSubstrings = [
          'click show transcript',
          'click on the three dots',
          'steps to find',
        ];
        const lowerCaseText = text.toLowerCase();
        const isGarbage = garbageSubstrings.some(s => lowerCaseText.includes(s)) ||
          (text.length < 300 && lowerCaseText.includes('transcript'));

// Use trusted backend origin instead of deriving from potentially user-controlled request data
const origin = BACKEND_URL;
return NextResponse.json({
id: `vid_${Date.now().toString(36)}`,

medium

Using Date.now().toString(36) for generating an ID is not guaranteed to be unique, which could lead to collisions if the endpoint is called in rapid succession. For generating unique identifiers, it's more robust to use crypto.randomUUID(), which is already used elsewhere in the project for CloudEvents.

Suggested change
- id: `vid_${Date.now().toString(36)}`,
+ id: `vid_${crypto.randomUUID()}`,

Copilot AI left a comment


Pull request overview

This PR implements a Gemini-powered agentic video analysis pipeline as the primary frontend strategy. When the Python backend is unavailable, the system now calls Gemini with Google Search grounding in a single API call that handles both transcription and event extraction, replacing a broken fileData.fileUri approach. OpenAI web search is demoted to a fallback with added garbage-detection filtering.

Changes:

  • New gemini-video-analyzer.ts: Agentic engine using Gemini + Google Search to analyze YouTube videos in one API call.
  • New youtube-metadata.ts: Scrapes YouTube watch pages for title, description, and chapters without an API key.
  • Updated /api/video/route.ts, /api/transcribe/route.ts, /api/extract-events/route.ts: Promotes Gemini as primary strategy, adds garbage detection for OpenAI fallback, relaxes transcript requirement in extract-events.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
apps/web/src/lib/gemini-video-analyzer.ts New agentic engine: single Gemini + Google Search call for video analysis
apps/web/src/lib/youtube-metadata.ts New YouTube page scraper for metadata (title, description, chapters)
apps/web/src/app/api/video/route.ts Inserts Gemini agentic call as Strategy 2 before the transcribe→extract chain
apps/web/src/app/api/transcribe/route.ts Swaps strategy order: Gemini promoted to primary, OpenAI to fallback; adds metadata context
apps/web/src/app/api/extract-events/route.ts Accepts videoUrl without transcript; adds direct Gemini grounding path

success: hasResults,
insights: {
- summary: extraction.summary || (hasResults ? 'Transcript extracted successfully' : 'Could not extract transcript — configure OPENAI_API_KEY or GEMINI_API_KEY'),
+ summary: extraction.summary || (hasResults ? 'Transcript extracted successfully' : 'Could not extract transcript — configure GEMINI_API_KEY'),

Copilot AI Feb 28, 2026


The GEMINI_API_KEY environment variable is now the primary/required key for all frontend strategies (Strategy 2 in /api/video, primary strategy in /api/transcribe, and fallback in /api/extract-events), but it is absent from apps/web/.env.example. New developers or those setting up the environment won't know they need to set it, which will silently degrade the primary pipeline to the fallback chain. GEMINI_API_KEY should be added to .env.example with an appropriate placeholder and comment.

return NextResponse.json({
success: false,
- error: 'No AI API key configured. Set OPENAI_API_KEY or GEMINI_API_KEY.',
+ error: 'No AI API key configured or all extraction attempts failed. Set GEMINI_API_KEY.',

Copilot AI Feb 28, 2026


The error message says "Set GEMINI_API_KEY", but /api/extract-events still supports OPENAI_API_KEY as a working provider for the transcript-based path. A user who only has OPENAI_API_KEY configured and provides a videoUrl without a transcript will see this misleading error. The message should mention both API keys.

Suggested change
- error: 'No AI API key configured or all extraction attempts failed. Set GEMINI_API_KEY.',
+ error: 'No AI API key configured or all extraction attempts failed. Set GEMINI_API_KEY and/or OPENAI_API_KEY.',

Comment on lines +119 to +123
if (process.env.GEMINI_API_KEY) {
try {
await publishEvent(EventTypes.TRANSCRIPT_STARTED, { url, strategy: 'gemini-agentic' }, url);
const startTime = Date.now();
const analysis = await analyzeVideoWithGemini(url, process.env.GEMINI_API_KEY);

Copilot AI Feb 28, 2026


The analyzeVideoWithGemini call has no timeout. Unlike Strategy 1 (which wraps the backend fetch in a 15-second AbortController timeout), the Gemini agentic call can take arbitrarily long — especially because it involves multiple internal Google Search round-trips. Vercel serverless functions have execution limits (typically 10-60 seconds depending on the plan). If the Gemini call runs long, the serverless function will be killed mid-execution with a 504/FUNCTION_INVOCATION_TIMEOUT error, rather than gracefully falling back to Strategy 3. A timeout wrapping this call (e.g., Promise.race with an AbortSignal-based timeout) should be added so the fallback chain is triggered cleanly instead.
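A minimal version of the wrapper this comment suggests could look like the following. The helper name, default duration, and error message are illustrative; the comment's 15-second figure refers to Strategy 1's existing AbortController timeout.

```typescript
// Minimal Promise.race timeout wrapper, along the lines suggested above.
// Name, default, and message are illustrative.
export function withTimeout<T>(
  promise: Promise<T>,
  ms = 15_000,
  label = 'operation',
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
  });
  // Whichever settles first wins; always clear the timer so a pending
  // timeout does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A rejection from the wrapper lets the caller fall through to Strategy 3 cleanly instead of hitting a platform-level function timeout.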

import OpenAI from 'openai';
import { GoogleGenAI } from '@google/genai';
import { NextResponse } from 'next/server';
import { fetchYouTubeMetadata, formatMetadataAsContext } from '@/lib/youtube-metadata';

Copilot AI Feb 28, 2026


The JSDoc comment for POST /api/transcribe still lists the strategy order as it existed before this PR: "2. OpenAI Responses API with web_search" and "3. Gemini fallback". After this change, the order is reversed — Gemini is now strategy 2 (primary) and OpenAI is strategy 3 (fallback). The comment is now incorrect and will mislead future developers.

transcript_source: transcriptSource,
agents_used: ['frontend-pipeline'],
- errors: hasResults ? [] : ['Backend unavailable and transcript extraction failed'],
+ errors: hasResults ? [] : ['All strategies failed — ensure GEMINI_API_KEY is set'],
Copy link

Copilot AI Feb 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message says "ensure GEMINI_API_KEY is set", but the Strategy 3 fallback chain (/api/transcribe → /api/extract-events) also uses OPENAI_API_KEY as a valid fallback provider. A user who only has OPENAI_API_KEY set will see this message even though their setup is partially functional for this fallback path. The message should mention both keys: e.g. "All strategies failed — ensure GEMINI_API_KEY or OPENAI_API_KEY is set".

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +98
* Fetch YouTube video metadata by scraping the watch page.
* No API key required.
*/
export async function fetchYouTubeMetadata(url: string): Promise<YouTubeMetadata | null> {
const videoId = extractVideoId(url);
if (!videoId) return null;

try {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 10_000);

const response = await fetch(`https://www.youtube.com/watch?v=${videoId}`, {
headers: {
'User-Agent': 'Mozilla/5.0 (compatible; EventRelay/2.0)',
'Accept-Language': 'en-US,en;q=0.9',
},
signal: controller.signal,
}).finally(() => clearTimeout(timeout));

if (!response.ok) return null;

const html = await response.text();

// Extract title from og:title
const titleMatch = html.match(/<meta property="og:title" content="([^"]+)"/);
const title = titleMatch?.[1] || '';

// Extract shortDescription from embedded JSON (contains full description)
let description = '';
const descMatch = html.match(/"shortDescription":"((?:[^"\\]|\\.)*)"/);
if (descMatch) {
try {
description = JSON.parse(`"${descMatch[1]}"`);
} catch {
description = descMatch[1].replace(/\\n/g, '\n').replace(/\\"/g, '"');
}
} else {
// Fallback to og:description (truncated)
const ogDesc = html.match(/<meta property="og:description" content="([^"]+)"/);
description = ogDesc?.[1] || '';
}

// Extract channel name
const channelMatch = html.match(/"ownerChannelName":"([^"]+)"/);
const channel = channelMatch?.[1] || '';

Copilot AI Feb 28, 2026


Scraping youtube.com watch pages from a Next.js serverless function is fragile and legally ambiguous. YouTube actively blocks server-side scrapers (bot detection, rate limiting, potential 429s or redirects), and their Terms of Service prohibit scraping without authorization. This approach will likely produce inconsistent results in production — the page structure can change without notice and YouTube's anti-bot measures will interfere. Additionally, parsing embedded JSON from the page ("shortDescription":"...") is brittle and subject to breakage whenever YouTube updates its page structure. Consider using the official YouTube Data API v3 (which provides title, description, and chapters via the videos endpoint) instead of scraping.

Suggested change
* Fetch YouTube video metadata by scraping the watch page.
* No API key required.
*/
export async function fetchYouTubeMetadata(url: string): Promise<YouTubeMetadata | null> {
const videoId = extractVideoId(url);
if (!videoId) return null;
try {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 10_000);
const response = await fetch(`https://www.youtube.com/watch?v=${videoId}`, {
headers: {
'User-Agent': 'Mozilla/5.0 (compatible; EventRelay/2.0)',
'Accept-Language': 'en-US,en;q=0.9',
},
signal: controller.signal,
}).finally(() => clearTimeout(timeout));
if (!response.ok) return null;
const html = await response.text();
// Extract title from og:title
const titleMatch = html.match(/<meta property="og:title" content="([^"]+)"/);
const title = titleMatch?.[1] || '';
// Extract shortDescription from embedded JSON (contains full description)
let description = '';
const descMatch = html.match(/"shortDescription":"((?:[^"\\]|\\.)*)"/);
if (descMatch) {
try {
description = JSON.parse(`"${descMatch[1]}"`);
} catch {
description = descMatch[1].replace(/\\n/g, '\n').replace(/\\"/g, '"');
}
} else {
// Fallback to og:description (truncated)
const ogDesc = html.match(/<meta property="og:description" content="([^"]+)"/);
description = ogDesc?.[1] || '';
}
// Extract channel name
const channelMatch = html.match(/"ownerChannelName":"([^"]+)"/);
const channel = channelMatch?.[1] || '';
* Fetch YouTube video metadata using the official YouTube Data API v3.
*
* Requires a server-side API key exposed as `YOUTUBE_API_KEY` in the environment.
*/
export async function fetchYouTubeMetadata(url: string): Promise<YouTubeMetadata | null> {
const videoId = extractVideoId(url);
if (!videoId) return null;
const apiKey = process.env.YOUTUBE_API_KEY;
if (!apiKey) {
console.warn('[YouTube] YOUTUBE_API_KEY is not configured; cannot fetch metadata.');
return null;
}
try {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 10_000);
const response = await fetch(
`https://www.googleapis.com/youtube/v3/videos?part=snippet&id=${encodeURIComponent(
videoId,
)}&key=${encodeURIComponent(apiKey)}`,
{
signal: controller.signal,
},
).finally(() => clearTimeout(timeout));
if (!response.ok) {
console.warn('[YouTube] API response not ok:', response.status, response.statusText);
return null;
}
const data = await response.json();
if (!data.items || !Array.isArray(data.items) || data.items.length === 0) {
return null;
}
const snippet = data.items[0].snippet;
if (!snippet) {
return null;
}
const title: string = snippet.title ?? '';
const channel: string = snippet.channelTitle ?? '';
const description: string = snippet.description ?? '';

config: {
systemInstruction,
responseMimeType: 'application/json',
responseSchema,

Copilot AI Feb 28, 2026


Gemini does not support combining responseSchema (structured output) with grounding tools like googleSearch in the same request. The Gemini API documentation explicitly states these features are mutually exclusive — using both will result in a runtime API error. The analyzeVideoWithGemini function uses both responseSchema and tools: [{ googleSearch: {} }] together. The same conflict exists in extract-events/route.ts at the direct video analysis block (line 188–219) and in transcribe/route.ts when Google Search grounding is used (though that call does not set responseSchema, only transcribe avoids this). The fix is to choose one: either use responseSchema for structured output (without grounding), or use googleSearch grounding (without responseSchema) and parse the text output manually.

Suggested change
responseSchema,

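If the grounded-tools route is chosen and responseSchema is dropped, the model's free-text reply has to be parsed by hand. A tolerant sketch (the helper name is hypothetical; the approach simply extracts the outermost JSON object from the text):

```typescript
// Hypothetical helper for the "parse the text output manually" option:
// tolerate markdown fences and surrounding prose around the JSON payload.
export function parseJsonFromModelText(text: string): unknown | null {
  // Grab the outermost {...} span so fences and prose around it are ignored.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null; // malformed JSON: let the caller fall back
  }
}
```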
Comment on lines +209 to +210
responseMimeType: 'application/json',
responseSchema: geminiResponseSchema,

Copilot AI Feb 28, 2026


The same incompatibility exists here: responseMimeType: 'application/json', responseSchema: geminiResponseSchema, and tools: [{ googleSearch: {} }] are all set in the same call. Gemini does not allow combining structured JSON output (responseSchema) with grounding tools (googleSearch) in the same request — this will cause a runtime API error. Either remove responseSchema/responseMimeType and parse the free-text response, or remove googleSearch and supply the transcript text directly.

Suggested change
responseMimeType: 'application/json',
responseSchema: geminiResponseSchema,

});

const resultText = response.text || '{}';
return JSON.parse(resultText) as VideoAnalysisResult;
Copilot AI Feb 28, 2026

When response.text is an empty string or undefined, the '{}' fallback makes JSON.parse succeed with an empty object, which is then cast to VideoAnalysisResult without any runtime validation. All required fields (title, summary, transcript, events, actions, topics, architectureCode, ingestScript) are therefore undefined at runtime even though TypeScript believes they are present. Callers in video/route.ts access analysis.transcript?.length, analysis.events?.length, etc. with optional chaining, which masks the problem: Strategy 2 silently "succeeds" with an empty result and returns status: 'complete' with all-empty data. A defensive check (e.g. verifying that analysis.title or analysis.events is non-empty) should be added before returning the success response.

Suggested change
return JSON.parse(resultText) as VideoAnalysisResult;
let parsed: unknown;
try {
  parsed = JSON.parse(resultText);
} catch {
  throw new Error('Failed to parse Gemini video analysis response as JSON');
}
const candidate = parsed as Partial<VideoAnalysisResult> | null;
const hasTitle =
  !!candidate && typeof candidate.title === 'string' && candidate.title.trim().length > 0;
const hasSummary =
  !!candidate && typeof candidate.summary === 'string' && candidate.summary.trim().length > 0;
const hasEvents =
  !!candidate && Array.isArray(candidate.events) && candidate.events.length > 0;
if (!hasTitle || !hasSummary || !hasEvents) {
  throw new Error('Gemini video analysis returned an empty or invalid result');
}
return candidate as VideoAnalysisResult;

groupthinking added a commit that referenced this pull request Mar 4, 2026
…51)

* feat: Initialize PGLite v17 database data files for the dataconnect project.

* feat: enable automatic outline generation for Gemini Code Assist in VS Code settings.

* feat: Add NotebookLM integration with a new processor and `analyze_video_with_notebooklm` MCP tool.

* feat: Add NotebookLM profile data and an ingestion test.

* chore: Update and add generated browser profile files for notebooklm development.

* Update `notebooklm_chrome_profile` internal state and add architectural context documentation and video asset.

* feat: Add various knowledge prototypes for MCP servers and universal automation, archive numerous scripts and documentation, and update local browser profile data.

* chore: Add generated browser profile cache and data for notebooklm.

* Update notebooklm Chrome profile preferences, cache, and session data.

* feat: Update NotebookLM Chrome profile with new cache, preferences, and service worker data.

* feat: Add generated Chrome profile cache and code cache files and update associated profile data.

* Update `notebooklm` Chrome profile cache, code cache, GPU cache, and safe browsing data.

* chore(deps): bump the npm_and_yarn group across 4 directories with 5 updates

Bumps the npm_and_yarn group with 3 updates in the / directory: [ajv](https://github.com/ajv-validator/ajv), [hono](https://github.com/honojs/hono) and [qs](https://github.com/ljharb/qs).
Bumps the npm_and_yarn group with 3 updates in the /docs/knowledge_prototypes/mcp-servers/fetch-mcp directory: [@modelcontextprotocol/sdk](https://github.com/modelcontextprotocol/typescript-sdk), [ajv](https://github.com/ajv-validator/ajv) and [hono](https://github.com/honojs/hono).
Bumps the npm_and_yarn group with 1 update in the /scripts/archive/software-on-demand directory: [ajv](https://github.com/ajv-validator/ajv).
Bumps the npm_and_yarn group with 2 updates in the /scripts/archive/supabase_cleanup directory: [next](https://github.com/vercel/next.js) and [qs](https://github.com/ljharb/qs).


Updates `ajv` from 8.17.1 to 8.18.0
- [Release notes](https://github.com/ajv-validator/ajv/releases)
- [Commits](ajv-validator/ajv@v8.17.1...v8.18.0)

Updates `hono` from 4.11.7 to 4.12.1
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](honojs/hono@v4.11.7...v4.12.1)

Updates `qs` from 6.14.1 to 6.15.0
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](ljharb/qs@v6.14.1...v6.15.0)

Updates `@modelcontextprotocol/sdk` from 1.25.2 to 1.26.0
- [Release notes](https://github.com/modelcontextprotocol/typescript-sdk/releases)
- [Commits](modelcontextprotocol/typescript-sdk@v1.25.2...v1.26.0)

Updates `ajv` from 8.17.1 to 8.18.0
- [Release notes](https://github.com/ajv-validator/ajv/releases)
- [Commits](ajv-validator/ajv@v8.17.1...v8.18.0)

Updates `hono` from 4.11.5 to 4.12.1
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](honojs/hono@v4.11.7...v4.12.1)

Updates `qs` from 6.14.1 to 6.15.0
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](ljharb/qs@v6.14.1...v6.15.0)

Updates `ajv` from 8.17.1 to 8.18.0
- [Release notes](https://github.com/ajv-validator/ajv/releases)
- [Commits](ajv-validator/ajv@v8.17.1...v8.18.0)

Updates `next` from 15.4.10 to 15.5.10
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](vercel/next.js@v15.4.10...v15.5.10)

Updates `qs` from 6.14.1 to 6.15.0
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](ljharb/qs@v6.14.1...v6.15.0)

---
updated-dependencies:
- dependency-name: ajv
  dependency-version: 8.18.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: hono
  dependency-version: 4.12.1
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: qs
  dependency-version: 6.15.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: "@modelcontextprotocol/sdk"
  dependency-version: 1.26.0
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: ajv
  dependency-version: 8.18.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: hono
  dependency-version: 4.12.1
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: qs
  dependency-version: 6.15.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: ajv
  dependency-version: 8.18.0
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: next
  dependency-version: 15.5.10
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: qs
  dependency-version: 6.15.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore(deps): bump minimatch

Bumps the npm_and_yarn group with 1 update in the /scripts/archive/supabase_cleanup directory: [minimatch](https://github.com/isaacs/minimatch).


Updates `minimatch` from 3.1.2 to 3.1.4
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](isaacs/minimatch@v3.1.2...v3.1.4)

---
updated-dependencies:
- dependency-name: minimatch
  dependency-version: 3.1.4
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore(deps): bump the npm_and_yarn group across 2 directories with 1 update

Bumps the npm_and_yarn group with 1 update in the / directory: [hono](https://github.com/honojs/hono).
Bumps the npm_and_yarn group with 1 update in the /docs/knowledge_prototypes/mcp-servers/fetch-mcp directory: [hono](https://github.com/honojs/hono).


Updates `hono` from 4.12.1 to 4.12.2
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](honojs/hono@v4.12.1...v4.12.2)

Updates `hono` from 4.12.1 to 4.12.2
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](honojs/hono@v4.12.1...v4.12.2)

---
updated-dependencies:
- dependency-name: hono
  dependency-version: 4.12.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: hono
  dependency-version: 4.12.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>

* feat: enable frontend-only video ingestion pipeline for Vercel deployment

The core pipeline previously required the Python backend to be running.
When deployed to Vercel (https://v0-uvai.vercel.app/), the backend is
unavailable, causing all video analysis to fail immediately.

Changes:
- /api/video: Falls back to frontend-only pipeline (transcribe + extract)
  when the Python backend is unreachable, with 15s timeout
- /api/transcribe: Adds Gemini fallback when OpenAI is unavailable, plus
  8s timeout on backend probe to avoid hanging on Vercel
- layout.tsx: Loads Google Fonts via <link> instead of next/font/google
  to avoid build failures in offline/sandboxed CI environments
- page.tsx: Replace example URLs with technical content (3Blue1Brown
  neural networks, Karpathy LLM intro) instead of rick roll / zoo videos
- gemini_service.py: Gate Vertex AI import behind GOOGLE_CLOUD_PROJECT
  env var to prevent 30s+ hangs on the GCE metadata probe
- agent_gap_analyzer.py: Fix f-string backslash syntax errors (Python 3.11)

https://claude.ai/code/session_015Pd3a6hinTenCNrPRGiZqE

* Potential fix for code scanning alert no. 4518: Server-side request forgery

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Initial plan

* Potential fix for code scanning alert no. 4517: Server-side request forgery

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Initial plan

* Fix review feedback: timeout cleanup, transcript_segments shape, ENABLE_VERTEX_AI boolean parsing

Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>

* fix: clearTimeout in finally blocks, transcript_segments shape, ENABLE_VERTEX_AI boolean parsing

Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>

* Update src/youtube_extension/services/ai/gemini_service.py

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

* Update apps/web/src/app/api/video/route.ts

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

* Update apps/web/src/app/api/video/route.ts

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

* Initial plan

* Initial plan

* Fix: move clearTimeout into .finally() to prevent timer leaks on fetch abort/error

Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>

* Fix clearTimeout not called in finally blocks for AbortController timeouts

Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
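The pattern fixed in these commits — aborting a fetch after a deadline while guaranteeing the timer is cleared on every exit path — can be sketched generically. The names below are illustrative, not the repo's actual code:

```typescript
// Illustrative sketch: run an abortable async operation with a deadline,
// clearing the timer in `finally` so it cannot leak on resolve OR reject.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await run(controller.signal);
  } finally {
    clearTimeout(timer); // runs on success, throw, and abort alike
  }
}
```

For example, `withTimeout((signal) => fetch(url, { signal }), 8000)` matches the 8-second backend probe described above; placing clearTimeout after the `return` (instead of in `finally`) is exactly the leak these commits close.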

* Fix: Relative URLs in server-side fetch calls fail in production - fetch('/api/transcribe') and fetch('/api/extract-events') use relative URLs which don't resolve correctly in server-side Next.js code on production deployments like Vercel.

This commit fixes the issue reported at apps/web/src/app/api/video/route.ts:101

## Bug Analysis

**Why it happens:**
In Next.js API routes running on the server (Node.js runtime), the `fetch()` API requires absolute URLs. Unlike browsers which have an implicit base URL (the current origin), server-side code has no context for resolving relative URLs like `/api/transcribe`. The Node.js fetch implementation will fail to resolve these relative paths, resulting in TypeError or connection errors.

**When it manifests:**
- **Development (localhost:3000)**: Works accidentally because the request URL contains the host
- **Production (Vercel)**: Fails because the relative URL cannot be resolved to a valid absolute URL without proper host context

**What impact it has:**
The frontend-only pipeline fallback (Strategy 2) in lines 101-132 is completely broken in production. When the backend is unavailable (common on Vercel), the code attempts to use `/api/transcribe` and `/api/extract-events` serverless functions but fails due to unresolvable relative URLs. This causes the entire video analysis endpoint to fail when the backend is unavailable.

## Fix Explanation

**Changes made:**
1. Added a `getBaseUrl(request: Request)` helper function that extracts the absolute base URL from the incoming request object using `new URL(request.url)`
2. Updated line 108: `fetch('/api/transcribe', ...)` → `fetch(`${baseUrl}/api/transcribe`, ...)`
3. Updated line 127: `fetch('/api/extract-events', ...)` → `fetch(`${baseUrl}/api/extract-events`, ...)`

**Why it solves the issue:**
- The incoming `request` object contains the full URL including protocol and host
- By constructing an absolute URL from the request, we ensure the fetch calls work in both development and production
- This approach is more reliable than environment variables because it uses the actual request context, handling reverse proxies and different deployment configurations correctly
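A minimal sketch of the helper described above (the actual route code is not shown in this thread; `URL.origin` yields protocol plus host, including any non-default port):

```typescript
// Sketch: derive an absolute base URL from the incoming request so that
// server-side fetch('/api/...') calls can be made absolute.
function getBaseUrl(request: { url: string }): string {
  return new URL(request.url).origin;
}
```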

Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
Co-authored-by: groupthinking <garveyht@gmail.com>

* Initial plan

* chore(deps): bump the npm_and_yarn group across 1 directory with 1 update

Bumps the npm_and_yarn group with 1 update in the /docs/knowledge_prototypes/mcp-servers/fetch-mcp directory: [minimatch](https://github.com/isaacs/minimatch).


Updates `minimatch` from 3.1.2 to 3.1.5
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](isaacs/minimatch@v3.1.2...v3.1.5)

Updates `minimatch` from 5.1.6 to 5.1.9
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](isaacs/minimatch@v3.1.2...v3.1.5)

---
updated-dependencies:
- dependency-name: minimatch
  dependency-version: 3.1.5
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: minimatch
  dependency-version: 5.1.9
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix: validate BACKEND_URL before using it

Skip backend calls entirely when BACKEND_URL is not configured or
contains an invalid value (like a literal ${...} template string).
This prevents URL parse errors on Vercel where the env var may not
be set.

https://claude.ai/code/session_015Pd3a6hinTenCNrPRGiZqE

* fix: resolve embeddings package build errors (#41)

- Create stub types for Firebase Data Connect SDK in src/dataconnect-generated/
- Fix import path from ../dataconnect-generated to ./dataconnect-generated (rootDir constraint)
- Add explicit type assertions for JSON responses (predictions, access_token)
- All 6 TypeScript errors resolved, clean build verified

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: Gemini SDK upgrade + VideoPack schema alignment (#43)

* chore: Update generated Chrome profile cache and session data for notebooklm.

* chore: refresh notebooklm Chrome profile data, including Safe Browsing lists, caches, and session files.

* Update local application cache and database files within the NotebookLM Chrome profile.

* chore: update Chrome profile cache and Safe Browsing data files.

* feat: upgrade Gemini to @google/genai SDK with structured output, search grounding, video URL processing, and extend VideoPack schema

- Upgrade extract-events/route.ts from @google/generative-ai to @google/genai
- Add Gemini responseSchema with Type system for structured output enforcement
- Add Google Search grounding (googleSearch tool) to Gemini calls
- Upgrade transcribe/route.ts to @google/genai with direct YouTube URL processing via fileData
- Add Gemini video URL fallback chain: direct video → text+search → other strategies
- Extend VideoPackV0 schema with Chapter, CodeCue, Task models
- Update versioning shim for new fields
- Export new types from videopack __init__

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: wire CloudEvents pipeline + Chrome Built-in AI fallback (#44)

- Add TypeScript CloudEvents publisher (apps/web/src/lib/cloudevents.ts)
  emitting standardized events at each video processing stage
- Wire CloudEvents into /api/video route (both backend + frontend strategies)
- Wire CloudEvents into FastAPI backend router (process_video_v1 endpoint)
- Add Chrome Built-in AI service (Prompt API + Summarizer API) for
  on-device client-side transcript analysis when API keys are unavailable
- Add useBuiltInAI React hook for component integration
- Add .next/ to .gitignore

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: wire A2A inter-agent messaging into orchestrator + API (#45)

- Add A2AContextMessage dataclass to AgentOrchestrator for lightweight
  inter-agent context sharing during parallel task execution
- Auto-broadcast agent results to peer agents after parallel execution
- Add send_a2a_message() and get_a2a_log() methods to orchestrator
- Add POST /api/v1/agents/a2a/send endpoint for frontend-to-agent messaging
- Add GET /api/v1/agents/a2a/log endpoint to query message history
- Extend frontend agentService with sendA2AMessage() and getA2ALog()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add LiteRT-LM setup script and update README (#46)

- Add setup.sh to download lit CLI binary and .litertlm model
- Support macOS arm64 and x86_64 architectures
- Auto-generate .env with LIT_BINARY_PATH and LIT_MODEL_PATH
- Add .gitignore for bin/, models/, .env
- Update README with Quick Setup section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: implement Gemini agentic video analysis with Google Search grounding (#47)

- Create gemini-video-analyzer.ts: single Gemini call with googleSearch
  tool for transcript extraction AND event analysis (PK=998 pattern)
- Add youtube-metadata.ts: scrapes title, description, chapters from
  YouTube without API key
- Update /api/video: Gemini agentic analysis as primary strategy,
  transcribe→extract chain as fallback
- Fix /api/transcribe: remove broken fileData.fileUri, use Gemini
  Google Search grounding as primary, add metadata context, filter
  garbage OpenAI results
- Fix /api/extract-events: accept videoUrl without requiring transcript,
  direct Gemini analysis via Google Search when no transcript available
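The garbage filter mentioned above could look roughly like this. The function name and most markers are assumptions; only "click Show Transcript" is confirmed by the PR description:

```typescript
// Hypothetical sketch of the garbage-detection filter: reject "transcripts"
// that are really instructions about the YouTube UI rather than content.
const GARBAGE_MARKERS = [
  'click show transcript', // marker confirmed by the PR description
  'show transcript button', // assumed additional marker
];

function isGarbageTranscript(text: string): boolean {
  const lower = text.toLowerCase();
  // Very short results are also treated as unusable (threshold is an assumption).
  return text.trim().length < 40 || GARBAGE_MARKERS.some((m) => lower.includes(m));
}
```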

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: support Vertex_AI_API_KEY as Gemini key fallback

Create shared gemini-client.ts that resolves API key from:
GEMINI_API_KEY → GOOGLE_API_KEY → Vertex_AI_API_KEY

All API routes now use the shared client instead of
hardcoding process.env.GEMINI_API_KEY.
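The fallback order can be sketched as follows (env var names are from the commit message; the function name is hypothetical, and a later commit in this thread reorders the priority to put Vertex_AI_API_KEY first):

```typescript
// Sketch: resolve the Gemini API key in the order described above,
// skipping unset or empty values.
function resolveGeminiApiKey(env: Record<string, string | undefined>): string | undefined {
  return [env.GEMINI_API_KEY, env.GOOGLE_API_KEY, env.Vertex_AI_API_KEY].find((k) => !!k);
}
```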

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: use Vertex AI Express Mode for Vertex_AI_API_KEY

When only Vertex_AI_API_KEY is set (no GEMINI_API_KEY), the client
now initializes in Vertex AI mode with vertexai: true + apiKey.
Uses project uvai-730bb and us-central1 as defaults.

Also added GOOGLE_CLOUD_PROJECT env var to Vercel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Vertex AI Express Mode compatibility — remove responseSchema+googleSearch conflict (#48)

Vertex AI does not support controlled generation (responseSchema) combined
with the googleSearch tool. This caused 400 errors on every Gemini call.

Changes:
- gemini-client.ts: Prioritize Vertex_AI_API_KEY, support GOOGLE_GENAI_USE_VERTEXAI env var
- gemini-video-analyzer.ts: Remove responseSchema, enforce JSON via prompt instructions
- extract-events/route.ts: Same fix for extractWithGemini and inline Gemini calls
- Strip markdown code fences from responses before JSON parsing

Tested end-to-end with Vertex AI Express Mode key against multiple YouTube videos.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: restore full PK=998 pattern — responseSchema + googleSearch + gemini-3-pro-preview (#49)

The previous fix (PR #48) was a shortcut — it removed responseSchema when
the real issue was using gemini-2.5-flash which doesn't support
responseSchema + googleSearch together on Vertex AI.

gemini-3-pro-preview DOES support the combination. This commit restores
the exact PK=998 pattern:

- gemini-video-analyzer.ts: Restored responseSchema with Type system,
  responseMimeType, e22Snippets field, model → gemini-3-pro-preview
- extract-events/route.ts: Restored geminiResponseSchema, Type import,
  responseMimeType, model → gemini-3-pro-preview
- transcribe/route.ts: model → gemini-3-pro-preview

Tested with Vertex AI Express Mode key on two YouTube videos.
Both return structured JSON with events, transcript, actions,
codeMapping, cloudService, e22Snippets, architectureCode, ingestScript.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: end-to-end pipeline — YouTube URL to deployed software (#50)

- Add /api/pipeline route for full end-to-end pipeline
  (video analysis → code generation → GitHub repo → Vercel deploy)
- Add deployPipeline() action to dashboard store with stage tracking
- Add 🚀 Deploy button to dashboard alongside Analyze
- Show pipeline results (live URL, GitHub repo, framework) in video cards
- Fix deployment_manager import path in video_processing_service
- Wire pipeline to backend /api/v1/video-to-software endpoint
- Fallback to Gemini-only analysis when no backend available

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add writable directories to Docker image for deployment pipeline

Create /app/generated_projects, /app/youtube_processed_videos, and
/tmp/uvai_data directories in Dockerfile to fix permission denied
errors in the deployment and video processing pipeline on Railway.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: security hardening, video-specific codegen, API consistency

- CORS: replace wildcard/glob with explicit allowed origins in both entry points
- Rate limiting: enable 60 req/min with 15 burst on backend
- API auth: add optional X-API-Key middleware for pipeline endpoints
- Codegen: generate video-specific HTML/CSS/JS from analysis output
- API: accept both 'url' and 'video_url' via Pydantic alias
- Deploy: fix Vercel REST API payload format (gitSource instead of gitRepository)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: Vercel deployment returning empty live_url

Root causes fixed:
- Case mismatch in _poll_deployment_status: compared lowercased status
  against uppercase success_statuses list, so READY was never matched
- Vercel API returns bare domain URLs without https:// prefix; added
  _ensure_https() to normalize them
- Poll requests were missing auth headers, causing 401 failures
- _deploy_files_directly fallback returned fake simulated URLs that
  masked real failures; removed in favor of proper error reporting
- _generate_deployment_urls only returned URLs from 'success' status
  deployments, discarding useful fallback URLs from failed deployments
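Two of the root causes above are small enough to sketch (the repo's helpers are Python, e.g. `_ensure_https`; these TypeScript versions are illustrative only):

```typescript
// Vercel can return bare domains like "myapp.vercel.app";
// prefix https:// when a scheme is missing.
function ensureHttps(url: string): string {
  return /^https?:\/\//i.test(url) ? url : `https://${url}`;
}

// Compare deployment statuses case-insensitively so "READY" matches "ready" —
// the case mismatch described above meant READY was never recognized.
function isSuccessStatus(status: string, successStatuses: string[]): boolean {
  const s = status.toLowerCase();
  return successStatuses.some((ok) => ok.toLowerCase() === s);
}
```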

Improvements:
- On API failure (permissions, plan limits), return a Vercel import URL
  the user can click to deploy manually instead of an empty string
- Support VERCEL_ORG_ID team scoping on deploy and poll endpoints
- Use readyState field (Vercel v13 API) for initial status check
- Add 'canceled' to failure status list in poll loop
- Poll failures are now non-fatal; initial URL is used as fallback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: harden slim entry point — CORS, rate limiting, auth, security headers

- Add uvaiio.vercel.app to CORS allowed origins
- Add slowapi rate limiting (60 req/min)
- Add API key auth middleware (optional via EVENTRELAY_API_KEY)
- Add security headers (X-Content-Type-Options, X-Frame-Options, X-XSS-Protection)
- Fixes production gap where slim main.py had none of the backend/main.py protections

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: resolve Pydantic Config/model_config conflict breaking Railway deploy

The VideoToSoftwareRequest model had both 'model_config = ConfigDict(...)' and
'class Config:' which Pydantic v2 rejects. Merged into single model_config.
This was causing the v1 router to fail loading, making /api/v1/health return 404.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>
Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Labels: none yet
Projects: none yet
2 participants