A modern web application for AI-powered audio and video transcription with comprehensive speech analytics.
- AI-powered transcription using Google Gemini 1.5 Pro
- Supports files up to 20MB in most audio/video formats
- High accuracy speech-to-text with automatic punctuation
- 15+ media formats including MP4, MP3, WAV, AVI, MKV
- Automatic format conversion from video to audio
- Client-side media processing using FFmpeg.wasm
- Import remote files by URL or from Google Drive
- Format optimization for better accuracy
- Filler word analysis identifying "um", "uh", and verbal pauses
- Speaking pace measurement in words per minute
- Speech clarity scoring and confidence assessment
- Engagement metrics and pace consistency tracking
- Interactive analytics dashboard with real-time charts
- Modern, responsive design inspired by Medium.com
- Drag & drop file upload with visual feedback
- Real-time transcript editing with save functionality
- Multiple export formats including TXT and full analysis reports
- Toggle to remove filler words for a clean transcript
- Next.js 14 - React-based full-stack framework
- React 18 - Component-based user interface
- TypeScript - Type-safe JavaScript development
- Google Gemini API - Advanced language model for transcription
- FFmpeg.wasm - Client-side media processing
- WebAssembly - High-performance audio/video manipulation
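
The filler word analysis and words-per-minute measurement listed among the features above could be computed roughly as in the sketch below. This is an illustration only, not the project's actual implementation: the names `FILLER_WORDS`, `SpeechStats`, and `analyzeSpeech` are hypothetical, the filler list contains only the two words the feature list names, and the transcript text and audio duration are assumed to be already available.

```typescript
// Hypothetical sketch: names and logic are assumptions, not the app's real code.

// Only the two fillers named in the feature list; a real list is likely longer.
const FILLER_WORDS: string[] = ["um", "uh"];

interface SpeechStats {
  wordCount: number;      // total words found in the transcript
  fillerCount: number;    // how many of those words are fillers
  wordsPerMinute: number; // speaking pace, rounded to the nearest integer
}

function analyzeSpeech(transcript: string, durationSeconds: number): SpeechStats {
  // Lowercase, then split on anything that is not a letter or an apostrophe.
  // This simple tokenizer assumes English text without accented characters.
  const words = transcript
    .toLowerCase()
    .split(/[^a-z']+/)
    .filter((w) => w.length > 0);

  // Count words that exactly match an entry in the filler list.
  const fillerCount = words.filter((w) => FILLER_WORDS.indexOf(w) !== -1).length;

  // Guard against a zero or negative duration to avoid dividing by zero.
  const minutes = durationSeconds / 60;
  const wordsPerMinute = minutes > 0 ? Math.round(words.length / minutes) : 0;

  return { wordCount: words.length, fillerCount, wordsPerMinute };
}
```

Calling `analyzeSpeech("Um, I think, uh, this works.", 30)` counts six words, two of them fillers, spoken over half a minute, which gives a pace of 12 words per minute. Matching whole tokens rather than substrings keeps words such as "umbrella" from being counted as fillers.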