A Python application that extracts the audio from YouTube videos or livestreams, transcribes the speech, and saves the result as a Word document.
- 🎥 Extract audio from YouTube videos and livestreams
- 🎵 Convert audio to text using Google Speech Recognition
- 📄 Save transcriptions as Word documents (.docx)
- 🌐 Web-based interface using Streamlit
- ⚡ Fast and efficient processing
- Clone or download this repository
- Install the required dependencies:
pip install -r requirements.txt
- Run the application:
streamlit run main.py
- Open your web browser and navigate to the Streamlit interface
- Enter a YouTube URL (video or livestream)
- Click "Transcribe" to start the process
- Download the generated Word document
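Before pasting a link into the interface, it can help to check that it has the shape of a YouTube URL. The snippet below is a minimal sketch using only the standard library. The function name `looks_like_youtube_url` and the list of accepted hosts and paths are assumptions for illustration, not part of this application, and the check does not confirm that the video actually exists.

```python
from urllib.parse import parse_qs, urlparse

# Hosts accepted by this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def looks_like_youtube_url(url: str) -> bool:
    """Return True if the URL has the shape of a YouTube video or livestream link."""
    try:
        parsed = urlparse(url.strip())
        host = (parsed.hostname or "").lower()
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False
    if parsed.scheme not in ("http", "https") or host not in YOUTUBE_HOSTS:
        return False
    if host == "youtu.be":
        # Short links carry the video ID as the path: https://youtu.be/<id>
        return bool(parsed.path.strip("/"))
    if parsed.path == "/watch":
        # Standard links carry the video ID in the "v" query parameter.
        return bool(parse_qs(parsed.query).get("v"))
    # Livestream, Shorts, and embed links carry the ID in the path.
    return parsed.path.startswith(("/live/", "/shorts/", "/embed/"))
```

A link that fails this check is likely to trigger the "Error extracting audio" message, so it is worth fixing before clicking "Transcribe".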
- Python 3.7+
- Internet connection (for Google Speech Recognition)
- FFmpeg (for audio processing)
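The local requirements above can be checked from Python before starting the app. This is a minimal sketch using only the standard library. The function name `check_requirements` and the `which` parameter (there so the lookup can be replaced in tests) are assumptions for illustration. It does not test the internet connection.

```python
import shutil
import sys


def check_requirements(min_python=(3, 7), which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # sys.version_info compares like a tuple, e.g. (3, 11, 4, ...) >= (3, 7).
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        needed = ".".join(str(part) for part in min_python)
        problems.append(f"Python {needed}+ is required, found {found}")
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    return problems
```

Calling `check_requirements()` and printing each returned entry gives a quick report. If FFmpeg is reported missing, the install commands for your platform are listed in this README.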
macOS:
brew install ffmpeg
Windows: Download from https://ffmpeg.org/download.html
Linux:
sudo apt update
sudo apt install ffmpeg
- Audio Extraction: Uses yt-dlp to download audio from YouTube videos
- Audio Processing: Converts audio to WAV format using pydub
- Speech Recognition: Transcribes audio using Google's Speech Recognition API
- Document Generation: Creates Word documents using python-docx
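The four steps above can be sketched roughly as follows. This is an illustration of how the named libraries fit together, not the code in `main.py`. The function names, the 16 kHz mono conversion, the output template, and the use of the `SpeechRecognition` package for Google's API are all assumptions. The sketch also assumes a finished video; an ongoing livestream would need different handling.

```python
import os


def extract_audio(url: str, workdir: str) -> str:
    """Step 1: download the best available audio track with yt-dlp."""
    import yt_dlp  # third-party: yt-dlp

    options = {
        "format": "bestaudio/best",
        "outtmpl": os.path.join(workdir, "%(id)s.%(ext)s"),
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)


def convert_to_wav(src_path: str) -> str:
    """Step 2: convert the download to mono 16 kHz WAV with pydub (needs FFmpeg)."""
    from pydub import AudioSegment  # third-party: pydub

    wav_path = os.path.splitext(src_path)[0] + ".wav"
    audio = AudioSegment.from_file(src_path)
    audio.set_channels(1).set_frame_rate(16000).export(wav_path, format="wav")
    return wav_path


def transcribe_wav(wav_path: str) -> str:
    """Step 3: send the audio to Google's recognizer and return the text."""
    import speech_recognition as sr  # third-party: SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    # Raises sr.UnknownValueError or sr.RequestError on failure.
    return recognizer.recognize_google(audio)


def write_docx(text: str, title: str, out_path: str) -> str:
    """Step 4: write the transcript into a Word document with python-docx."""
    from docx import Document  # third-party: python-docx

    doc = Document()
    doc.add_heading(title, level=1)
    doc.add_paragraph(text)
    doc.save(out_path)
    return out_path


def transcribe_to_docx(url: str, out_path: str, workdir: str = ".") -> str:
    """Run all four steps in order and return the path of the .docx file."""
    audio_path = extract_audio(url, workdir)
    wav_path = convert_to_wav(audio_path)
    text = transcribe_wav(wav_path)
    return write_docx(text, "Transcript", out_path)
```

Keeping each step in its own function makes it easier to see which stage failed when one of the error messages in this README appears.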
- YouTube videos (all formats)
- YouTube livestreams
- Various audio formats (automatically converted)
- Requires internet connection for speech recognition
- Audio quality affects transcription accuracy
- Processing time depends on video length
- Google Speech Recognition has usage limits
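Because processing time grows with video length and the recognition service has usage limits, one common approach is to split long audio into shorter pieces and transcribe them one at a time. The helper below is a minimal sketch of that idea. The name `chunk_ranges` and the 30-second default are assumptions, and this README does not state whether the application itself works this way.

```python
def chunk_ranges(total_ms: int, chunk_ms: int = 30_000):
    """Split a duration into consecutive (start, end) ranges in milliseconds."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    # The last range is shortened so it never runs past the end of the audio.
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]
```

With pydub, `len(audio)` gives the duration in milliseconds and `audio[start:end]` slices by milliseconds, so each returned range can be exported and transcribed as its own segment.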
- "Could not understand the audio"
  - Try a video with clearer audio
  - Check if the video has speech content
- "Error extracting audio"
  - Verify the YouTube URL is correct
  - Check internet connection
  - Ensure FFmpeg is installed
- "Error with speech recognition service"
  - Check internet connection
  - Google's service might be temporarily unavailable
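When the recognition service is only temporarily unavailable, retrying after a short pause is often enough. The wrapper below is a minimal sketch of retrying with exponential backoff using only the standard library. The name `with_retries` and its defaults are assumptions for illustration, and the application does not necessarily retry on its own.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error reach the caller.
                raise
            # Wait 1x, 2x, 4x ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

If the `SpeechRecognition` package is used for the Google call, passing `retry_on=(sr.RequestError,)` limits retries to service and connection errors, so an "unintelligible audio" result is not retried pointlessly.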
Feel free to submit issues and enhancement requests!
This project is open source and available under the MIT License.