A full-fledged web application that records audio and classifies whether it's funny or not using a fine-tuned HuggingFace model. The app features real-time audio recording, file upload capabilities, and a modern, responsive UI.
- Real-time Audio Recording: Record audio directly in the browser
- File Upload Support: Upload audio files (.wav, .mp3, .m4a, etc.)
- AI Classification: Uses a fine-tuned HuBERT model for humor detection
- Modern UI: Beautiful, responsive design with real-time feedback
- Confidence Scores: Shows detailed confidence scores for predictions
- Audio Validation: Ensures minimum/maximum audio length requirements
The app uses a fine-tuned HuggingFace model (rishiA/humor_model_v4) that achieves 86% accuracy on humor detection tasks.
- Python 3.8 or higher
- pip (Python package installer)
- Modern web browser with microphone access
- Clone the repository

  ```bash
  git clone <your-repo-url>
  cd humorMe
  ```

- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Run the application

  ```bash
  python app.py
  ```

- Open your browser and navigate to `http://localhost:5000`
For production deployment, you can use Gunicorn:
```bash
gunicorn -w 4 -b 0.0.0.0:5000 app:app
```

- Click "🎤 Start Recording"
- Speak something funny or serious
- Click "⏹️ Stop Recording" when done
- Wait for the AI analysis
- Click "📁 Choose Audio File"
- Select an audio file from your device
- Wait for the AI analysis
- 😂 FUNNY: The AI detected humor in your audio
- 😐 NOT FUNNY: The AI detected serious/non-humorous content
- Confidence Score: How certain the AI is about its prediction
- Detailed Scores: Breakdown of funny vs not-funny probabilities
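The confidence score and detailed breakdown above come straight from the classifier's per-label scores. A minimal sketch of the post-processing, assuming the list-of-`{"label", "score"}`-dicts shape returned by a HuggingFace audio-classification pipeline (the `summarize` helper and the label names are illustrative, not the app's literal code):

```python
def summarize(scores):
    """Turn raw pipeline scores into the verdict shown in the UI.

    `scores` is a list of {"label": str, "score": float} dicts, the shape
    returned by a HuggingFace audio-classification pipeline.
    """
    best = max(scores, key=lambda s: s["score"])
    return {
        "prediction": best["label"],                                # FUNNY / NOT FUNNY
        "confidence": round(best["score"], 3),                      # top score
        "detail": {s["label"]: round(s["score"], 3) for s in scores},
    }
```

For example, scores of 0.82 (funny) vs. 0.18 (not funny) produce the 😂 FUNNY verdict with 82% confidence.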
- Framework: Flask with CORS support
- Audio Processing: librosa for audio preprocessing
- Model Integration: HuggingFace Transformers pipeline
- File Handling: Temporary file management for audio processing
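The temporary-file handling mentioned above can be sketched with the standard library; `with_temp_audio` and its signature are illustrative, not the app's actual helper:

```python
import os
import tempfile

def with_temp_audio(file_bytes, process, suffix=".wav"):
    """Write uploaded bytes to a temp file, run process(path), and always
    delete the file afterwards, even if processing raises."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(file_bytes)
        return process(path)
    finally:
        os.remove(path)
```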
- Audio Recording: Web Audio API with MediaRecorder
- File Upload: Drag-and-drop file input
- Responsive Design: Mobile-friendly interface
- Real-time Feedback: Status updates and progress indicators
- Input Validation: Check file format and duration
- Preprocessing: Resample to 16kHz, normalize length
- Model Inference: Pass through the fine-tuned HuBERT model
- Post-processing: Extract confidence scores and predictions
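The four steps above can be sketched as a single flow. The resampler and classifier are injected as plain callables standing in for librosa and the HuggingFace pipeline (which are not imported here), so this shows only the structure, not the app's literal code:

```python
TARGET_SR = 16_000  # the app resamples everything to 16 kHz

def classify_audio(raw_audio, src_sr, resample, classifier):
    """Run one clip through the app's processing flow.

    `resample` and `classifier` are illustrative stand-ins for librosa's
    resampler and the HuggingFace pipeline.
    """
    # 1. Input validation (file format/extension) happens before this point.
    # 2. Preprocessing: resample to 16 kHz if needed.
    audio = raw_audio if src_sr == TARGET_SR else resample(raw_audio, src_sr, TARGET_SR)
    # Enforce the 0.5 s minimum / 30 s maximum length window.
    if not (TARGET_SR // 2 <= len(audio) <= TARGET_SR * 30):
        raise ValueError("audio length out of range")
    # 3. Model inference: returns a list of {"label": ..., "score": ...} dicts.
    scores = classifier(audio)
    # 4. Post-processing: top label and its confidence.
    best = max(scores, key=lambda s: s["score"])
    return best["label"], best["score"]
```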
The app uses a fine-tuned version of Facebook's HuBERT model:
- Base Model: `facebook/hubert-large-ls960-ft`
- Task: Binary classification (funny vs. not funny)
- Training: Custom dataset with class weighting
- Performance: 86% accuracy on test set
The UI uses CSS custom properties and can be easily customized by modifying the styles in templates/index.html.
To use a different model, update the model name in app.py:
```python
classifier = pipeline("audio-classification", model="your-model-name")
```

Modify audio length constraints in app.py:
```python
# Minimum length (0.5 seconds = 8,000 samples at 16 kHz)
if len(audio_data) < 8000:
    ...
# Maximum length (30 seconds = 480,000 samples at 16 kHz)
if len(audio_data) > 480000:
    ...
```

- Model Caching: Model is loaded once at startup
- Temporary Files: Automatic cleanup of processed audio files
- Audio Compression: Optimized audio preprocessing pipeline
- Async Processing: Non-blocking audio classification
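The model-caching point above amounts to a load-once wrapper: the expensive `pipeline(...)` call runs a single time and every later request reuses the cached object. A minimal dependency-injected sketch (`make_cached_loader` is illustrative; in app.py the same effect comes from creating the pipeline once at startup):

```python
def make_cached_loader(load):
    """Wrap an expensive loader so it runs at most once; later calls
    return the cached object. `load` would be the pipeline(...) call."""
    cached = None
    def get():
        nonlocal cached
        if cached is None:
            cached = load()
        return cached
    return get
```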