A lightweight vanilla JavaScript implementation of the Gemini 2.0 Flash Multimodal Live API client. This project provides real-time interaction with Gemini's API through text, audio, video, and screen sharing capabilities.
This is a simplified version of Google's original React implementation, created in response to this issue.
- Real-time chat with Gemini 2.0 Flash Multimodal Live API
- Real-time audio responses from the model
- Real-time audio input from the user, allowing interruptions
- Real-time video streaming from the user's webcam
- Real-time screen sharing from the user's screen
- Function calling
- Built with vanilla JavaScript (no dependencies)
- Mobile-friendly
- Modern web browser with WebRTC, WebSocket, and Web Audio API support
- Google AI Studio API key
- A local static server to host index.html: python -m http.server, npx http-server, or the Live Server extension for VS Code
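
The browser requirements above can be checked before starting a session. The following is a minimal sketch and not part of the repository: `findMissingFeatures` is a hypothetical helper name, and the set of checks is an assumption about what the demo needs (WebSocket, the Web Audio API, and the media-capture calls usually grouped with WebRTC).

```javascript
// Hypothetical helper (not from the repository): lists the browser
// features from the prerequisites that appear to be unavailable.
// `env` defaults to globalThis so the function can also be exercised
// outside a browser with a stand-in object.
function findMissingFeatures(env = globalThis) {
  const nav = env.navigator || {};
  const media = nav.mediaDevices || {};

  const checks = {
    "WebSocket": typeof env.WebSocket === "function",
    // Older Safari versions expose the constructor with a webkit prefix.
    "Web Audio API":
      typeof (env.AudioContext || env.webkitAudioContext) === "function",
    "getUserMedia (microphone/webcam)":
      typeof media.getUserMedia === "function",
    "getDisplayMedia (screen sharing)":
      typeof media.getDisplayMedia === "function",
  };

  // Keep only the names whose check failed.
  return Object.keys(checks).filter((name) => !checks[name]);
}
```

Calling `findMissingFeatures()` in the browser console returns an empty array when all four checks pass. Screen sharing may be reported as missing on some mobile browsers even when the other features are available.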
- Get your API key from Google AI Studio
- Clone the repository:
  git clone https://github.com/ViaAnthroposBenevolentia/gemini-2-live-api-demo.git
- Start the development server (adjust port if needed):
  cd gemini-2-live-api-demo
  python -m http.server 8000
  # or: npx http-server -p 8000
  # or open index.html with the Live Server extension for VS Code
- Access the application at http://localhost:8000
- Open the settings at the top right, paste your API key, and click "Save"
Contributions are welcome! Please feel free to submit issues and pull requests.
This project is licensed under the MIT License.