FEAT: Add Google Gemini Live API support alongside OpenAI #1

Open
Siyarbekir47 wants to merge 3 commits into intellwe:master from Siyarbekir47:master

@Siyarbekir47

What this PR adds

This extends the original OpenAI-only agent to support the Google Gemini Live API as a second provider. The provider is selected at startup, so switching between the two requires no code changes.

Changes

main.py

  • Added --openai / --gemini CLI flags (python main.py --openai)
  • Refactored into two handlers: _stream_openai() and _stream_gemini()
  • Added audio transcoding for Gemini (mulaw 8kHz ↔ PCM 16/24kHz) using audioop
  • Fixed double base64 no-op in OpenAI audio forwarding
  • Added /incoming-call endpoint for inbound call support
  • Added /recording-status endpoint for recording callbacks
  • Replaced deprecated ws.open with not ws.closed
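
The Gemini transcoding step listed above can be illustrated without any dependency. The PR itself uses audioop; the sketch below is a pure-Python stand-in that decodes G.711 mu-law bytes to 16-bit PCM and doubles the sample rate from 8 kHz to 16 kHz by linear interpolation. The function names are hypothetical and not taken from main.py, and the reverse direction (24 kHz PCM back to 8 kHz mu-law) is omitted.

```python
import struct

BIAS = 0x84  # G.711 mu-law bias (132)


def ulaw_byte_to_pcm16(u: int) -> int:
    """Decode one mu-law byte to a signed 16-bit sample (hypothetical helper)."""
    u = ~u & 0xFF  # mu-law bytes are transmitted bit-inverted
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if u & 0x80 else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Decode a mu-law buffer to little-endian 16-bit PCM."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack(f"<{len(samples)}h", *samples)


def upsample_2x(pcm: bytes) -> bytes:
    """Double the sample rate (for example 8 kHz to 16 kHz) by linear interpolation."""
    n = len(pcm) // 2
    samples = struct.unpack(f"<{n}h", pcm[: n * 2])
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < n else s  # hold the last sample
        out.append(s)
        out.append((s + nxt) // 2)  # midpoint between neighbours
    return struct.pack(f"<{len(out)}h", *out)


# A 20 ms frame of mu-law silence at 8 kHz is 160 bytes.
frame = b"\xff" * 160
pcm_8k = ulaw_to_pcm16(frame)   # 320 bytes of 16-bit PCM
pcm_16k = upsample_2x(pcm_8k)   # 640 bytes at twice the rate
```

In a real handler, audioop.ulaw2lin and audioop.ratecv (or their audioop-lts equivalents) would replace these helpers; the pure-Python version only makes the data flow visible.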

requirements.txt

  • Added google-genai (Gemini SDK)
  • Added audioop-lts (audioop replacement for Python 3.13+)
  • Removed unused openai package (raw WebSocket used instead)

.env.example

  • Added GOOGLE_API_KEY, GEMINI_MODEL, GEMINI_VOICE
  • Added OPENAI_MODEL, OPENAI_VOICE — now fully configurable
  • Added AI_PROVIDER for uvicorn direct usage

README.md

  • Full rewrite covering both providers
  • Updated voice lists, model options, architecture diagram, all env variables

Usage

python main.py --openai   # OpenAI Realtime API
python main.py --gemini   # Google Gemini Live API
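
One way the startup selection could be wired is sketched below, assuming argparse and a fallback to the AI_PROVIDER environment variable when no flag is given (as when launching through uvicorn). The function name resolve_provider and the default of "openai" are assumptions for illustration, not taken from the PR's main.py.

```python
import argparse

VALID_PROVIDERS = ("openai", "gemini")


def resolve_provider(argv, env):
    """Pick the provider from CLI flags, else from AI_PROVIDER (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="main.py")
    group = parser.add_mutually_exclusive_group()  # --openai and --gemini cannot be combined
    group.add_argument("--openai", dest="provider", action="store_const", const="openai")
    group.add_argument("--gemini", dest="provider", action="store_const", const="gemini")
    args = parser.parse_args(argv)

    # A CLI flag wins; otherwise fall back to the environment (default is an assumption).
    provider = args.provider or env.get("AI_PROVIDER", "openai").lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"Unknown provider: {provider!r}")
    return provider
```

At startup this would be called as resolve_provider(sys.argv[1:], os.environ), and the result used to dispatch to _stream_openai() or _stream_gemini().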
