Generator: Add support for Google Gemini models #1306

Open
wants to merge 22 commits into main

Conversation

Contributor

@dchiitmalla commented Jul 22, 2025

Faced issues trying to run Gemini models using LiteLLM. This generator natively supports Google models using Google's official google.generativeai library, which proved more reliable than LiteLLM. It supports multiple models (2.5 Pro, Flash, etc.) with error handling and model-specific config options. #443

Collaborator

@jmartin-tech left a comment


Working with GCP-based services has shown that auth via api_key is not always as straightforward as providing a single environment variable value. Can you provide details on how the key used would be scoped or generated?

Also consider enabling the google library to attempt auth even when no key is provided.
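A minimal sketch of the kind of key resolution this could use, assuming the GEMINI_API_KEY / GOOGLE_API_KEY variable names from Google's docs (the function name is hypothetical):

```python
import os

def resolve_gemini_auth(env=None):
    """Pick client kwargs for Gemini auth (illustrative sketch).

    Prefers an explicit API key from GEMINI_API_KEY or the older
    GOOGLE_API_KEY; with neither set, returns no key so the Google
    client library can attempt Application Default Credentials.
    """
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")
    return {"api_key": api_key} if api_key else {}
```

Returning an empty dict, rather than raising, is what lets the library fall back to ambient credentials when no key is configured.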

import logging

responses = []

for _ in range(generations_this_call):
Collaborator


Iterating generations like this will not combine with backoff correctly. As written, any raised backoff exception will throw away all completed generations and start over.

Looking at the Gemini docs, multiple generations can be obtained in a single call by setting candidateCount=generations_this_call in a GenerateContentConfig and passing it to generate_content() as the config named parameter.

If calling for more than one generation please validate how the response object will be formatted.


@Abhiraj-GetGarak commented Aug 8, 2025


Addressed: the implementation now uses candidate_count=generations_this_call in GenerateContentConfig and makes a single API call via _generate_content_with_backoff(). This ensures backoff retries don't discard completed generations. The response format has been validated and properly handles multiple candidates from the API response.
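A generic sketch of why batching composes safely with retries; call_with_backoff is a hypothetical stand-in for the PR's _generate_content_with_backoff(), written here as a plain exponential-backoff loop rather than using the backoff package:

```python
import time

def call_with_backoff(fn, retries=4, base_delay=1.0, retriable=(RuntimeError,)):
    """Retry a zero-argument callable with exponential backoff.

    When fn wraps ONE batched generate_content() call, a retriable
    failure restarts that single request, so no completed generations
    are thrown away; retrying inside a per-generation loop instead
    would lose every generation finished before the failure.
    """
    for attempt in range(retries):
        try:
            return fn()
        except retriable:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * (2 ** attempt))
```

The key design point is the placement of the retry boundary: around the single batched call, not around each generation.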

dchiitmalla and others added 3 commits July 23, 2025 10:56
1. Update Gemini generator to handle test model names gracefully
2. Add missing documentation file for Gemini generator
3. Add Gemini generator to documentation toctree
4. Add google-generativeai dependency to pyproject.toml
remove api validation

Co-authored-by: Jeffrey Martin <[email protected]>
Signed-off-by: Divya Chitimalla <[email protected]>
- Update _call_model to use Gemini's native candidateCount parameter for multiple generations
- Process response candidates correctly to extract text from each generation
- Remove generation config from model initialization and set it per request
- Fix backoff handling to properly retry the entire batch of generations
- Ensure consistent number of responses are returned
Collaborator

@jmartin-tech left a comment


General comments: the auth concern may reasonably be deferred by requiring an environment-variable-based API key for initial acceptance.

Another concern that may be important to address: this PR uses the package google-generativeai, but current docs recommend google-genai, which is conventionally imported under the same name (genai) yet exposes a different interface.


# Create the generator with a native audio model
generator = GeminiGenerator(name="gemini-2.5-flash-native-audio")
output = generator._call_model("Transcribe this text.")
Collaborator


This does not make sense: the input modality suggested by this test is audio, not text.

The current generator also inherits the default modality, {"in": {"text"}, "out": {"text"}}; unless overridden, this generator should only accept text prompts.
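For illustration, the relevant attribute looks like this (a sketch mirroring garak's default text-in/text-out modality; the class name is hypothetical):

```python
class TextOnlyGeminiGenerator:
    """Sketch of the relevant class attribute only.

    garak's base Generator defaults to text in, text out. A generator
    targeting audio-input models would need to override "in" to include
    "audio"; without that override, audio-transcription tests like the
    one quoted above do not apply to this generator.
    """

    modality = {"in": {"text"}, "out": {"text"}}
```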

Contributor

github-actions bot commented Aug 8, 2025

DCO Assistant Lite bot: All contributors have signed the DCO ✍️ ✅

@Abhiraj-GetGarak

I have read the DCO Document and I hereby sign the DCO

@Abhiraj-GetGarak

recheck

github-actions bot added a commit that referenced this pull request Aug 8, 2025
@Abhiraj-GetGarak

1. Multiple generations backoff issue: fixed using the new google-genai library with candidate_count in a single API call.

2. Authentication flexibility: enhanced with both API key and Vertex AI (ADC) support using the new client architecture.

3. Package migration completed: successfully migrated from google-generativeai>=0.8.5 to google-genai>=1.0.0.
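Under the new client architecture, the two auth paths above could be selected roughly as follows (a sketch assuming google-genai's Client parameters api_key, vertexai, project, and location; the function name and default location are assumptions):

```python
def build_client(api_key=None, project=None, location="us-central1"):
    """Choose between API-key auth and Vertex AI auth (sketch).

    With an explicit key, construct a plain API-key client; without
    one, route through Vertex AI so credentials are discovered via
    Application Default Credentials (ADC).
    """
    from google import genai  # deferred import; assumed installed

    if api_key:
        return genai.Client(api_key=api_key)
    return genai.Client(vertexai=True, project=project, location=location)
```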


4 participants