
Gemini Client #5522

Closed
wants to merge 2 commits into from
Conversation


gziz (Contributor) commented Feb 13, 2025

Why are these changes needed?

PR Message

Hi, I just read the conversation that @yu-iskw, @rohanthacker, and @ekzhu had in this PR. I've arrived a bit late to the party.

I started working on the client because I was playing with and testing the Gemini models on AutoGen, and the new Gemini SDK (google-genai) had just been released, so it seemed like a good moment. Additionally, I had read about the need for this client in previous issues.

The Gemini SDK has lots of features, far more than OpenAI's. The implementation I'm sharing doesn't include them all; however, I wanted to share the current state so we don't duplicate work.

Worth mentioning: it includes the important features (all of them, I think) from the AutoGen OpenAI client.

Here are the features the provided Gemini client supports:

  • Generate text (well, of course)
  • JSON mode and structured outputs
  • Function calling, i.e. tools
  • Streaming tokens, as defined in create_stream and tested using the chainlit example.
  • Passing images to the model; e.g. I have tested the client with M1, where the web surfer sends images.

Missing features & TODOs

  • Test with VertexAI
  • Image generation, i.e. the model returning a generated image.
  • Add the rest of the models to model_info.py
  • Fix the many pyright/mypy warnings

Some preliminary tests I have run:

  • Works with M1 and chainlit.
  • Ran the test_gemini tests inside test_openai_model_client.py

Important things to consider about the behavior of the Gemini SDK

  • By default, the Gemini SDK tries to execute functions (tools) itself, which creates a conflict, since AutoGen is expected to execute these tools.
    • I had to explicitly disable automatic_function_calling in the create_args config.
  • Gemini doesn't have a json_output config but rather a response_mime_type config, which supports not only JSON but also Enums.
  • For the reasoning models, thoughts are currently not provided in the API (source). However, the Response schema does have a thought field, so I included the necessary code to handle it in case the feature is provided in the future.
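The two SDK quirks above boil down to a couple of entries in the create_args config. Here is a minimal sketch; the key names mirror google-genai's GenerateContentConfig fields, but they are assumptions here, not an excerpt from this PR's code, so verify them against the SDK:

```python
# Sketch of the create_args adjustments described above (hypothetical;
# key names assumed to mirror google-genai's GenerateContentConfig).
create_args = {
    # AutoGen executes tools itself, so the SDK's automatic function
    # calling must be disabled to avoid the conflict described above.
    "automatic_function_calling": {"disable": True},
    # Gemini has no json_output flag; JSON mode is requested via the
    # response MIME type instead (the same mechanism covers enum output).
    "response_mime_type": "application/json",
}
```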

I’ll continue working on and testing the client; meanwhile, it would be good to get some thumbs up. Thanks!

Related issue number

#3741



yu-iskw commented Feb 13, 2025

@gziz I have been implementing almost the same thing as this pull request, though my implementation isn't finished yet. If you don't mind, I'd like to address this. What do you think?

#5524


gziz (Contributor, Author) commented Feb 13, 2025

No problem, would love to review! @yu-iskw 👍


ekzhu (Collaborator) commented Feb 13, 2025

@gziz Thanks for the PR; let's move forward with #5524.


yu-iskw commented Feb 13, 2025

@gziz Thank you for your understanding. I will let you know when I finish the implementation.
