[FEAT] Add Gemini support #15

akshaykalkunte · 2025-09-21T05:32:43Z

📌 Description

Adds support for evaluating Gemini models on GCP Vertex AI through the OpenAI chat completions API
Cleans up sample_config.json
Adds code to populate environment variable placeholders in run configs
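
For reviewers who want a concrete picture of the last item, here is a minimal sketch of how placeholder population could work. It is not the code in this PR: the `${VAR}` syntax, the `expand_env` and `vertex_openai_base_url` helpers, and the config keys are all assumptions made for illustration. The base URL follows the pattern Google documents for the Vertex AI OpenAI-compatible endpoint; treat it as an assumption and check it against the current Vertex AI docs.

```python
import os
import re
from typing import Any, Mapping, Optional

# Assumed placeholder syntax: ${VAR_NAME}. The syntax used by the PR is not shown.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env

    if isinstance(value, str):
        def _substitute(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                # Fail loudly instead of sending a literal "${VAR}" to an endpoint.
                raise KeyError(f"environment variable {name!r} is not set")
            return env[name]

        return _PLACEHOLDER.sub(_substitute, value)

    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged


def vertex_openai_base_url(project: str, location: str) -> str:
    """Build the base URL for the Vertex AI OpenAI-compatible endpoint (assumed pattern)."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/endpoints/openapi"
    )


if __name__ == "__main__":
    # Hypothetical run-config fragment; the key names are illustrative only.
    run_config = {
        "model": "google/gemini-2.5-flash",
        "project": "${GCP_PROJECT}",
        "location": "${GCP_LOCATION}",
    }
    fake_env = {"GCP_PROJECT": "my-project", "GCP_LOCATION": "us-central1"}
    resolved = expand_env(run_config, fake_env)
    print(resolved)
    print(vertex_openai_base_url(resolved["project"], resolved["location"]))
```

Raising on a missing variable is a deliberate choice in this sketch: a run that silently keeps an unresolved placeholder tends to fail later with a less helpful authentication or routing error.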

🔗 Related Issue(s)

NA

🛠️ Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality including new tasks)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactor / Code cleanup
Maintenance / Chore / Task
Other (please describe):

✅ How Has This Been Tested?

Gemini-2.5-Flash numbers look okay. Tested an OpenAI model and a vLLM model to ensure nothing is broken.

| Task Category | Task Name | Dataset/Benchmark | Metric | Gemini-2.5-Flash |
| --- | --- | --- | --- | --- |
| Speech Recognition | ASR | Librispeech | WER | 2.75 |
| Paralinguistics | Emotion | MELD | llm_judge_binary | 30 |
| Paralinguistics | Gender | IEMOCAP | llm_judge_binary | 85.5 |
| Paralinguistics | Accent | VoxCeleb | llm_judge_binary | 55.7 |
| Paralinguistics | Speaker Recognition | mmau_mini | llm_judge_binary | 61.5 |
| Paralinguistics | Speaker Diarization | CallHome | WDER | 41.83 |
| Spoken Language Understanding | Spoken QA | public_sg_speech_qa_test | llm_judge_detailed | 74.34 |
| Spoken Language Understanding | Spoken Query QA | BigBench Audio | llm_judge_big_bench_audio | 90.4 |
| Spoken Language Understanding | Speech Translation | Covost2 (zh-CN->EN) | BLEU | 27.1 |
| Spoken Language Understanding | Spoken Dialogue Summarization | mnsc_sds (P3) | llm_judge_detailed | 62.8 |
| Spoken Language Understanding | Intent Classification | SLURP | llm_judge_binary | 79 |
| Audio Understanding | Scene Understanding | audiocaps_qa | llm_judge_detailed | 34.82 |
| Audio Understanding | Music Understanding | mu_chomusic_test | llm_judge_binary | 72.9 |
| Spoken Language Reasoning | Speech Instruction Following | IFEval | instruction_following_score | 86.28 |
| Spoken Language Reasoning | Speech Instruction Following | MTBench | llm_judge_mt_bench | 69.88 |
| Spoken Language Reasoning | Speech-to-Coding | Spider | sql_score (EM) | 77.12 |
| Spoken Language Reasoning | Speech Function Calling | BFCL | bfcl_match_score | 93.23 |
| Safety and Security | Safety | advbench | redteaming_judge | 97.5 |
| Safety and Security | Spoofing | avspoof | llm_judge_binary | 80.5 |
| Aggregate | NA | NA | NA | 69.35 |

Unit tests
Integration tests
Manual testing

Test Results / Screenshots (if applicable):

📸 Screenshots / Demos

NA

📋 Checklist

Code follows project style guidelines
Tests have been added/updated (if applicable)
Documentation has been updated (if applicable)
Linked relevant issue(s)
Self-reviewed my code

🙌 Additional Notes

NA

nhhoang96

LGTM

Can you also add instructions to the landing-page README on how to get started with Gemini endpoint setup?

Please also update the endpoint support listed in the landing-page README (see the attached photo below for reference).

feat: Add Gemini support

a128c0e

akshaykalkunte changed the title ~~feat: Add Gemini support~~ [FEAT] Add Gemini support Sep 21, 2025

akshaykalkunte requested review from aman-servicenow, jash-mehta-3300, nhhoang96 and oluwanifemibamgbose September 21, 2025 05:34

nhhoang96 approved these changes Sep 22, 2025

Merge branch 'main' into feat/add-gemini-support

b8b8313
