Contributor

@alberthu233 alberthu233 commented Oct 29, 2025

Summary
This PR extracts the GeminiAgent and the Gemini computer-use tool from #169 and makes the requested changes.

Will add CLI integration for hud eval in a follow-up commit.


Note

Introduce a Gemini MCP agent and computer-use tool, integrate it into the CLI eval flow, and add supporting settings, types, tests, and dependency.

  • Agents:
    • Add GeminiAgent using Google GenAI with MCP tool execution, tool mapping, screenshot trimming, and function response formatting.
    • Export GeminiAgent in hud/agents/__init__.py.
  • Computer Tools:
    • Add GeminiComputerTool mapping Gemini computer-use functions (click_at, navigate, drag_and_drop, etc.) to executor actions with normalized coords and URL metadata.
    • Expose in hud/tools/__init__.py and hud/tools/computer/__init__.py.
  • CLI (eval):
    • Add AgentType.GEMINI and integrate Gemini into agent selection, builder, and dataset/single-task flows with default model and API key checks.
    • Update prompts/help text to include Gemini.
  • Settings/Types:
    • Add settings.gemini_api_key and Gemini display/rescale defaults in computer_settings.
    • Extend AgentType with gemini.
    • Enhance ContentResult to carry url and emit __URL__: metadata for screenshots.
  • Playwright:
    • Guard wait_for_load_state type and reuse existing CDP context/page.
  • Dependencies:
    • Add google-genai to pyproject.toml.
  • Tests:
    • Add comprehensive tests for GeminiAgent and CLI build_agent for Gemini.

Written by Cursor Bugbot for commit ff1a3ae. This will update automatically on new commits.
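
For readers unfamiliar with the "normalized coords" mentioned for GeminiComputerTool, here is a minimal sketch of the idea. It assumes the model reports points on a 0 to 999 grid that must be scaled to the real screen size before an executor can click; the function name `denormalize` and the `grid` parameter are hypothetical and are not taken from this PR, whose actual implementation may differ.

```python
def denormalize(x: int, y: int, width: int, height: int, grid: int = 1000) -> tuple:
    """Scale a point from a grid x grid normalized space to screen pixels.

    Hypothetical helper: the grid size of 1000 is an assumption, not
    something confirmed by the PR description.
    """
    if not (0 <= x < grid and 0 <= y < grid):
        raise ValueError(f"point ({x}, {y}) is outside the 0..{grid - 1} grid")
    # Integer arithmetic avoids floating-point rounding surprises.
    return x * width // grid, y * height // grid


# Example: the centre of the normalized grid on a 1440x900 display.
print(denormalize(500, 500, 1440, 900))
```

Keeping the conversion in one place means each mapped function (click_at, drag_and_drop, and so on) can pass pixel coordinates to the executor without repeating the scaling logic.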

@alberthu233 alberthu233 requested a review from Parth220 October 29, 2025 17:20
@alberthu233 alberthu233 marked this pull request as ready for review October 29, 2025 18:20
Contributor

@Parth220 Parth220 left a comment

Looks good!

Thanks for addressing my requested changes from #169 and adding CLI support.

Can you make sure to get ruff check and linting passing before merging?

arguments=final_args,
gemini_name=func_name, # type: ignore[arg-type]
)
collected_tool_calls.append(tool_call)

Bug: Undefined Argument in Constructor Causes Validation Issues

The MCPToolCall constructor is passed a gemini_name argument not defined in its schema, noted by a type: ignore comment. This may cause runtime validation errors or silently drop the field, making subsequent getattr calls fragile.
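
One possible way to address this is to declare the extra field on a subclass, so that it is part of the schema and no type: ignore is needed. The sketch below is only an illustration: `ToolCall` is a hypothetical stdlib dataclass standing in for MCPToolCall (whose real definition is not shown in this thread), and `GeminiToolCall` is a hypothetical name. A validating model class may reject or silently ignore unknown fields depending on its configuration, which is the ambiguity the comment above points at; the dataclass stand-in simply rejects them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ToolCall:
    """Hypothetical stand-in for MCPToolCall; not the real class."""
    name: str
    arguments: Dict[str, Any] = field(default_factory=dict)


@dataclass
class GeminiToolCall(ToolCall):
    """Hypothetical subclass that declares the Gemini-specific field explicitly."""
    gemini_name: Optional[str] = None


# The field is now part of the schema, so plain attribute access is safe.
call = GeminiToolCall(name="computer", arguments={"x": 10}, gemini_name="click_at")
print(call.gemini_name)

# The base class still refuses the undeclared keyword.
try:
    ToolCall(name="computer", gemini_name="click_at")
except TypeError as exc:
    print("rejected:", exc)
```

With the field declared, later code can read `call.gemini_name` directly instead of relying on getattr with a default.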

@Parth220 Parth220 merged commit 638d78c into hud-evals:main Oct 29, 2025
5 checks passed
@alberthu233 alberthu233 deleted the feature/gemini-agent-tool branch November 6, 2025 04:21