@DBOYttt DBOYttt commented Nov 24, 2025

  • Implement dynamic model discovery from OpenAI API with 1-hour caching
  • Filter models to only include vision-capable models (GPT-4o, GPT-4 variants)
  • Exclude o1/o3 models that don't support image inputs
  • Add OpenAIModule import to TasksModule for dependency injection
  • Make model selector scrollable in UI (max-height: 300px)

This fixes task execution failures that occur when a non-vision model such as o3-mini is selected for a computer-use agent, which sends screenshots.

This pull request changes how the Bytebot agent discovers and presents available OpenAI models. The main changes are dynamic fetching and caching of OpenAI models that support vision (image inputs), a fallback to a hardcoded list when fetching fails, and an updated hardcoded list restricted to vision-capable models. The select dropdown component also receives a minor UI improvement.

Dynamic OpenAI Model Management:

  • Added a new method getAvailableModels in OpenAIService to fetch available models from the OpenAI API, filter for those supporting vision, cache them for one hour, and provide a fallback to a hardcoded list if needed. This ensures the agent always offers up-to-date and relevant model options.
  • Updated the hardcoded OPENAI_MODELS list to include only models that support vision (image input), with revised names, titles, and context windows.
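The fetch, filter, cache, and fallback flow described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `ModelCache` class, the `ModelInfo` shape, the injected `listModelIds` function (standing in for the OpenAI API call), the fallback entries, and the prefix-based vision check are all assumptions made for the example.

```typescript
// Hypothetical shape of a model entry; field names are assumptions.
interface ModelInfo {
  name: string;
  title: string;
}

// Illustrative fallback list, used when the API call fails or returns nothing usable.
const FALLBACK_MODELS: ModelInfo[] = [
  { name: "gpt-4o", title: "GPT-4o" },
  { name: "gpt-4o-mini", title: "GPT-4o mini" },
];

const CACHE_TTL_MS = 60 * 60 * 1000; // one hour

// Assumed heuristic: ids starting with "gpt-4" count as vision-capable.
// The o1/o3 check is redundant with the prefix test but mirrors the
// exclusion stated in the PR description.
function isVisionCapable(id: string): boolean {
  const lower = id.toLowerCase();
  if (/^o[13](-|$)/.test(lower)) return false;
  return lower.startsWith("gpt-4");
}

class ModelCache {
  private cached: ModelInfo[] | null = null;
  private fetchedAt = 0;
  private readonly listModelIds: () => Promise<string[]>;
  private readonly now: () => number;

  // listModelIds stands in for the real API call; now is injectable for testing.
  constructor(
    listModelIds: () => Promise<string[]>,
    now: () => number = () => Date.now(),
  ) {
    this.listModelIds = listModelIds;
    this.now = now;
  }

  async getAvailableModels(): Promise<ModelInfo[]> {
    // Serve from the cache while it is younger than the TTL.
    if (this.cached && this.now() - this.fetchedAt < CACHE_TTL_MS) {
      return this.cached;
    }
    try {
      const ids = await this.listModelIds();
      const models = ids
        .filter(isVisionCapable)
        .sort()
        .map((id) => ({ name: id, title: id }));
      if (models.length === 0) return FALLBACK_MODELS;
      this.cached = models;
      this.fetchedAt = this.now();
      return models;
    } catch {
      // Any fetch error falls back to the hardcoded list.
      return FALLBACK_MODELS;
    }
  }
}
```

Injecting the fetch function and the clock keeps the sketch testable without network access; a real service would presumably call the OpenAI client directly.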

Integration with Task Controller:

  • Modified the TasksController to fetch OpenAI models dynamically using the new getAvailableModels method, with a fallback to the hardcoded list if fetching fails. Models from other providers are still included based on API key presence.
  • Updated the TasksModule to import OpenAIModule so that OpenAIService can be injected into TasksController.
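The controller-side behavior described above (dynamic OpenAI models with a hardcoded fallback, plus other providers gated on API key presence) might look something like the sketch below. It is framework-free for brevity and is not the PR's actual controller code: the `listModels` function, the `ProviderModel` type, the environment variable names, the Anthropic entry as the "other provider", and the list contents are all hypothetical.

```typescript
// Hypothetical model entry; the real project type may differ.
interface ProviderModel {
  provider: string;
  name: string;
}

// Illustrative hardcoded lists (assumed contents).
const OPENAI_FALLBACK: ProviderModel[] = [{ provider: "openai", name: "gpt-4o" }];
const ANTHROPIC_LIST: ProviderModel[] = [{ provider: "anthropic", name: "example-model" }];

// Builds the list of selectable models. getOpenAIModels stands in for the
// service method that fetches models dynamically; env var names are assumptions.
async function listModels(
  env: Record<string, string | undefined>,
  getOpenAIModels: () => Promise<ProviderModel[]>,
): Promise<ProviderModel[]> {
  const models: ProviderModel[] = [];

  if (env.OPENAI_API_KEY) {
    try {
      models.push(...(await getOpenAIModels()));
    } catch {
      // Dynamic fetch failed: use the hardcoded list instead.
      models.push(...OPENAI_FALLBACK);
    }
  }

  // Other providers are included only when their API key is present.
  if (env.ANTHROPIC_API_KEY) {
    models.push(...ANTHROPIC_LIST);
  }

  return models;
}
```

In a NestJS setup, the injected function would typically be a method on the injected service, which is why the module import matters for dependency injection.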

UI Improvement:

  • Improved the select dropdown in SelectContent by limiting its maximum height and enabling vertical scrolling, enhancing usability when many models are available.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>