Optimize image + text prompt ordering for better results #384

tberends · 2025-07-23T08:12:57Z

Description

This PR improves the ordering of content in requests that combine images with text prompts. Following Google's Gemini API best practices, text prompts are now placed after image parts in the contents array when using a single image with text.

Type of change

Bug fix (non-breaking change which fixes an issue)

How has this change been tested, please provide a testcase or example of how you tested the change?

According to the Gemini API documentation on image prompts, when using a single image with text, the recommended approach is to place the text prompt after the image part in the contents array. This ordering has been shown to produce significantly better results in practice.

In our testing with object detection on Piping and Instrumentation Diagrams (P&IDs), this reordering drastically improved the accuracy of bounding box positioning. The object labels were already accurate, but the spatial precision of the detected elements improved considerably with the optimized prompt ordering.
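
For illustration, the ordering described above can be sketched as follows. This is a minimal sketch, not the code changed in this PR: the helper name `build_contents` is hypothetical, and the commented usage assumes the `google-genai` Python SDK with a placeholder model name, file path, and prompt.

```python
from typing import Any, List


def build_contents(image: Any, prompt: str) -> List[Any]:
    """Build a single-image request payload with the text prompt placed last.

    Hypothetical helper for illustration: the image part comes first and
    the text prompt follows it in the contents list.
    """
    return [image, prompt]


# Assumed usage with the google-genai SDK (not executed here; the model
# name, image path, and prompt are placeholders):
#
#   from google import genai
#   from PIL import Image
#
#   client = genai.Client(api_key="YOUR_API_KEY")
#   response = client.models.generate_content(
#       model="gemini-2.5-flash",
#       contents=build_contents(Image.open("diagram.png"), "Detect all valves."),
#   )
```

The only point of the sketch is the list order: image first, text prompt second. Everything else about the request stays the same.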

review-notebook-app · 2025-07-23T08:13:02Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

CLAassistant · 2025-07-29T05:56:57Z

All committers have signed the CLA.

SkalskiP · 2025-07-29T13:14:44Z

Hi @tberends 👋🏻 Thanks a lot for this PR. It has been merged. Would you be willing to update the sv.Detections.from_vlm section of the supervision docs? We share prompting tips there, and I think adding this information would make a lot of sense.

Optimize image + text prompt ordering for better results

260606c

SkalskiP approved these changes Jul 29, 2025

SkalskiP merged commit c7988b1 into roboflow:main Jul 29, 2025
1 check passed
