feat: support multiple input images for img2img models #861
📝 Walkthrough
Adds multi-image support to the img2img pipeline. Introduces `Img2ImgModels._multi_image_models()` and `validate_multi_image_models()`, which enforce model compatibility and a 20-image limit. `img2img()` signature changed from single-image params to
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
daras_ai_v2/stable_diffusion.py (1)
578-604: Bug: `width`/`height` uses last image dimensions; dead code in conditional.
Two issues in the GPT Image 1 multi-image handling:
- Size inconsistency: The `size` parameter at line 602 uses `width` and `height` from the last processed image. If images have different aspect ratios, this may cause unexpected behavior since all outputs will use the last image's computed dimensions.
- Dead code: The conditional at line 583 checks for `Img2ImgModels.dall_e.name`, but this code is inside the `Img2ImgModels.gpt_image_1.name` case block (line 575), so this branch will never execute.
🔎 Proposed fix
```diff
 case Img2ImgModels.gpt_image_1.name:
     from openai import NOT_GIVEN, OpenAI

     payload_input_images = []
+    # Use first image dimensions for consistent output size
+    first_image_bytes = init_image_bytes[0]
+    first_height, first_width, _ = bytes_to_cv2_img(first_image_bytes).shape
+    _resolution_check(first_width, first_height)
+    width, height = _get_gpt_image_1_img_size(first_width, first_height)
+
     for idx, image_bytes in enumerate(init_image_bytes):
-        init_height, init_width, _ = bytes_to_cv2_img(image_bytes).shape
-        _resolution_check(init_width, init_height)
-
-        if selected_model == Img2ImgModels.dall_e.name:
-            edge = _get_dall_e_img_size(init_width, init_height)
-            width, height = edge, edge
-            response_format = "b64_json"
-        else:
-            width, height = _get_gpt_image_1_img_size(init_width, init_height)
-            response_format = NOT_GIVEN
-
         image = resize_img_pad(image_bytes, (width, height))
         image = rgb_img_to_rgba(image)
         payload_input_images.append((f"image_{idx}.png", image))

     client = OpenAI()
     with capture_openai_content_policy_violation():
         response = client.images.edit(
             model=img2img_model_ids[Img2ImgModels[selected_model]],
             prompt=prompt,
             image=payload_input_images,
             n=num_outputs,
             size=f"{width}x{height}",
-            response_format=response_format,
+            response_format=NOT_GIVEN,
             quality=gpt_image_1_quality,
         )
```
🧹 Nitpick comments (1)
daras_ai_v2/stable_diffusion.py (1)
521-522: Fix implicit `Optional` type annotation.
Per PEP 484, using `= None` without explicitly including `None` in the type is prohibited. The type should explicitly include `None`.
🔎 Proposed fix
```diff
     init_images: str | list[str],
-    init_image_bytes: bytes | list[bytes] = None,
+    init_image_bytes: bytes | list[bytes] | None = None,
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- daras_ai_v2/stable_diffusion.py (8 hunks)
- recipes/GoogleImageGen.py (3 hunks)
- recipes/Img2Img.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{py,js,ts,tsx,java,cs,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/devs-rules.mdc)
Format code in reverse topological order: place the main() function at the top and dependencies below it
Files:
- recipes/Img2Img.py
- daras_ai_v2/stable_diffusion.py
- recipes/GoogleImageGen.py
🧬 Code graph analysis (3)
recipes/Img2Img.py (2)
- daras_ai_v2/stable_diffusion.py (2): validate_multi_image_models (888-901), Img2ImgModels (100-138)
- daras_ai_v2/safety_checker.py (1): safety_checker (13-19)
daras_ai_v2/stable_diffusion.py (2)
- daras_ai/image_input.py (2): bytes_to_cv2_img (102-112), resize_img_pad (21-26)
- daras_ai_v2/exceptions.py (1): UserError (58-65)
recipes/GoogleImageGen.py (1)
- daras_ai_v2/stable_diffusion.py (2): validate_multi_image_models (888-901), Img2ImgModels (100-138)
🪛 Ruff (0.14.8)
recipes/Img2Img.py
174-174: Probable use of requests call without timeout
(S113)
daras_ai_v2/stable_diffusion.py
522-522: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
896-898: Avoid specifying long messages outside the exception class
(TRY003)
901-901: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (11)
recipes/GoogleImageGen.py (1)
168-179: LGTM! The parameter rename from `init_image` to `init_images` aligns with the updated `img2img` signature, and passing a single URL string is correctly handled by the normalization logic in `img2img`.
recipes/Img2Img.py (4)
46-46: LGTM! The type annotation update to `HttpUrlStr | list[HttpUrlStr] | None` correctly reflects the new multi-image input capability.
103-104: LGTM! Enabling `accept_multiple_files=True` and restricting to `accept=["image/*"]` correctly supports the multi-image upload feature.
176-183: LGTM! The safety checker correctly validates the text prompt once and iterates through each input image for individual safety checks.
219-231: LGTM! The `img2img` call correctly passes the multi-image parameters. The variable `init_images_bytes` is correctly passed to the `init_image_bytes` parameter, which accepts `bytes | list[bytes]`.
daras_ai_v2/stable_diffusion.py (6)
136-138: LGTM! The `_multi_image_models` classmethod correctly identifies models that support multiple input images.
533-536: LGTM! The normalization logic correctly converts single-value inputs to lists for uniform processing.
549-574: LGTM! Flux Pro Kontext correctly uses only the first image since it doesn't support multiple images, and the validation function prevents multiple images from reaching this code path.
617-636: LGTM! The Nano Banana models correctly pass the full list of `init_images` via `image_urls`, leveraging their multi-image support.
637-656: LGTM! The standard diffusion path correctly passes `init_images` to `call_sd_multi`. For single-image models, the validation ensures only one image is present.
887-901: LGTM! The validation function correctly enforces multi-image model compatibility and the 20-image limit. The detailed error messages are appropriate for user-facing feedback.
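The validation described in the last comment can be sketched roughly as follows. This is not the PR's code: the enum members and values, the membership of `_multi_image_models()`, the error messages, and the local `UserError` class are all assumptions made so the example runs on its own. Only the two rules (single image for non-multi-image models, at most 20 images) come from the review.

```python
from enum import Enum

MAX_INPUT_IMAGES = 20  # limit stated in the review


class UserError(Exception):
    """Local stand-in for daras_ai_v2.exceptions.UserError."""


class Img2ImgModels(Enum):
    # Hypothetical subset; the real enum has more members and other values.
    gpt_image_1 = "gpt_image_1"
    nano_banana = "nano_banana"
    flux_pro_kontext = "flux_pro_kontext"

    @classmethod
    def _multi_image_models(cls) -> set["Img2ImgModels"]:
        # Assumed membership, for illustration only.
        return {cls.gpt_image_1, cls.nano_banana}


def validate_multi_image_models(model: Img2ImgModels, images: list[str]) -> None:
    """Raise UserError if the image count is not allowed for this model."""
    if len(images) > 1 and model not in Img2ImgModels._multi_image_models():
        raise UserError(f"{model.name} accepts a single input image.")
    if len(images) > MAX_INPUT_IMAGES:
        raise UserError(f"At most {MAX_INPUT_IMAGES} input images are supported.")
```

Because the check takes the list that will actually be sent to the model, callers that pick one image out of many should pass only that image.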
```python
if not state["selected_model"]:
    state["output_images"] = image_urls
    return  # Break out of the generator
validate_multi_image_models(Img2ImgModels[request.selected_model], image_urls)
```
Validation mismatch: validates multiple URLs but only uses one image.
The validation checks image_urls (up to 10 search results), but the subsequent logic (lines 135-150) only selects and uses a single image for processing. This will incorrectly reject requests when using single-image models with multiple search results, even though only one image is actually passed to img2img.
Move the validation after the single image is selected, or validate against the actual image being used:
🔎 Proposed fix
```diff
-        validate_multi_image_models(Img2ImgModels[request.selected_model], image_urls)
-
         selected_image_bytes = None
         for selected_image_url in image_urls:
             try:
@@ -145,6 +143,8 @@ class GoogleImageGenPage(BasePage):
         if not selected_image_bytes:
             raise ValueError("Could not find an image! Please try another query?")

+        validate_multi_image_models(Img2ImgModels[request.selected_model], [selected_image_url])
+
         selected_image_url = upload_file_from_bytes(
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In recipes/GoogleImageGen.py around line 132 (and the selection logic at
~135-150), the code validates image_urls (potentially multiple results) but
later only selects a single image for img2img; move or change the validation to
match the actual image used: after the single image is chosen (the code that
picks image_urls[0] or the selected index around lines 135-150) call the
validator with that single image (or use a single-image validation helper)
instead of validating the full image_urls list, or alternatively validate only
the first element before proceeding so the validation reflects the image
actually passed to img2img.
recipes/Img2Img.py
Outdated
```python
init_images_bytes = []
for img_url in init_images:
    init_images_bytes.append(requests.get(img_url).content)
```
Add timeout to HTTP request to prevent indefinite hangs.
The requests.get call lacks a timeout, which can cause the request thread to hang indefinitely if the remote server is unresponsive.
🔎 Proposed fix
```diff
 init_images_bytes = []
 for img_url in init_images:
-    init_images_bytes.append(requests.get(img_url).content)
+    init_images_bytes.append(requests.get(img_url, timeout=30).content)
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
init_images_bytes = []
for img_url in init_images:
    init_images_bytes.append(requests.get(img_url, timeout=30).content)
```
🧰 Tools
🪛 Ruff (0.14.8)
174-174: Probable use of requests call without timeout
(S113)
🤖 Prompt for AI Agents
In recipes/Img2Img.py around lines 172 to 174, the requests.get calls that build
init_images_bytes have no timeout and can hang indefinitely; update the code to
call requests.get(img_url, timeout=10) (or another sensible timeout) and wrap
the request in a try/except catching requests.exceptions.RequestException to
handle timeouts/connection errors (log or raise a descriptive error and skip or
fail gracefully) so the function never blocks forever on unresponsive servers.
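A minimal sketch of the pattern this prompt describes: every download gets a timeout, and network failures are turned into a descriptive error. To keep the example self-contained, the downloader is passed in as a callable; `fetch_images`, the `fetch` parameter, and the local `UserError` class are hypothetical names, not part of the PR.

```python
from typing import Callable, Iterable


class UserError(Exception):
    """Local stand-in for daras_ai_v2.exceptions.UserError."""


def fetch_images(
    urls: Iterable[str],
    fetch: Callable[[str, float], bytes],
    timeout: float = 30.0,
) -> list[bytes]:
    """Download each URL with a timeout; fail with a readable error."""
    images = []
    for url in urls:
        try:
            images.append(fetch(url, timeout))
        except OSError as e:
            # TimeoutError and connection errors are OSError subclasses.
            raise UserError(
                f"Failed to fetch image from {url}. "
                "Please ensure the URL is valid and accessible."
            ) from e
    return images
```

With the `requests` library, the callable could be `lambda url, t: requests.get(url, timeout=t).content`; `requests.exceptions.RequestException` derives from `IOError`, so the `except OSError` clause would cover it as well.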
3903146 to 4b1944f
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
daras_ai_v2/stable_diffusion.py (1)
578-605: Remove dead code checking for `Img2ImgModels.dall_e.name` inside the `gpt_image_1` case block (line 583).
At line 575, the match statement is already in the `case Img2ImgModels.gpt_image_1.name:` block. The condition at line 583 that checks `if selected_model == Img2ImgModels.dall_e.name:` can never be true, since `selected_model` is guaranteed to be `gpt_image_1.name`. This dead code branch should be removed, and the gpt_image_1 sizing logic (lines 588-589) should execute unconditionally.
♻️ Duplicate comments (1)
recipes/Img2Img.py (1)
172-174: The missing timeout issue from the previous review remains unaddressed.
This is a duplicate of the earlier review comment. The `requests.get` call still lacks a timeout parameter, which can cause indefinite hangs if the remote server is unresponsive.
🧹 Nitpick comments (2)
daras_ai_v2/stable_diffusion.py (2)
521-522: Fix PEP 484 style: explicit `None` type.
Line 522 should explicitly include `None` in the type union to comply with PEP 484.
🔎 Proposed fix
```diff
     init_images: str | list[str],
-    init_image_bytes: bytes | list[bytes] = None,
+    init_image_bytes: bytes | list[bytes] | None = None,
```
888-901: LGTM! Multi-image validation correctly implemented.
The validation function properly enforces model compatibility and the 20-image limit with clear, user-friendly error messages.
Optional style improvement: Static analysis suggests avoiding long inline exception messages (TRY003). While the current implementation is clear and maintainable, you could optionally extract these to exception class attributes or constants if you want to strictly follow the style guide.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- daras_ai_v2/stable_diffusion.py (8 hunks)
- recipes/GoogleImageGen.py (3 hunks)
- recipes/Img2Img.py (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- recipes/GoogleImageGen.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{py,js,ts,tsx,java,cs,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/devs-rules.mdc)
Format code in reverse topological order: place the main() function at the top and dependencies below it
Files:
- daras_ai_v2/stable_diffusion.py
- recipes/Img2Img.py
🧬 Code graph analysis (2)
daras_ai_v2/stable_diffusion.py (2)
- daras_ai_v2/safety_checker.py (1): capture_openai_content_policy_violation (62-73)
- daras_ai_v2/exceptions.py (1): UserError (58-65)
recipes/Img2Img.py (2)
- daras_ai_v2/stable_diffusion.py (3): validate_multi_image_models (888-901), Img2ImgModels (100-138), controlnet (663-706)
- daras_ai_v2/safety_checker.py (1): safety_checker (13-19)
🪛 Ruff (0.14.8)
daras_ai_v2/stable_diffusion.py
522-522: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
896-898: Avoid specifying long messages outside the exception class
(TRY003)
901-901: Avoid specifying long messages outside the exception class
(TRY003)
recipes/Img2Img.py
174-174: Probable use of requests call without timeout
(S113)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (11)
recipes/Img2Img.py (6)
21-21: LGTM! Multi-image type support added correctly. The import and type signature changes properly enable multi-image input support.
Also applies to: 46-46
103-104: LGTM! File uploader configured for multiple images. The uploader correctly accepts multiple image files with appropriate MIME type restrictions.
166-170: LGTM! Input normalization and validation properly implemented. The code correctly normalizes single inputs to lists and validates model compatibility before processing.
179-183: LGTM! Safety checks correctly applied per-image. The safety checker is properly invoked once for the text prompt and once for each input image.
195-204: LGTM! Multi-image support for instruct_pix2pix. The function call correctly passes the list of images to the instruct_pix2pix pipeline.
206-233: LGTM! Multi-image parameters correctly propagated to downstream functions. The controlnet call properly replicates images for each controlnet model, and the img2img call correctly passes both image URLs and bytes as lists along with the quality parameter.
daras_ai_v2/stable_diffusion.py (5)
136-138: LGTM! Multi-image model registry correctly implemented. The classmethod properly defines which models support multiple input images, following the existing pattern for model categorization.
533-536: LGTM! Input normalization correctly handles both single and multiple inputs. The normalization logic ensures consistent list-based processing regardless of whether single or multiple inputs are provided.
550-574: LGTM! Single-image handling for flux_pro_kontext is correct. The code correctly uses `init_images[0]` since flux_pro_kontext is not in the multi-image models set. The validation at the caller level ensures only one image is provided for this model.
617-636: LGTM! Multi-image handling for nano_banana models is correct. The code properly passes the list of image URLs to the nano_banana models, which support multiple input images.
637-656: LGTM! Standard diffusion models correctly receive single-image list. The code passes `init_images` as a list to the standard diffusion pipeline. Since these models aren't in `_multi_image_models()`, validation ensures the list contains exactly one image.
Actionable comments posted: 0
♻️ Duplicate comments (1)
recipes/Img2Img.py (1)
217-221: Add timeout and improve error handling for HTTP requests.
The `requests.get()` call lacks a timeout, which can cause indefinite hangs if the server is unresponsive. Additionally, while `raise_for_status()` is called, it would be more robust to wrap the request in a try-except block to handle connection errors gracefully.
🔎 Proposed fix
```diff
 init_images_bytes = []
 for img_url in init_images:
-    r = requests.get(img_url)
-    raise_for_status(r, is_user_url=True)
-    init_images_bytes.append(r.content)
+    try:
+        r = requests.get(img_url, timeout=30)
+        raise_for_status(r, is_user_url=True)
+        init_images_bytes.append(r.content)
+    except requests.exceptions.RequestException as e:
+        raise UserError(
+            f"Failed to fetch image from {img_url}. Please ensure the URL is valid and accessible."
+        ) from e
```
Based on past review comments and static analysis hints.
🧹 Nitpick comments (1)
daras_ai_v2/stable_diffusion.py (1)
521-522: Make Optional type explicit in function signature.
The type annotation for `init_image_bytes` should explicitly include `None` to comply with PEP 484.
🔎 Proposed fix
```diff
     init_images: str | list[str],
-    init_image_bytes: bytes | list[bytes] = None,
+    init_image_bytes: bytes | list[bytes] | None = None,
```
Based on static analysis hints.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- daras_ai_v2/stable_diffusion.py (9 hunks)
- recipes/Img2Img.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{py,js,ts,tsx,java,cs,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/devs-rules.mdc)
Format code in reverse topological order: place the main() function at the top and dependencies below it
Files:
- recipes/Img2Img.py
- daras_ai_v2/stable_diffusion.py
🧬 Code graph analysis (2)
recipes/Img2Img.py (5)
- daras_ai_v2/exceptions.py (2): UserError (58-65), raise_for_status (19-49)
- daras_ai_v2/stable_diffusion.py (2): validate_multi_image_models (886-899), Img2ImgModels (100-138)
- daras_ai_v2/safety_checker.py (1): safety_checker (13-19)
- daras_ai_v2/wa_markdown_renderer.py (1): image (84-88)
- widgets/workflow_search.py (1): get (57-58)
daras_ai_v2/stable_diffusion.py (2)
- daras_ai_v2/safety_checker.py (1): capture_openai_content_policy_violation (62-73)
- daras_ai_v2/exceptions.py (1): UserError (58-65)
🪛 Ruff (0.14.8)
recipes/Img2Img.py
219-219: Probable use of requests call without timeout
(S113)
daras_ai_v2/stable_diffusion.py
522-522: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
894-896: Avoid specifying long messages outside the exception class
(TRY003)
899-899: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (10)
daras_ai_v2/stable_diffusion.py (4)
136-138: LGTM! Multi-image model definition is clear. The classmethod correctly identifies the three models that support multiple input images.
886-899: LGTM! Validation logic is sound. The function correctly validates that:
- Models not supporting multi-image receive only a single image
- Maximum of 20 images is enforced
The error messages provide clear guidance to users.
533-536: Guard against wrapping None in a list.
If `init_image_bytes` is `None`, line 536 will create `[None]` instead of keeping it as `None` or an empty list. This could cause issues downstream when the code expects actual bytes objects.
🔎 Proposed fix
```diff
 if isinstance(init_images, str):
     init_images = [init_images]
-if isinstance(init_image_bytes, bytes):
-    init_image_bytes = [init_image_bytes]
+if init_image_bytes is not None:
+    if isinstance(init_image_bytes, bytes):
+        init_image_bytes = [init_image_bytes]
```
Likely an incorrect or invalid review comment.
578-593: The code properly supports multi-image handling for GPT Image 1 API.
The OpenAI `images.edit()` API accepts both single images and arrays of images. The code correctly formats multiple images as a list of tuples `(filename, image_data)`, which is the proper multipart form-data representation that the OpenAI Python client expects. No API incompatibility exists.
recipes/Img2Img.py (6)
46-46: LGTM! Field type correctly supports multi-image input. The updated type annotation properly handles single URLs, lists of URLs, and None values.
103-104: LGTM! UI now accepts multiple image uploads. The file uploader correctly enables multi-file selection with appropriate MIME type filtering.
166-170: LGTM! Input normalization and validation are properly sequenced. The code correctly normalizes single images to lists before validating multi-image model compatibility.
226-227: LGTM! Multi-image parameters correctly passed to img2img. The updated call properly passes lists of images and image bytes along with the new `gpt_image_1_quality` parameter.
175-179: Clarify the intent of conditional safety checks.
The text prompt safety check is conditional on `request.text_prompt` existing (lines 175-176), while the image safety checks iterate only over items in `init_images` (lines 178-179). Both are conditional in different ways. Confirm whether this asymmetric approach is intentional: specifically, whether text prompts should always be validated if provided, or if the current behavior (only checking when the prompt is non-empty) is by design.
202-202: Duplication of images by number of ControlNet models is correct.
When multiple ControlNets are selected, each model requires a conditioning image for proper batching. The code correctly duplicates the input image so each ControlNet receives it as guidance. This is the expected behavior for the common use case where a single image guides all ControlNet models.
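The duplication described in the last comment amounts to repeating the input image once per selected ControlNet. A small sketch, with a hypothetical helper name and illustrative model names (the PR performs this inline when calling `controlnet`):

```python
def replicate_for_controlnets(image_url: str, controlnet_models: list[str]) -> list[str]:
    """Return one conditioning image per ControlNet model, all the same URL."""
    return [image_url] * len(controlnet_models)
```

For example, with two selected ControlNet models the pipeline would receive a two-element list containing the same URL twice, so each model has its own conditioning input.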
Q/A checklist
How to check import time?
You can visualize this using tuna:
To measure import time for a specific library:
To reduce import times, import libraries that take a long time inside the functions that use them instead of at the top of the file:
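A minimal illustration of a deferred import, using the standard library's `statistics` module as a stand-in for a slow third-party library. The function name is hypothetical.

```python
def summarize(values: list[float]) -> float:
    """Return the median of values."""
    # Imported on first call instead of at module load, so importing the
    # enclosing module stays fast. Later calls reuse the cached module.
    import statistics

    return statistics.median(values)
```

The trade-off is that the import cost moves to the first call of the function, and import errors surface at call time rather than at startup.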
Legal Boilerplate
Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.