-
Notifications
You must be signed in to change notification settings - Fork 6.3k
[wip] unbloat QwenImage pipelines by introducing a pipeline-specific mixins to hold common methods #12322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good 👍🏽 Still on the fence on whether we use multiple levels of inheritance with QwenImageMixin. But other than that this looks good to me.
Perhaps we try with to do the same with Flux and SDXL to see if some common patterns show up for these Mixins.
return latents | ||
|
||
|
||
class QwenImagePipelineMixin(QwenImageMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also add additional Mixins here such as LoRA
With you. I am also on the fence. Since the mixins for QwenImage and QwenImageEdit primarily differ in But we can definitely uniformize the
Yes, I will work on that. |
|
||
return split_result | ||
|
||
def _get_qwen_prompt_embeds( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decided to unify it to further reduce LOC. It's a private method anyway, so, we should be safe.
|
||
|
||
class QwenImagePipelineMixin(QwenImageMixin): | ||
def encode_prompt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only differing encode_prompt()
to not introduce any breaking change.
What does this PR do?
As discussed with @DN6, this is a POC PR to evaluate the idea of a pipeline-specific Mixin class that would hold common methods shared across different task-specific pipelines.
Dhruv, some things to note when you review:
There's only a few differences betweenKeptencode_prompt()
of QwenImage andencode_prompt()
of QwenImageEdit: https://www.diffchecker.com/z7nJMsP4/. I believe we can at least turn the_get_qwen_prompt_embeds()
method into one but readability may be compromised due to that. Hence, I have chosen to haveQwenImagePipelineMixin
andQwenImageEditPipelineMixin
. Both of them subclass fromQwenImageMixin
, differing only inencode_prompt()
and_get_qwen_prompt_embeds()
.encode_prompt()
varying while unifying the_get_qwen_prompt_embeds()
method inQwenImageMixin
. Comments are in line.prepare_image()
is shared byQwenImageControlNetInpaintPipeline
andQwenImageControlNetPipeline
but not others. We can choose to move them topipeline_qwen_utils.py
but I am not strongly opinionated on this. Same applies for methods likeprepare_mask_latents()
. My preference here is to keep them as is for now.