How does the training of gligen work? #10709
Unanswered
Putzzmunta asked this question in Q&A
Hi everybody,
I'm having a hard time understanding how and why the training script of the GLIGEN example works.
In the `StableDiffusionGLIGENTextImagePipeline`, the following attributes are listed for the cross attention:
Whereas in `train_gligen_text.py`, the following ones are listed:
How does this work when the attributes are different?
And then I have another question:
Since I have a custom dataset, I needed to make some adjustments to the example code. While doing that, I stumbled over the way the GLIGEN phrases get encoded.
I wrote my own custom method, which I call from the dataset's `__getitem__` method, to encode the phrases for each box:
I don't understand why there is a difference between how the GLIGEN pipeline does this encoding and how it's done in the example project.
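To make the per-box encoding step concrete, here is a minimal sketch of one way it could look. It is not the code from the pipeline or from the example project: `pad_grounding_inputs`, `toy_encoder`, and the `max_objs` cap are hypothetical names and values chosen for illustration, and plain Python lists stand in for tensors. The assumption is that each phrase is turned into one fixed-size vector and that boxes, embeddings, and a validity mask are padded to a fixed number of object slots.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def pad_grounding_inputs(
    phrases: Sequence[str],
    boxes: Sequence[Sequence[float]],
    encode_phrase: Callable[[str], Vector],  # hypothetical: one vector per phrase
    embed_dim: int,
    max_objs: int = 30,  # assumed cap on grounded objects per image
) -> Tuple[List[Vector], List[Vector], List[int]]:
    """Pad per-box data to max_objs slots and build a 0/1 validity mask."""
    if len(phrases) != len(boxes):
        raise ValueError("expected exactly one phrase per box")

    # Drop anything beyond the cap so every sample has the same shape.
    phrases = list(phrases)[:max_objs]
    boxes = list(boxes)[:max_objs]

    # Start with all-zero slots; unused slots stay zero and get mask 0.
    padded_boxes = [[0.0] * 4 for _ in range(max_objs)]
    padded_embeds = [[0.0] * embed_dim for _ in range(max_objs)]
    masks = [0] * max_objs

    for i, (phrase, box) in enumerate(zip(phrases, boxes)):
        if len(box) != 4:
            raise ValueError("each box needs four coordinates")
        vec = list(encode_phrase(phrase))
        if len(vec) != embed_dim:
            raise ValueError("encoder returned a vector of the wrong size")
        padded_boxes[i] = [float(v) for v in box]
        padded_embeds[i] = vec
        masks[i] = 1

    return padded_boxes, padded_embeds, masks


def toy_encoder(phrase: str) -> Vector:
    # Stand-in for a real text encoder; returns a fixed-size dummy vector.
    return [float(len(phrase)), 1.0, 0.0, 0.0]


boxes_out, embeds_out, masks_out = pad_grounding_inputs(
    phrases=["a cat", "a red ball"],
    boxes=[[0.1, 0.2, 0.5, 0.6], [0.4, 0.4, 0.9, 0.9]],
    encode_phrase=toy_encoder,
    embed_dim=4,
    max_objs=3,
)
```

In a real dataset, `toy_encoder` would be replaced by the actual text encoder call. Which encoder output is used there (for example a pooled vector versus per-token hidden states) is exactly the kind of detail that has to match between training and inference.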
Thank you all in advance :)