Allow models to run without all text encoder(s) #645
Conversation
@rmatif is your comment for this PR specifically? ... it sounds kind of unrelated. BTW, did you try one of the Flux 8B "lite" prunes?

@Green-Sky I think @rmatif meant that with this PR it's possible to drop T5, which makes Flux fit in only 8 GB of system memory.

This is exactly what I meant, sorry if I wasn't clear. With this PR we can drop the heavy T5, so we can squeeze Flux onto a phone with just 8 GB. @Green-Sky I just tested Flux.1-lite, and the q4_k version can now also fit on those kinds of devices, although you can't run inference at resolutions larger than 512x512 due to the compute buffer; I bet q3_k will do just fine, though.
Force-pushed from 1562f0f to 4096d99
Force-pushed from 4096d99 to 9a2ef28
Thank you for your contribution.


For now this only covers Flux and SD3.x.
Instead of crashing when a text encoder is missing, it just prints a warning and proceeds without it.
TODOs (maybe in follow-up PRs):
Comparisons:
SD3.5 Large Turbo (iq4_nl):
With t5_xxl:
Without t5_xxl:
Flux Schnell (iq4_nl imatrix):