What happened
Hello,
I'm experiencing an issue with Flux models running extremely slowly in Invoke. I'm using Invoke 5.6.0, installed via the Launcher. For comparison, ComfyUI takes approximately 3.2–3.3 seconds per iteration at a resolution of 832×1216, while the same resolution in Invoke takes a staggering ~20 seconds per iteration, and I eventually hit an OOM error. This is despite ComfyUI consuming significantly less VRAM, RAM, and swap. Even Stable Diffusion WebUI Forge runs at roughly the same speed as ComfyUI, although it uses noticeably more RAM, which sometimes leads to OOM errors when I use multiple LoRAs.
I'm using the following models in ComfyUI:
flux1-dev-Q8_0.gguf
majicFlus 麦橘超然
t5xxl_fp16.safetensors
clip_l.safetensors
ae.safetensors
I attempted to import these text encoders into Invoke, but was unsuccessful, so I loaded the equivalents provided by Invoke instead.
I've also tried adding enable_partial_loading: true to the invokeai.yaml file and experimenting with various combinations of max_cache_ram_gb, max_cache_vram_gb, and device_working_mem_gb, but nothing makes a fundamental difference. (Note: "Nvidia sysmem fallback" is also disabled in the driver settings.)
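For reference, this is roughly what the relevant part of my invokeai.yaml looked like during these experiments. The exact values varied from run to run; the numbers below are just one example combination I tried, not settings I'm recommending:

```yaml
# invokeai.yaml (example values from one test run; I tried several combinations)
enable_partial_loading: true
max_cache_ram_gb: 28       # also tried lower values
max_cache_vram_gb: 5       # also tried other values and leaving it unset
device_working_mem_gb: 3   # example value; I varied this as well
```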
I really enjoy using Invoke—I periodically download it to see how it evolves—but I always end up returning to ComfyUI because even the SDXL models in Invoke run slightly slower, and I encounter OOM errors at resolutions where ComfyUI doesn’t even require tiling. I’d love to use Invoke because of its pleasant UI and fantastic inpainting capabilities, but unfortunately, optimization for non-high-end configurations still leaves much to be desired. If needed, I can provide additional information to help diagnose the issue.
Thank you for your attention.
What you expected to happen
I acknowledge that my configuration (8GB of VRAM and 32GB of RAM) isn't ideal for the demanding Flux models. Nevertheless, both ComfyUI and Forge handle these models at speeds that are acceptable to me. I would like to confirm that the issue isn't on my end, and I hope to see better optimization in Invoke. This is especially important because img2img at higher SDXL resolutions and upscaling consistently lead to OOM errors in Invoke, whereas ComfyUI performs swiftly at even higher resolutions with lower resource usage. Additionally, with tiled VAE decoding, ComfyUI can handle resolutions that are extraordinarily high by Invoke's standards on a system with just 8GB of VRAM.
How to reproduce the problem
1. Use an Nvidia GPU with 8GB of VRAM and 32GB of RAM.
2. Do a standard install of Invoke 5.6.0 via the Launcher.
3. Disable "Nvidia sysmem fallback" in the Nvidia driver settings.
4. Add enable_partial_loading: true to the invokeai.yaml file.
5. Generate an 832x1216 or 1024x1024 image to the canvas using flux1-dev-Q8_0.gguf or majicFlus 麦橘超然 (any other Flux model of the same size can be used), together with t5xxl_fp16, clip_l, and the VAE.
Additional context
No response
Discord username
No response
Yaruze66 changed the title from "Flux Extremely Slow in Invoke Compared to ComfyUI and Forge" to "[bug]: Flux Extremely Slow in Invoke Compared to ComfyUI and Forge" on Feb 6, 2025.
I conducted additional tests using SDXL models in Invoke (versions 5.6.0 and 5.6.1rc1), as Flux models failed to generate any output. Below are my findings compared to ComfyUI:
At 832x1216, Invoke matches ComfyUI's generation speed (~1.90 it/s) and VRAM/RAM usage only when enable_partial_loading is left disabled. Adding LoRAs under these conditions has minimal impact on speed.
Enabling enable_partial_loading: true reduces speed to 1.3-1.4 it/s.
Adding max_cache_ram_gb: 28 on top of that improves speed to ~1.60 it/s.
Adding max_cache_vram_gb: 5 as well further increases speed to ~1.8 it/s, but introduces OOM errors during img2img upscaling (even at x1.5). In contrast, ComfyUI handles x2.5 upscales (with dedicated upscale models, not img2img) without issues. This combination is written out below.
With enable_partial_loading: true active, adding multiple LoRAs drops generation speed to 1 it/s. Experimenting with max_cache_vram_gb and device_working_mem_gb yielded no viable solution, only OOM errors or unacceptably slow speeds.
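For clarity, the combination referred to above, i.e. the one that reaches ~1.8 it/s for SDXL at 832x1216 but OOMs during img2img upscaling, was:

```yaml
# invokeai.yaml — fastest combination I found for SDXL at 832x1216 (~1.8 it/s),
# but it OOMs during img2img upscaling, even at x1.5
enable_partial_loading: true
max_cache_ram_gb: 28
max_cache_vram_gb: 5
```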
I absolutely love the incredible inpainting, outpainting, and all those nifty image editing features that Invoke offers. That said, for me, all this potential is a bit held back by Invoke’s pretty high demands. I’m not exactly a tech expert, and I don’t really understand all the “magic” the Comfy folks work behind the scenes—but for users with limited VRAM, ComfyUI is a real lifesaver. It even lets you crank out high-resolution images with Flux models (FP8/Q8_0 + t5xxl_fp16 + a few LoRAs)!
The only downside is that ComfyUI can’t quite match Invoke’s sleek, user-friendly interface or its mind-blowing inpainting capabilities. I really hope the InvokeAI team takes note and manages to optimize things to at least ComfyUI’s level—if not even better!
Is there an existing issue for this problem?
Operating system
Windows 11 23H2 22631.4751
GPU vendor
Nvidia (CUDA)
GPU model
RTX 2070 Super, driver version: 565.90
GPU VRAM
8GB
Version number
5.6.0
Browser
Google Chrome 132.0.6834.160, Invoke Client
Python dependencies
No response