enable_model_cpu_offload
1 parent 2ecd21b commit e10a9f5
diffusers-quantization.md
@@ -455,8 +455,8 @@ pipe = FluxPipeline.from_pretrained(
 **bnb + `enable_model_cpu_offload`**:
 | Precision | Memory after loading | Peak memory | Inference time |
 |---------------|----------------------|-------------|----------------|
-| 4-bit | 12.584 GB | 17.281 GB | 12 seconds |
-| 8-bit | 19.273 GB | 24.432 GB | 27 seconds |
+| 4-bit | 12.383 GB | 12.383 GB | 17 seconds |
+| 8-bit | 19.182 GB | 23.428 GB | 27 seconds |

 <details>
 <summary>Example (Flux-dev with fp8 layerwise casting + group offloading):</summary>
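For context, the bnb + `enable_model_cpu_offload` setup that the corrected table benchmarks can be sketched roughly as follows. This is a minimal sketch, assuming a recent `diffusers` with `bitsandbytes` installed; the model ID, prompt, and step count are illustrative, not taken from the commit:

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# 4-bit quantization config for the transformer; the 8-bit table row
# would use load_in_8bit=True instead.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Quantize only the transformer (the largest component of Flux).
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # illustrative model ID
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)

# Move each whole sub-model to the GPU only while it is in use; this is
# the setting the "Memory after loading" / "Peak memory" columns measure.
pipe.enable_model_cpu_offload()

image = pipe("a photo of a cat", num_inference_steps=28).images[0]
```

With model-level CPU offload, only one component resides on the GPU at a time, which is why peak memory stays close to the post-loading footprint at the cost of some inference speed.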