[Code documentation] FluxPipeline & FluxControlNetModel with apply_group_offloading #10840
-
FluxControlNetModel - offload_type="leaf_level"

```python
import torch
from diffusers import FluxTransformer2DModel, FluxControlNetModel, FluxControlNetPipeline
from diffusers.hooks import apply_group_offloading
from diffusers.utils import load_image
from transformers import T5EncoderModel

# Load the large components explicitly so they can be passed into the pipeline.
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)
text_encoder_2 = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    torch_dtype=torch.bfloat16,
)
canny_controlnet = FluxControlNetModel.from_pretrained(
    "Xlabs-AI/flux-controlnet-canny-diffusers",
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
)
xlabs_canny_pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    controlnet=canny_controlnet,
    transformer=transformer,
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.bfloat16,
)

# Leaf-level group offloading: weights stay on the CPU and are moved to the
# GPU one leaf module at a time during the forward pass. Apply it to every
# model component of the pipeline.
apply_group_offloading(
    xlabs_canny_pipe.transformer,
    offload_type="leaf_level",
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
)
apply_group_offloading(
    xlabs_canny_pipe.text_encoder,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
)
apply_group_offloading(
    xlabs_canny_pipe.text_encoder_2,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
)
apply_group_offloading(
    xlabs_canny_pipe.vae,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
)
apply_group_offloading(
    xlabs_canny_pipe.controlnet,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
)

control_image = load_image("https://huggingface.co/XLabs-AI/flux-controlnet-canny-diffusers/resolve/main/canny_example.png")
image = xlabs_canny_pipe(
    "A bear with scarf",
    control_image=control_image,
    controlnet_conditioning_scale=0.8,
).images[0]
image.save("flux_controlnet_apply_group_offloading.png")
```
-
FluxPipeline - offload_type="leaf_level" and "block_level" [NOT WORKING: requires a lot of RAM; CUDA OOM with block_level (tested with num_blocks_per_group=1, 2, and 4)]

```python
import torch
from diffusers import FluxPipeline
from diffusers.hooks import apply_group_offloading

model_id = "black-forest-labs/FLUX.1-dev"
dtype = torch.bfloat16
pipe = FluxPipeline.from_pretrained(
    model_id,
    torch_dtype=dtype,
)

# Block-level group offloading: groups of `num_blocks_per_group` blocks are
# moved between CPU and GPU; non_blocking=True requests asynchronous copies.
apply_group_offloading(
    pipe.transformer,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="block_level",
    num_blocks_per_group=4,
    non_blocking=True,
)
apply_group_offloading(
    pipe.text_encoder,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="block_level",
    num_blocks_per_group=4,
    non_blocking=True,
)
apply_group_offloading(
    pipe.text_encoder_2,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="block_level",
    num_blocks_per_group=4,
    non_blocking=True,
)
# The VAE is offloaded per leaf module instead of per block group.
apply_group_offloading(
    pipe.vae,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
)

prompt = "A cat wearing sunglasses and working as a lifeguard at pool."
generator = torch.Generator().manual_seed(181201)
image = pipe(
    prompt,
    width=576,
    height=1024,
    num_inference_steps=30,
    generator=generator,
).images[0]
print("----Inference complete..")
image.save("flux_apply_group_offloading.png")
```
-
How much RAM do you have? I got it to run, but it needs around 50 GB of RAM or you will get an OOM error, and that was just for the transformer model; I encoded the prompt before the generation (a sketch of that flow follows).
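A minimal sketch of that pre-encoding approach, reusing the `pipe` from the example above and the public `FluxPipeline.encode_prompt` helper (illustrative flow, not the commenter's exact code): encode once, drop the text encoders, then denoise with the cached embeddings.

```python
# Sketch: encode the prompt up front so only the transformer and VAE need to
# stay resident during denoising, lowering peak memory.
with torch.no_grad():
    prompt_embeds, pooled_prompt_embeds, _ = pipe.encode_prompt(
        prompt="A cat wearing sunglasses and working as a lifeguard at pool.",
        prompt_2=None,
    )

pipe.text_encoder = None    # free CLIP
pipe.text_encoder_2 = None  # free T5, the large one
torch.cuda.empty_cache()

image = pipe(
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    width=576,
    height=1024,
    num_inference_steps=30,
).images[0]
```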
-
Since the code has been added to the documentation, closing.
-
Note: the output was not verified because generation is very slow on my end, but inference does start running.
FluxPipeline - offload_type="leaf_level"
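The code block for this comment did not survive extraction; based on the examples above, the leaf-level FluxPipeline variant presumably looks like this (a sketch, not the commenter's exact code):

```python
import torch
from diffusers import FluxPipeline
from diffusers.hooks import apply_group_offloading

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)

# Leaf-level offloading for every component, mirroring the ControlNet example.
for module in (pipe.transformer, pipe.text_encoder, pipe.text_encoder_2, pipe.vae):
    apply_group_offloading(
        module,
        offload_device=torch.device("cpu"),
        onload_device=torch.device("cuda"),
        offload_type="leaf_level",
    )

image = pipe(
    "A cat wearing sunglasses and working as a lifeguard at pool.",
    num_inference_steps=30,
).images[0]
image.save("flux_leaf_level.png")
```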