Patch Diffusion can roughly 2x training speed, even on 256x256 ImageNet. If its gains stack with Mosaic Diffusion's, that is potentially a 10x cumulative speedup. The problem is that the two repos have different training scripts, so I'm thinking of porting Mosaic's features into Patch-Diffusion. For now, I'm only asking where to find the relevant code for the features below (rough sketches of what I have in mind for each follow the list):
xFormers + FlashAttention (I'll be trying to swap FlashAttention-1 for FlashAttention-2)
Precomputing latents
Low Precision LayerNorm and GroupNorm
FSDP
Scheduled EMA
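
For the attention swap, this is a minimal sketch of what I mean, not Mosaic's actual code: fall back to xFormers' memory_efficient_attention when the flash-attn 2 package isn't installed. It assumes q/k/v are already projected, half-precision, and shaped (batch, seq_len, n_heads, head_dim), which both APIs expect.

```python
import torch

try:
    from flash_attn import flash_attn_func  # FlashAttention-2 package
    HAS_FLASH2 = True
except ImportError:
    import xformers.ops as xops
    HAS_FLASH2 = False

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q, k, v: (B, S, H, D) fp16/bf16 CUDA tensors."""
    if HAS_FLASH2:
        # flash_attn_func takes (B, S, H, D) and returns the same layout.
        return flash_attn_func(q, k, v, dropout_p=0.0, causal=False)
    # xFormers fallback uses the same (B, S, H, D) layout.
    return xops.memory_efficient_attention(q, k, v)
```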
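For latent precomputation, the idea as I understand it is to run the frozen VAE encoder over the dataset once and cache the latents so training never pays the encoder cost. A sketch, assuming a Stable-Diffusion-style AutoencoderKL from diffusers and its usual 0.18215 scale factor (your VAE and scale may differ; `dataloader` is whatever loader the dataset uses):

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema").cuda().eval()

@torch.no_grad()
def encode_batch(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, H, W) in [-1, 1] -> latents: (B, 4, H/8, W/8)."""
    posterior = vae.encode(images.cuda()).latent_dist
    return posterior.sample() * 0.18215  # SD latent scaling convention

# One-time pass: dump latents to disk, then train on the cached shards.
for i, (images, labels) in enumerate(dataloader):  # dataloader: your dataset
    latents = encode_batch(images)
    torch.save({"latents": latents.cpu(), "labels": labels}, f"latents_{i:06d}.pt")
```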
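For the low-precision norms, my understanding of the trick is: under autocast, PyTorch upcasts LayerNorm/GroupNorm to fp32, and these wrappers force the norm to run in the autocast dtype instead, trading a little precision for speed. Composer ships this as its low-precision LayerNorm/GroupNorm algorithms; this standalone version is only to show the mechanism, not Composer's code.

```python
import torch
import torch.nn as nn

def _cast(t, dtype):
    return t.to(dtype) if t is not None else None

class LPLayerNorm(nn.LayerNorm):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not torch.is_autocast_enabled():
            return super().forward(x)
        dtype = torch.get_autocast_gpu_dtype()
        # Disable autocast so F.layer_norm is not re-upcast to fp32.
        with torch.autocast(device_type=x.device.type, enabled=False):
            return nn.functional.layer_norm(
                x.to(dtype), self.normalized_shape,
                _cast(self.weight, dtype), _cast(self.bias, dtype), self.eps)

class LPGroupNorm(nn.GroupNorm):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not torch.is_autocast_enabled():
            return super().forward(x)
        dtype = torch.get_autocast_gpu_dtype()
        with torch.autocast(device_type=x.device.type, enabled=False):
            return nn.functional.group_norm(
                x.to(dtype), self.num_groups,
                _cast(self.weight, dtype), _cast(self.bias, dtype), self.eps)
```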
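For FSDP, Composer normally handles the wrapping via its fsdp_config, so in a hand-rolled Patch-Diffusion script the raw PyTorch equivalent would look roughly like this (assuming a torchrun launch; `DiffusionBlock` is a placeholder for whatever block class the real model uses):

```python
import functools
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

class DiffusionBlock(nn.Module):  # placeholder for the real UNet/DiT block
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(256, 256)
    def forward(self, x):
        return self.proj(x)

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Sequential(*[DiffusionBlock() for _ in range(4)]).cuda()
model = FSDP(
    model,
    # Shard at the block level rather than wrapping the whole model once.
    auto_wrap_policy=functools.partial(
        transformer_auto_wrap_policy, transformer_layer_cls={DiffusionBlock}),
    mixed_precision=MixedPrecision(param_dtype=torch.bfloat16),
)
```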
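For scheduled EMA, my reading of Mosaic's trick is to skip EMA updates (and the shadow-weight memory traffic) for most of training and only start averaging near the end. A sketch under that assumption; `ema_start_frac` and the decay value here are my guesses, not Mosaic's exact settings:

```python
import copy
import torch

class ScheduledEMA:
    def __init__(self, model, total_steps: int, ema_start_frac: float = 0.8,
                 decay: float = 0.9999):
        self.start_step = int(total_steps * ema_start_frac)
        self.decay = decay
        self.ema_model = None  # shadow weights allocated lazily
        self._src = model

    @torch.no_grad()
    def update(self, step: int):
        if step < self.start_step:
            return  # EMA disabled for the early phase of training
        if self.ema_model is None:
            # Initialize shadow weights from the live model at the start step.
            self.ema_model = copy.deepcopy(self._src).eval()
            for p in self.ema_model.parameters():
                p.requires_grad_(False)
            return
        for ema_p, p in zip(self.ema_model.parameters(), self._src.parameters()):
            ema_p.lerp_(p, 1.0 - self.decay)  # ema = decay*ema + (1-decay)*p
```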