Patch Diffusion can roughly 2x training speed, even on 256x256 ImageNet. If its gains stack with Mosaic Diffusion's, that is potentially a 10x cumulative speedup. The problem is that the two repos have different training scripts, so I'm thinking of porting Mosaic's features into Patch-Diffusion. For now, I'm only asking where to find the relevant code for the features below (rough sketches of what I have in mind for each follow the list):
xFormers + FlashAttention (I'll be trying to swap FlashAttention-1 for FlashAttention-2)
Precomputing latents
Low Precision LayerNorm and GroupNorm
FSDP
Scheduled EMA
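
For the attention swap, this is a minimal sketch of what I mean, not Mosaic's actual code: fall back to xFormers' memory_efficient_attention when the flash-attn 2 package isn't installed. It assumes q/k/v are already projected, half-precision, and shaped (batch, seq_len, n_heads, head_dim), which both APIs expect.

```python
import torch

try:
    from flash_attn import flash_attn_func  # FlashAttention-2 package
    HAS_FLASH2 = True
except ImportError:
    import xformers.ops as xops
    HAS_FLASH2 = False

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q, k, v: (B, S, H, D) fp16/bf16 CUDA tensors."""
    if HAS_FLASH2:
        # flash_attn_func takes (B, S, H, D) and returns the same layout.
        return flash_attn_func(q, k, v, dropout_p=0.0, causal=False)
    # xFormers fallback uses the same (B, S, H, D) layout.
    return xops.memory_efficient_attention(q, k, v)
```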
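For latent precomputation, the idea as I understand it is to run the frozen VAE encoder over the dataset once and cache the latents so training never pays the encoder cost. A sketch, assuming a Stable-Diffusion-style AutoencoderKL from diffusers and its usual 0.18215 scale factor (your VAE and scale may differ; `dataloader` is whatever loader the dataset uses):

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema").cuda().eval()

@torch.no_grad()
def encode_batch(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, H, W) in [-1, 1] -> latents: (B, 4, H/8, W/8)."""
    posterior = vae.encode(images.cuda()).latent_dist
    return posterior.sample() * 0.18215  # SD latent scaling convention

# One-time pass: dump latents to disk, then train on the cached shards.
for i, (images, labels) in enumerate(dataloader):  # dataloader: your dataset
    latents = encode_batch(images)
    torch.save({"latents": latents.cpu(), "labels": labels}, f"latents_{i:06d}.pt")
```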
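For the low-precision norms, my understanding of the trick is: under autocast, PyTorch upcasts LayerNorm/GroupNorm to fp32, and these wrappers force the norm to run in the autocast dtype instead, trading a little precision for speed. Composer ships this as its low-precision LayerNorm/GroupNorm algorithms; this standalone version is only to show the mechanism, not Composer's code.

```python
import torch
import torch.nn as nn

def _cast(t, dtype):
    return t.to(dtype) if t is not None else None

class LPLayerNorm(nn.LayerNorm):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not torch.is_autocast_enabled():
            return super().forward(x)
        dtype = torch.get_autocast_gpu_dtype()
        # Disable autocast so F.layer_norm is not re-upcast to fp32.
        with torch.autocast(device_type=x.device.type, enabled=False):
            return nn.functional.layer_norm(
                x.to(dtype), self.normalized_shape,
                _cast(self.weight, dtype), _cast(self.bias, dtype), self.eps)

class LPGroupNorm(nn.GroupNorm):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not torch.is_autocast_enabled():
            return super().forward(x)
        dtype = torch.get_autocast_gpu_dtype()
        with torch.autocast(device_type=x.device.type, enabled=False):
            return nn.functional.group_norm(
                x.to(dtype), self.num_groups,
                _cast(self.weight, dtype), _cast(self.bias, dtype), self.eps)
```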
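For FSDP, Composer normally handles the wrapping via its fsdp_config, so in a hand-rolled Patch-Diffusion script the raw PyTorch equivalent would look roughly like this (assuming a torchrun launch; `DiffusionBlock` is a placeholder for whatever block class the real model uses):

```python
import functools
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

class DiffusionBlock(nn.Module):  # placeholder for the real UNet/DiT block
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(256, 256)
    def forward(self, x):
        return self.proj(x)

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Sequential(*[DiffusionBlock() for _ in range(4)]).cuda()
model = FSDP(
    model,
    # Shard at the block level rather than wrapping the whole model once.
    auto_wrap_policy=functools.partial(
        transformer_auto_wrap_policy, transformer_layer_cls={DiffusionBlock}),
    mixed_precision=MixedPrecision(param_dtype=torch.bfloat16),
)
```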
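For scheduled EMA, my reading of Mosaic's trick is to skip EMA updates (and the shadow-weight memory traffic) for most of training and only start averaging near the end. A sketch under that assumption; `ema_start_frac` and the decay value here are my guesses, not Mosaic's exact settings:

```python
import copy
import torch

class ScheduledEMA:
    def __init__(self, model, total_steps: int, ema_start_frac: float = 0.8,
                 decay: float = 0.9999):
        self.start_step = int(total_steps * ema_start_frac)
        self.decay = decay
        self.ema_model = None  # shadow weights allocated lazily
        self._src = model

    @torch.no_grad()
    def update(self, step: int):
        if step < self.start_step:
            return  # EMA disabled for the early phase of training
        if self.ema_model is None:
            # Initialize shadow weights from the live model at the start step.
            self.ema_model = copy.deepcopy(self._src).eval()
            for p in self.ema_model.parameters():
                p.requires_grad_(False)
            return
        for ema_p, p in zip(self.ema_model.parameters(), self._src.parameters()):
            ema_p.lerp_(p, 1.0 - self.decay)  # ema = decay*ema + (1-decay)*p
```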