Implementing Mosaic Diffusion into Patch-Diffusion #121

Open
5thGenDev opened this issue Feb 18, 2024 · 0 comments
5thGenDev commented Feb 18, 2024

Patch-Diffusion can roughly 2x training speed, even on 256x256 ImageNet. If its speedup stacks with Mosaic Diffusion's, that is potentially a ~10x cumulative boost. The catch is that the two projects have different training scripts, so I'm planning to port Mosaic's features into Patch-Diffusion by hand. For now I'm only asking where to find the relevant code for:

  1. xFormers + FlashAttention - I'll be trying to swap FlashAttention-1 for FlashAttention-2
  2. Precomputing latents (rough sketch of what I mean below)
  3. Low-precision LayerNorm and GroupNorm (also sketched below)
  4. FSDP
  5. Scheduled EMA
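
To show what I'm after, here are rough sketches of items 2 and 3 as I understand them. These are my own minimal versions, not Mosaic's code; the diffusers import, the VAE checkpoint name, and the helper name are just examples I picked for illustration.

```python
import torch
from diffusers import AutoencoderKL

# Precompute VAE latents once so the training loop never touches the VAE.
# The checkpoint name and this helper are placeholders, not Mosaic's actual code.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to("cuda").eval()

@torch.no_grad()
def encode_batch(images: torch.Tensor) -> torch.Tensor:
    # images: (B, 3, H, W), already normalized to [-1, 1]
    latents = vae.encode(images.to("cuda")).latent_dist.sample()
    return (latents * 0.18215).cpu()  # SD latent scaling factor; cache these to disk
```

For the low-precision norms, I assume the idea is to keep LayerNorm/GroupNorm in the autocast dtype instead of letting autocast upcast them to fp32, roughly:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowPrecisionLayerNorm(nn.LayerNorm):
    # Run LayerNorm in the incoming activation dtype (bf16/fp16 under autocast)
    # rather than the fp32 upcast that torch.autocast applies by default.
    def forward(self, x):
        with torch.autocast(device_type=x.device.type, enabled=False):
            w = None if self.weight is None else self.weight.to(x.dtype)
            b = None if self.bias is None else self.bias.to(x.dtype)
            return F.layer_norm(x, self.normalized_shape, w, b, self.eps)

class LowPrecisionGroupNorm(nn.GroupNorm):
    # Same idea for GroupNorm, which the EDM-style U-Net in Patch-Diffusion relies on.
    def forward(self, x):
        with torch.autocast(device_type=x.device.type, enabled=False):
            w = None if self.weight is None else self.weight.to(x.dtype)
            b = None if self.bias is None else self.bias.to(x.dtype)
            return F.group_norm(x, self.num_groups, w, b, self.eps)
```

If that matches what Composer's LowPrecisionLayerNorm/LowPrecisionGroupNorm algorithms do via module surgery, then pointers to where that surgery and the latent caching happen in this repo would already be enough for me to start porting.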