@snwtz commented May 7, 2025

  • New modules: a dataloader and a dataloader test (run locally, not yet optimized for GPU)
  • Multispectral data loading and preprocessing
  • TODO: explicit caching and order validation
  • Adapted the main training script to load multispectral (MS) data
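
For reviewers, here is a minimal sketch of the kind of preprocessing step this involves, assuming 16-bit samples are scaled linearly to [-1, 1]. The target range and the names `scale_pixel`, `UINT16_MAX` and `EXPECTED_BANDS` are illustrative assumptions and are not taken from the dataloader in this PR.

```python
UINT16_MAX = 65535  # largest value a 16-bit unsigned sample can hold
EXPECTED_BANDS = 5  # channel count named in this PR


def scale_pixel(bands):
    """Scale one pixel's 16-bit band values to floats in [-1.0, 1.0].

    The [-1, 1] target range is an assumption, not taken from the PR.
    """
    if len(bands) != EXPECTED_BANDS:
        raise ValueError(f"expected {EXPECTED_BANDS} bands, got {len(bands)}")
    scaled = []
    for value in bands:
        if not 0 <= value <= UINT16_MAX:
            raise ValueError(f"band value {value} is outside the 16-bit range")
        # Map 0 -> -1.0 and 65535 -> 1.0 linearly.
        scaled.append(value / UINT16_MAX * 2.0 - 1.0)
    return scaled


print(scale_pixel([0, 65535, 0, 65535, 0]))
```

Checking the band count up front is a cheap guard against files with missing bands. It does not cover the band order validation listed as a TODO above.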

Zina Gleaves added 3 commits May 1, 2025 15:46
This commit adds support for 5-channel multispectral data in the VAE architecture
while maintaining compatibility with Stable Diffusion 3's latent space requirements.

Key changes:
- Add AutoencoderKLMultispectral5Ch implementation with 5 input/output channels
- Implement 8x downsampling to match SD3's latent space requirements
- Add comprehensive test suite for multispectral VAE functionality
- Add training script for 5-channel multispectral data
- Update documentation with detailed implementation notes

Technical details:
- Uses 4 downsampling blocks for 8x downsampling
- Maintains 4-channel latent space for SD3 compatibility
- Implements group normalization (32 groups) for stable training
- Preserves spectral information through careful normalization
- Handles 16-bit multispectral data with proper scaling
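
The downsampling arithmetic above can be sketched as follows. This assumes the usual diffusers `AutoencoderKL` convention in which every encoder block except the last halves the resolution, so 4 blocks give 2^3 = 8x. The helper `latent_shape` is hypothetical and not part of this commit.

```python
def latent_shape(height, width, num_blocks=4, latent_channels=4):
    """Return the (channels, height, width) of the latent for one image.

    Assumes every block except the final one downsamples by 2x.
    """
    num_downsamples = num_blocks - 1
    factor = 2 ** num_downsamples  # 8 for the default of 4 blocks
    if height % factor or width % factor:
        raise ValueError(f"height and width must be divisible by {factor}")
    return (latent_channels, height // factor, width // factor)


# A 5-channel 512x512 input maps to a 4-channel 64x64 latent.
print(latent_shape(512, 512))
```

The divisibility check reflects why input tiles are typically cropped or padded to a multiple of 8 before encoding.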

Files changed:
- src/diffusers/models/autoencoders/autoencoder_kl_multispectral_5ch.py
- tests/models/autoencoders/test_models_autoencoder_kl_multispectral_5ch.py
- examples/multispectral/train_multispectral_vae_5ch.py