An approach to maze solving using conditional denoising diffusion models. This project demonstrates how modern generative AI can be applied to sequential decision-making tasks by learning to "imagine" paths through mazes frame by frame.
Instead of running a classical pathfinding algorithm at inference time, the system generates a sequence of frames showing a path through the maze by gradually denoising random noise into valid frames.
- Frame-by-frame maze path generation using diffusion models
- Conditional generation based on initial maze context
- Smooth visualization of solution paths
The system employs a conditional denoising diffusion model that:
- Takes two initial maze frames as context
- Generates subsequent frames through iterative denoising
- Produces visually coherent and valid maze solutions
The architecture uses a conditioned UNet that considers:
- Previous maze frames as context
- Learned valid movement patterns
- Maze structure constraints
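One common way to feed previous frames into a conditional UNet is to concatenate them with the noisy frame along the channel axis, so the denoiser always sees where the path has been. The repository's exact conditioning scheme may differ; this is a minimal NumPy sketch of that idea (`make_model_input` is a hypothetical helper, not a function from the codebase):

```python
import numpy as np

def make_model_input(noisy_frame, context_frames):
    """Stack the frame being denoised with the two context frames
    along the channel axis before passing them to the denoiser.

    noisy_frame:    (C, H, W) frame currently being denoised
    context_frames: list of two (C, H, W) previous maze frames
    """
    return np.concatenate([noisy_frame] + list(context_frames), axis=0)

# Toy example: 1-channel 8x8 maze frames.
noisy = np.random.randn(1, 8, 8)
ctx = [np.zeros((1, 8, 8)), np.ones((1, 8, 8))]
x = make_model_input(noisy, ctx)
print(x.shape)  # (3, 8, 8): noisy frame plus two context frames as channels
```

Channel concatenation keeps the UNet architecture unchanged apart from the number of input channels, which is why it is a popular conditioning choice for frame-prediction diffusion models.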
The model is trained on sequences of maze frames showing valid paths:
- Generated with the A* algorithm for optimal pathfinding
- Clear visualization of paths and walls
- Consistent entrance and exit points
- Highlighted solution paths
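The A* generator mentioned above can be sketched for a 4-connected grid maze (0 = free cell, 1 = wall), using Manhattan distance as the heuristic. The actual data pipeline in `data/generate_data.py` also renders frames; this sketch covers only the path search:

```python
import heapq

def astar(grid, start, goal):
    """Shortest path on a 4-connected grid; returns a list of cells or None."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, None)]  # (f-score, g-score, cell, parent)
    came_from, g_best = {}, {start: 0}
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in came_from:          # already expanded with a better g-score
            continue
        came_from[cur] = parent
        if cur == goal:               # walk parents back to the start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_best.get((nr, nc), float("inf")):
                    g_best[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), cur))
    return None  # no path exists

maze = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(maze, (0, 0), (2, 0)))
# → [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```

Each cell along the returned path becomes the highlighted position in one training frame, yielding the frame sequences the diffusion model learns from.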
The trained model can:
- Generate valid maze solutions from a given starting point
- Produce smooth, visually coherent path sequences
- Bridge traditional pathfinding with modern generative AI
While it does not always find the shortest path, the model consistently generates valid solutions, demonstrating the potential of applying diffusion models to structured decision-making tasks.
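At inference time, each frame is produced by iteratively denoising Gaussian noise. The loop below is a DDPM-style sketch with a placeholder denoiser standing in for the trained conditional UNet; the noise schedule, step count, and `sample_frame`/`denoise_fn` names are illustrative, not the repository's exact sampler:

```python
import numpy as np

def sample_frame(denoise_fn, context, shape, num_steps=50, seed=0):
    """Generate one maze frame by iteratively denoising Gaussian noise,
    conditioning the denoiser on the previous frames (DDPM-style sketch)."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)   # linear noise schedule
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)               # start from pure noise
    for t in reversed(range(num_steps)):
        eps = denoise_fn(x, context, t)          # predicted noise at step t
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                                # no noise added on the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Placeholder "model": predicts zero noise (no learned behaviour).
dummy = lambda x, ctx, t: np.zeros_like(x)
frame = sample_frame(dummy, context=None, shape=(1, 8, 8))
print(frame.shape)  # (1, 8, 8)
```

Running this loop once per frame, with the two most recent frames as `context`, yields the frame-by-frame rollout described above.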
To train the model, first generate the datasets:
```shell
python data/generate_data.py --maze_size maze_size --num_train_mazes X --num_test_mazes Y --images_directory path --train_filename path --test_filename path --solved --num_steps Z
```
Then run the training script (supports multi-GPU distributed training):

```shell
torchrun --nproc_per_node X train.py
```
```shell
pip install -r requirements.txt
```
MIT