This repository contains the code and experiments accompanying the master’s thesis:
Jakub Dvorak, Reconstructing Cryo-ET Tomograms with a 2D Diffusion Prior, Technical University of Munich, 2025.
It implements the methods described in the thesis: training 2D diffusion priors on cryo-ET slices and using them for tomographic reconstruction.
The environment can be built in two ways, depending on your compute setup: as a VS Code DevContainer, or as an Enroot image for HPC clusters running Slurm.
If you encounter errors due to updated library versions, a 'requirements.txt' file is provided with the exact package versions used during testing.
The repository includes a .devcontainer/ folder with a Dockerfile and devcontainer.json for reproducible GPU-enabled development inside VS Code.
Steps:
- Clone the repository
- Open in VS Code
- Select “Reopen in Container”
The container installs PyTorch, CUDA 11.8, tomosipo, and other required dependencies. You may need to adjust the "mounts" section in .devcontainer/devcontainer.json to match your file system.
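As a rough illustration of the mount adjustment, the sketch below generates a "mounts" entry in the `source=...,target=...,type=bind` string form that devcontainer.json accepts. The host and container paths are hypothetical placeholders, not values taken from this repository.

```python
import json


def bind_mount(source: str, target: str) -> str:
    """Format one bind mount in the string form accepted by devcontainer.json."""
    return f"source={source},target={target},type=bind"


# Hypothetical paths: replace them with the host directory that holds your data
# and the location where the code should see it inside the container.
mounts = [bind_mount("/home/me/cryoet-data", "/data")]

# Print a fragment that can be merged by hand into .devcontainer/devcontainer.json.
print(json.dumps({"mounts": mounts}, indent=2))
```

After changing the mounts, rebuild the container so that the new paths become visible inside it.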
For HPC clusters supporting Enroot and Slurm, use the provided build script: build_image.sh.
This script will:
- Import a base PyTorch CUDA image
- Run root and user setup scripts from setup/
- Export the final container as a .sqsh image
- Clean up intermediate containers
Steps:
- Adjust BASE_STORAGE_PATH in build_image.sh to point to your storage directory
- Submit build_image.sh as a job via Slurm
- After completion, the container will be available at:
$BASE_STORAGE_PATH/image.sqsh
You can then run jobs inside this image on your cluster.
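How a job is started inside the image depends on the cluster. The sketch below assumes that the Pyxis plugin for Slurm is installed, which adds the `--container-image` and `--container-mounts` options to `srun`; if your cluster integrates Enroot differently, the flags will differ. The image path stands in for $BASE_STORAGE_PATH/image.sqsh, and the mount paths are hypothetical.

```python
import shlex


def srun_in_image(image, command, mounts=()):
    """Build an srun argument list that runs `command` inside a .sqsh image.

    Assumes Pyxis, which provides --container-image and --container-mounts.
    `mounts` is a sequence of (host_path, container_path) pairs.
    """
    args = ["srun", f"--container-image={image}"]
    if mounts:
        pairs = ",".join(f"{src}:{dst}" for src, dst in mounts)
        args.append(f"--container-mounts={pairs}")
    return args + list(command)


# Placeholder for $BASE_STORAGE_PATH/image.sqsh; the mount is hypothetical.
args = srun_in_image(
    "/path/to/storage/image.sqsh",
    ["python", "train_diffusion.py",
     "+exps/diffusion=synthetic_volume_patch_prior",
     "training.pl_trainer.devices=[0]"],
    mounts=[("/path/to/data", "/data")],
)

# Print the command for inspection instead of running it.
print(shlex.join(args))
```

Resource options such as GPUs, time limit, or partition are left out on purpose, since they are specific to each cluster.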
├── .devcontainer/ # VS Code DevContainer setup (Dockerfile, config)
├── setup/ # HPC setup scripts (root_setup.sh, user_setup.sh, build_image.sh)
├── src/ # Core source code for diffusion priors and reconstruction
├── configs/ # Example Hydra/YAML configs for training and reconstruction
├── train_diffusion.py # Entry point for training diffusion priors
├── reconstruct.py # Entry point for reconstruction
└── README.md # This file
The datasets used in the thesis are publicly available:
- SHREC 2021 (synthetic data): https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/XRTJMA
- EMPIAR-10045 (experimental data): https://www.ebi.ac.uk/empiar/EMPIAR-10045/
- EMPIAR-11830 (experimental training data): https://www.ebi.ac.uk/empiar/EMPIAR-11830/
The code is configured with Hydra and logs all runs to Weights & Biases (WandB). To use it, you will need to:
- Download the datasets and update the paths in the configs
- Adjust your WandB configuration to connect to your project
- first experiment (prior comparison)
# Ground truth (GT)
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior training.pl_trainer.devices=[0]
# Note: configs below only override source_dir and processed_dir
# (source_dir = HDD path with volumes,
# processed_dir = SSD path where volumes are copied and reformatted before training)
# GT + Var 0.1
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_var0_1 training.pl_trainer.devices=[0]
# GT + Var 0.5
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_var0_5 training.pl_trainer.devices=[0]
# GT + Var 1
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_var1 training.pl_trainer.devices=[0]
# FBP 28°
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_fbp_28 training.pl_trainer.devices=[0]
# FBP 60°
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_fbp training.pl_trainer.devices=[0]
# FBP 92°
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_fbp_92 training.pl_trainer.devices=[0]
# ---
# Projection prior
python train_diffusion.py +exps/diffusion=synth_cropped_proj training.pl_trainer.devices=[0]
- second experiment (synthetic benchmark)
python train_diffusion.py +exps/diffusion=synthetic_volume_patch_prior_bin training.pl_trainer.devices=[0]
- third experiment (experimental setup)
# CryoLithe data
python train_diffusion.py +exps/diffusion=cryocare_prior training.pl_trainer.devices=[0]
# CryoLithe data + 100 extra volumes
python train_diffusion.py +exps/diffusion=cryocare_prior_plus training.pl_trainer.devices=[0]
Reconstruction can be started in multiple ways:
- via experiment configs
- via Hydra sweeps
- via Slurm sweep wrappers (see the sweeps/ folder)
Note: the final configs used for the results in the master's thesis are the Hydra sweep configs, which occasionally override values from the experiment configs without sweeping over them.
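To locate the file behind a given override, it helps to know that Hydra resolves `+group=option` to a YAML file named after the option inside the directory named after the group. The sketch below applies that convention, assuming that configs/ is the config root and that the files use the .yaml extension; both are assumptions about this repository's layout, so check them against the actual folder.

```python
from pathlib import PurePosixPath


def config_file_for(override: str, config_root: str = "configs") -> PurePosixPath:
    """Map a Hydra group override such as '+exps/recon/ma=empiar_10045' to a YAML path.

    Only meaningful for group=option overrides, not for value overrides
    such as 'training.pl_trainer.devices=[0]'.
    """
    group, sep, option = override.lstrip("+").partition("=")
    if not sep or not group or not option:
        raise ValueError(f"not a group=option override: {override!r}")
    return PurePosixPath(config_root) / group / f"{option}.yaml"


print(config_file_for("+exps/recon/ma=empiar_10045"))
print(config_file_for("+sweeps/ma/empiar_10045=plus"))
```

Under these assumptions, the second call points at configs/sweeps/ma/empiar_10045/plus.yaml, which is where the sweep parameters for that run would be defined.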
Example (last experiment with experimental data):
python reconstruct.py +exps/recon/ma=empiar_10045
python reconstruct.py +sweeps/ma/empiar_10045=plus
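All commands in this README share one shape: an entry point, a `+group=option` experiment selection, and optional `key=value` overrides. When launching many runs, it can be convenient to generate them from a list. The following is a minimal sketch of that idea, not a script shipped with the repository; the helper name `hydra_command` is made up for illustration.

```python
import shlex


def hydra_command(script, experiment, overrides=()):
    """Return the argument list for one run.

    `experiment` is a group=option pair without the leading '+',
    `overrides` are plain key=value strings appended after it.
    """
    return ["python", script, f"+{experiment}", *overrides]


# Config names of the first experiment (prior comparison), as listed above.
first_experiment = [
    "synthetic_volume_patch_prior",
    "synthetic_volume_patch_prior_var0_1",
    "synthetic_volume_patch_prior_var0_5",
    "synthetic_volume_patch_prior_var1",
    "synthetic_volume_patch_prior_fbp_28",
    "synthetic_volume_patch_prior_fbp",
    "synthetic_volume_patch_prior_fbp_92",
    "synth_cropped_proj",
]

for name in first_experiment:
    args = hydra_command(
        "train_diffusion.py",
        f"exps/diffusion={name}",
        ["training.pl_trainer.devices=[0]"],
    )
    # Print only; pass `args` to subprocess.run(args, check=True) to execute.
    print(shlex.join(args))
```

`shlex.join` puts quotes around `training.pl_trainer.devices=[0]`. This is harmless in bash and avoids the "no matches found" error that zsh raises when it treats the unquoted brackets as a glob pattern.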