DMCB-GIST/ST-ConMa

ST-ConMa

Overview

💽 Datasets for downstream tasks


Encoders for ST-ConMa

Image Encoder

Gene Encoder


💽 Pre-processed training datasets, checkpoint and result files


🗂️ Project Structure

ST_ConMa_git/
├── train_multitask.py              # Main multi-task training script
├── run_train_multitask.sh          # Distributed training launcher
│
├── PathoDuet/
│   └── checkpoint/
│       └── checkpoint_HE.pth
│
├── utils/                          # Core training utilities
│   ├── multimodal.py               # ST_AlignmentModel & Trainer
│   ├── loss.py                     
│   ├── dataset_load.py           
│   ├── augmentations.py           
│   ├── model.py                   
│   ├── optimizer.py                
│   ├── scheduler.py               
│   ├── gather.py                   
│   └── pt_load_inference.py        
│
├── pt_dataset/                     # Pre-training datasets
│   ├── st_images/                  # ST tissue images
│   └── st_sentences/               # Gene sentences
│
├── ft_dataset/                     # Fine-tuning datasets
│   ├── gep_pred/                   # Gene expression prediction
│   ├── spatial_clustering/         # Spatial clustering (DLPFC)
│   └── linear_probing/             # Linear probing benchmarks
│       ├── crc
│       ├── mhist
│       └── luad
│
├── evaluations/                    # Evaluation scripts
│   ├── gep_pred/                   # Gene expression prediction
│   │   ├── eval_st_conma_zeroshot.py
│   │   ├── train_st_conma.py
│   │   └── run_st_conma_*.sh
│   ├── linear_probing/             # Linear probing tasks
│   │   ├── cam17.py
│   │   ├── crcnorm.py
│   │   ├── luad.py
│   │   └── mhist.py
│   ├── spatial_clustering/         # Spatial domain identification
│   ├── st_linear_probing/          # ST-specific linear probing
│   └── cluster_plot/               # Visualization utilities
│
├── checkpoints/                    # Saved model weights
└── results/                        # Evaluation results
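
Before launching training, it can help to confirm that a local checkout matches the layout above. The following is a minimal sketch and not part of the repository: EXPECTED_PATHS and find_missing_paths are hypothetical names, and the listed paths are an illustrative subset copied from the tree above.

```python
from pathlib import Path

# Paths copied from the project tree above; this list is an illustrative
# subset, not an authoritative manifest of the repository.
EXPECTED_PATHS = [
    "train_multitask.py",
    "run_train_multitask.sh",
    "PathoDuet/checkpoint/checkpoint_HE.pth",
    "utils",
    "pt_dataset/st_images",
    "pt_dataset/st_sentences",
    "ft_dataset/gep_pred",
    "ft_dataset/spatial_clustering",
    "ft_dataset/linear_probing",
    "evaluations",
]


def find_missing_paths(root, expected=EXPECTED_PATHS):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root (ST_ConMa_git/ in the tree above).
    missing = find_missing_paths(".")
    if missing:
        print("Missing paths:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected paths are present.")
```

Any path reported as missing usually points to a dataset or checkpoint that still has to be downloaded and placed in the corresponding directory.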

🌏 Environment Setup

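Two conda environments are required: st_conma for pre-training and most evaluations, and st_conma_clustering for spatial clustering with the STAIG module.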

conda env create -f st_conma_env.yml
conda activate st_conma
pip install --no-deps --extra-index-url https://download.pytorch.org/whl/cu124 -r st_conma_pip.txt

# For spatial clustering with STAIG module
conda env create -f st_conma_clustering_env.yml
conda activate st_conma_clustering
pip install --no-deps -f https://data.pyg.org/whl/torch-2.2.0+cu121.html -r st_conma_clustering_pip.txt

🚀 Dataset Preparation

1. Download pre-training and downstream datasets

The pre-training datasets, downstream datasets, and result files are available on Google Drive. Access requests will be approved as quickly as possible.

The histopathology benchmark datasets can be downloaded from the links listed earlier in this README.


Pre-training

conda activate st_conma
bash run_train_multitask.sh

Evaluation

Note: Adjust the CUDA device in each script as needed.
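
One common way to choose a GPU is the CUDA_VISIBLE_DEVICES environment variable, which the CUDA runtime reads when it initializes. The sketch below shows that approach under stated assumptions; it is not the repository's own mechanism, and select_cuda_device is a hypothetical helper.

```python
import os


def select_cuda_device(device_ids):
    """Restrict CUDA to the given physical GPU ids (hypothetical helper).

    Must be called before a CUDA-using library (for example PyTorch)
    initializes CUDA; after that point the variable is no longer consulted.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


if __name__ == "__main__":
    # Expose only physical GPU 1; inside this process it appears as device 0.
    select_cuda_device([1])
    print(os.environ["CUDA_VISIBLE_DEVICES"])
```

If a script hard-codes a device index, that index refers to the position within the visible list, so the value in the script may still need to be changed.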

K-means Clustering on Histopathology Benchmarks

conda activate st_conma
python ./evaluations/cluster_plot/cluster_plot_st_conma.py

Linear Probing on Histopathology Benchmarks

conda activate st_conma
python ./evaluations/linear_probing/cam17.py
python ./evaluations/linear_probing/crcnorm.py
python ./evaluations/linear_probing/luad.py
python ./evaluations/linear_probing/mhist.py

Gene Expression Prediction

Evaluated on HER2ST, cSCC, and HLT datasets.

conda activate st_conma
bash ./evaluations/gep_pred/run_st_conma_igc_igm.sh
bash ./evaluations/gep_pred/run_st_conma_zeroshot.sh

For visualization, run all cells in:

  • ./evaluations/gep_pred/CD24_viz.ipynb
  • ./evaluations/gep_pred/LGALS1_viz.ipynb

Linear Probing on DLPFC Dataset

conda activate st_conma
python ./evaluations/st_linear_probing/train_st_conma_ie_dlpfc.py
python ./evaluations/st_linear_probing/train_st_conma_ge_dlpfc.py
python ./evaluations/st_linear_probing/train_st_conma_fe_dlpfc.py

Spatial Clustering

Evaluated on DLPFC and Human Breast Cancer datasets.

# Fine-tuning
conda activate st_conma
bash ./evaluations/spatial_clustering/run_finetune_dlpfc.sh
bash ./evaluations/spatial_clustering/run_finetune_hbc.sh

# Extract fusion embeddings
bash ./evaluations/spatial_clustering/run_get_fusion_embeddings_dlpfc.sh
bash ./evaluations/spatial_clustering/run_get_fusion_embeddings_hbc.sh

# Clustering with STAIG module
conda activate st_conma_clustering
python ./evaluations/spatial_clustering/train_st_conma.py \
    --dataset dlpfc --all \
    --output_dir ./results/spatial_clustering/st_conma/dlpfc

python ./evaluations/spatial_clustering/train_st_conma.py \
    --dataset human_breast_cancer --all \
    --output_dir ./results/spatial_clustering/st_conma/human_breast_cancer
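
The three stages above run in a fixed order and switch environments before the last one. As a rough sketch of how they could be chained from a single driver, the code below builds the same commands and wraps each in conda run -n so that no manual activation is needed. The driver itself is an assumption and not part of the repository; build_pipeline, run_pipeline, and DATASETS are hypothetical names, while the script paths and flags are copied from the commands above.

```python
import subprocess

SCRIPT_DIR = "./evaluations/spatial_clustering"

# Maps the value passed to --dataset to the suffix used in the shell script names.
DATASETS = {"dlpfc": "dlpfc", "human_breast_cancer": "hbc"}


def build_pipeline(dataset):
    """Return the fine-tune, embedding, and clustering commands in order."""
    suffix = DATASETS[dataset]
    return [
        # 1. Fine-tuning (st_conma environment)
        ["conda", "run", "-n", "st_conma",
         "bash", f"{SCRIPT_DIR}/run_finetune_{suffix}.sh"],
        # 2. Fusion embedding extraction (st_conma environment)
        ["conda", "run", "-n", "st_conma",
         "bash", f"{SCRIPT_DIR}/run_get_fusion_embeddings_{suffix}.sh"],
        # 3. Clustering with the STAIG module (st_conma_clustering environment)
        ["conda", "run", "-n", "st_conma_clustering",
         "python", f"{SCRIPT_DIR}/train_st_conma.py",
         "--dataset", dataset, "--all",
         "--output_dir", f"./results/spatial_clustering/st_conma/{dataset}"],
    ]


def run_pipeline(dataset, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_pipeline(dataset):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline("dlpfc", dry_run=True)
```

The dry run only prints the commands, which makes it easy to compare them against the manual steps before executing anything.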

Human Breast Cancer Analysis

Activate the st_conma environment (conda activate st_conma), then run all cells in ./evaluations/spatial_clustering/spatial_clustering.ipynb.
