CANVAS

Official repository for Characterization of tumor heterogeneity through segmentation-free representation learning on multiplexed imaging data

The following instructions describe how to run the CANVAS analysis framework on multiplexed images.

A Google Colab walkthrough is available. Note that it is intended only for debugging purposes, due to the limited runtime and storage on Colab; we recommend conducting training and analysis on a local machine with GPUs.

Overview:

The CANVAS pipeline consists of four steps:

  1. Preprocess Images: convert raw IMC/TIFF or other images into Zarr format and specify the channels of interest.
  2. Train CANVAS: train the CANVAS model on the Zarr data and save the trained model.
  3. Extract features with the trained model: extract features using the trained model and save them.
  4. Run downstream analysis: run the downstream analysis on the extracted features.
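
For reference, the commands for these steps, described in detail in the sections below, are:

python run_preprocess.py --config_root <config_root_path> --data_root <data_root_path>
python run_training.py --config_root <config_root_path> --data_root <data_root_path> --epoch <epochs>
python run_inference.py --config_root <config_root_path> --data_root <data_root_path> --ckpt_num <ckpt_num>

Step 4 is carried out by editing and running canvas/analysis/main.py, as described below.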

Step 1: Preprocess Images

Follow the instructions located at canvas/preprocess.

Provide the root path of the config files.

Organize the data using the following directory structure; each image file (.ext) should have a corresponding channel text file (.txt).

You also need a common channel text file that limits the channels to only those used in the analysis. This common_channels.txt file should be placed under the <data_root>/raw_data directory. An illustrative example of the channel file format is given after the directory layout below.

dataset_root
    - config_root
        - config.yaml # Config file for the dataset
        - preprocess
            - selected_channels_w_color.yaml # List of channels and colors to be used for visualization
            - channels_vis_strength.yaml # Specify the color strength for each channel
    - data_root
        - raw_data
            - common_channels.txt # List of all channels in the dataset
            - image_files
                - image1.ext
                - image1.txt # Channel file for each image file, each line is a channel name
                - image2.ext
                ...
        - processed_data
        - model_ckpt
        - analysis
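
As an illustration, common_channels.txt and the per-image channel files are plain text files with one channel name per line. The marker names below are hypothetical placeholders, not the actual panel:

    DNA1
    CD3
    CD8
    PanCK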

An example of a structured dataset is available on Zenodo: https://zenodo.org/records/14226759

Run the preprocessing script: python run_preprocess.py --config_root <config_root_path> --data_root <data_root_path>

Example:

python run_preprocess.py --config_root /home/epoch/Documents/Jimin/CANVAS_v2/config_files --data_root /home/epoch/Documents/Jimin/CANVAS_v2_data

After preprocessing, the overall directory structure should look like the following:

data_root
    - raw_data
        - common_channels.txt # List of all channels in the dataset
        - image_files
            - image1.ext
            - image1.txt # Channel file for each image file, each line is a channel name
            - image2.ext
            ...
        - dummy_input
            - image1_acquisition_0.dummy_ext # The dummy file is used as a reference for the Zarr samples
            - image2_acquisition_0.dummy_ext
            ...
    - processed_data
        - data
        - qc
    - model_ckpt
        - ckpts
        - log_dir

Things to check:

  1. Sample image visualization: data_root/processed_data/data/<image_name>/visualization/sample.png.
  2. Normalized image intensity distribution: check that the values fall within a reasonable range (roughly -5 to 30 is normal); see the sketch below.
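
A minimal sketch for the second check, assuming the processed image is stored as a Zarr array under data_root/processed_data/data/<image_name> (the exact file name and layout may differ in your installation; the path below is a placeholder):

    # Sketch: inspect the normalized intensity distribution of one processed image.
    # The Zarr path is a placeholder; point it at the processed image of interest.
    import numpy as np
    import zarr

    arr = np.asarray(zarr.open('data_root/processed_data/data/image1/data.zarr', mode='r'))
    print('min:', arr.min(), 'max:', arr.max())
    print('1st/99th percentiles:', np.percentile(arr, [1, 99]))
    # Values roughly between -5 and 30 are expected per the check above.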

Step 2: Train CANVAS

A GPU is required for training (Step 2) and inference (Step 3).

Note: if this does not work the first time, try running it again; the first run generates the common channels.

Run the training script: python run_training.py --config_root <config_root_path> --data_root <data_root_path> --epoch <epochs>
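
Example (mirroring the preprocessing example above; the paths and epoch count are illustrative placeholders):

python run_training.py --config_root /home/epoch/Documents/Jimin/CANVAS_v2/config_files --data_root /home/epoch/Documents/Jimin/CANVAS_v2_data --epoch 100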

Step 3: Inference and feature extraction using trained model

Adjust hyperparameters and data directories in canvas/inference/infer.py and run infer.py.

Run python run_inference.py --config_root <config_root_path> --data_root <data_root_path> --ckpt_num <ckpt_num> to extract features using a specific checkpoint.

This script will also run UMAP and KMeans clustering to generate initial clusters.
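
A minimal sketch for sanity-checking the extracted features before downstream analysis, assuming they are saved as a NumPy array (the file name, location, and number of clusters are placeholder assumptions; adapt them to what run_inference.py actually writes):

    # Sketch: load the extracted tile features and re-run a quick KMeans clustering.
    # The feature file path and the cluster count are illustrative assumptions.
    import numpy as np
    from sklearn.cluster import KMeans

    features = np.load('data_root/analysis/features.npy')  # placeholder path
    print('feature matrix shape:', features.shape)         # expected: (n_tiles, embedding_dim)

    labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(features)
    print('cluster sizes:', np.bincount(labels))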

Step 4: Run downstream analysis

The analysis Python script is located at canvas/analysis/main.py. Individual functions can be selected to run only the desired analyses.
