cosmoimd/temporal_segmentation

Latest commit: 4f046a7 · Apr 24, 2025

Temporal Segmentation of Full-Procedure Colonoscopy Videos

Overview

This repository accompanies the paper "A Temporal Convolutional Network-Based Approach and a Benchmark Dataset for Colonoscopy Video Temporal Segmentation" [1]. It provides the implementation of ColonTCN, a Temporal Convolutional Network-based approach for segmenting colonoscopy videos into anatomical sections and procedural phases. The project builds on a benchmark dataset derived from the annotated REAL-Colon (RC) dataset, which comprises 2.7 million frames across 60 full-procedure videos, and proposes two k-fold validation splits and a set of metrics for evaluating model performance.
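To make the k-fold setup concrete, here is a minimal sketch of splitting 60 videos into 4 or 5 folds at the video level, so that every video is used for testing exactly once. It is an illustration only: the video IDs and the round-robin assignment are assumptions, and the actual benchmark splits are the ones shipped in data/dataset/RC_lists/, which may be built differently.

```python
from typing import List, Tuple


def make_video_folds(video_ids: List[str], k: int) -> List[Tuple[List[str], List[str]]]:
    """Split video IDs into k (train, test) pairs; each video is tested once."""
    if not 2 <= k <= len(video_ids):
        raise ValueError("k must be between 2 and the number of videos")
    # Round-robin assignment keeps fold sizes within one video of each other.
    buckets = [video_ids[i::k] for i in range(k)]
    folds = []
    for i, test in enumerate(buckets):
        # Training set = every bucket except the one held out for testing.
        train = [v for j, bucket in enumerate(buckets) if j != i for v in bucket]
        folds.append((train, test))
    return folds


# Hypothetical IDs standing in for the 60 RC videos.
videos = [f"video_{n:03d}" for n in range(1, 61)]
for k in (4, 5):
    folds = make_video_folds(videos, k)
    print(k, [len(test) for _, test in folds])
# prints: 4 [15, 15, 15, 15]
# prints: 5 [12, 12, 12, 12, 12]
```

Splitting by video rather than by frame keeps all frames of one procedure on the same side of the train/test boundary.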

Detailed Temporal Segmentation Visualization

Getting Started

Clone the repository and set up a virtual environment:

git clone https://github.com/cosmoimd/temporal_segmentation.git
cd temporal_segmentation
python -m venv venv
source venv/bin/activate  # On macOS/Linux
venv\Scripts\activate  # On Windows

Install the necessary dependencies from the requirements.txt file:

pip install -r requirements.txt

REAL-Colon Temporal Segmentation Benchmark

The benchmark dataset used in this project is the REAL-Colon (RC) dataset [2]. Click here for instructions on automatically downloading, extracting, and preparing data splits for benchmarking temporal segmentation models.

ColonTCN

The pretrained ColonTCN models obtained in [1] are available at the following link for both the 4-fold and 5-fold scenarios:

🔗 Google Drive – ColonTCN Checkpoints

To use them, download the entire folder and place the contents into: experiments/model/. Then, run:

CUDA_VISIBLE_DEVICES=0 python3 src/test_shared_model.py -parFile ymls/inference/test_shared_4fold_colontcn.yml
CUDA_VISIBLE_DEVICES=0 python3 src/test_shared_model.py -parFile ymls/inference/test_shared_5fold_colontcn.yml
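The two commands above can also be launched from Python, which makes it easy to check that the checkpoints are in place first. The following is a minimal sketch, not part of the repository: the helper names `build_inference_job` and `run_all` are hypothetical, and the default checkpoint directory is simply the path stated above.

```python
import os
import subprocess
from pathlib import Path
from typing import Dict, List, Tuple

# Config paths copied from the commands above.
CONFIGS = [
    "ymls/inference/test_shared_4fold_colontcn.yml",
    "ymls/inference/test_shared_5fold_colontcn.yml",
]


def build_inference_job(config: str, gpu: str = "0") -> Tuple[List[str], Dict[str, str]]:
    """Return the command and environment equivalent to one shell line above."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    cmd = ["python3", "src/test_shared_model.py", "-parFile", config]
    return cmd, env


def run_all(model_dir: str = "experiments/model", dry_run: bool = True) -> List[List[str]]:
    """List each job; when dry_run is False, verify checkpoints and run them."""
    path = Path(model_dir)
    if not dry_run and not (path.is_dir() and any(path.iterdir())):
        raise FileNotFoundError(f"No checkpoints found in {model_dir}")
    commands = []
    for config in CONFIGS:
        cmd, env = build_inference_job(config)
        commands.append(cmd)
        if not dry_run:
            # check=True stops at the first failing run.
            subprocess.run(cmd, env=env, check=True)
    return commands


if __name__ == "__main__":
    for cmd in run_all(dry_run=True):
        print(" ".join(cmd))
```

With `dry_run=True` nothing is executed; the function only returns the commands it would run.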

Model Training

Models are trained on RC in a 4-fold or 5-fold setting. Run the following command once per fold, passing that fold's configuration file:

CUDA_VISIBLE_DEVICES=0 python src/training.py -parFile ymls/training/colontcn_4fold/training_colontcn_4fold_fold1.yml

All configuration files for training a ColonTCN model in the 4-fold or 5-fold setting are located in:

ymls/training/colontcn_4fold/
ymls/training/colontcn_5fold/
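Since each fold has its own configuration file, the per-fold training commands can be generated rather than typed by hand. The sketch below assumes that every file follows the naming pattern of the one example shown above (`training_colontcn_4fold_fold1.yml`) and that the 5-fold directory uses the same pattern; only the 4-fold fold-1 file name is confirmed by this README, so check the two directories before relying on it. The function name `fold_configs` is hypothetical.

```python
from typing import List


def fold_configs(n_folds: int, root: str = "ymls/training") -> List[str]:
    """List one training config path per fold for the given k-fold setting."""
    if n_folds not in (4, 5):
        raise ValueError("only the 4-fold and 5-fold settings are described")
    setting = f"colontcn_{n_folds}fold"
    # Assumed pattern: <root>/<setting>/training_<setting>_fold<i>.yml
    return [
        f"{root}/{setting}/training_{setting}_fold{i}.yml"
        for i in range(1, n_folds + 1)
    ]


# Print the shell command for every fold of the 4-fold setting.
for path in fold_configs(4):
    print(f"CUDA_VISIBLE_DEVICES=0 python src/training.py -parFile {path}")
```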

Automated Model Evaluation on the RC Benchmark

To evaluate models trained with src/training.py on RC in the 4-fold or 5-fold setting, run the following command with the configuration file for the corresponding setting:

CUDA_VISIBLE_DEVICES=0 python3 src/inference_testing_on_folds.py -parFile ymls/inference/inference_testing_4fold_colontcn.yml
CUDA_VISIBLE_DEVICES=0 python3 src/inference_testing_on_folds.py -parFile ymls/inference/inference_testing_5fold_colontcn.yml

Model Profiling

To profile a model's computational efficiency, such as inference time and memory usage, run:

CUDA_VISIBLE_DEVICES=0 python src/profiling.py --config ymls/profiling/colontcn_4fold.yml
CUDA_VISIBLE_DEVICES=0 python src/profiling.py --config ymls/profiling/colontcn_5fold.yml
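For readers unfamiliar with this kind of measurement, the sketch below shows the general idea of timing repeated calls and recording peak memory using only the Python standard library. It does not reproduce what src/profiling.py does: `profile_call` and `dummy_inference` are hypothetical names, the workload is a stand-in for a model forward pass, and `tracemalloc` only sees Python-level allocations, so GPU memory would have to be read from the deep learning framework's own counters.

```python
import time
import tracemalloc
from statistics import mean
from typing import Callable, Dict


def profile_call(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> Dict[str, float]:
    """Time repeated calls to fn, then record peak Python-level memory."""
    for _ in range(warmup):  # warm-up calls are excluded from the timings
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # Memory is traced in a separate call because tracing slows execution.
    tracemalloc.start()
    fn()
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "mean_seconds": mean(timings),
        "max_seconds": max(timings),
        "peak_mib": peak_bytes / 2**20,
    }


def dummy_inference() -> list:
    # Stand-in workload; replace with a real forward pass.
    return [x * x for x in range(100_000)]


stats = profile_call(dummy_inference)
print({name: round(value, 4) for name, value in stats.items()})
```

Timing and memory tracing are kept apart so that the tracing overhead does not inflate the reported inference time.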

Project Structure

The following is an overview of the repository structure.
Files and directories marked as "(ignored)" are not included in the repository due to .gitignore.

├── data/
│   ├── create_embeddings_datasets.py  # Script to embed RC videos into video latent representations using a frame encoder
│   ├── dataset/
│   │   ├── RC_annotation/  # RC dataset annotations (CSVs) released with this work (ignored)
│   │   ├── RC_dataset/  # Raw RC dataset downloaded from Figshare (ignored)
│   │   ├── RC_embedded_dataset/  # RC dataset videos embedded with a frame encoder (ignored)
│   │   └── RC_lists/  # Fold-based data splits (4-fold and 5-fold) for model benchmarking
│   ├── images/  # Images used in the repository (e.g., visualizations, results)
│   ├── ymls/  # YAML config files for dataset processing
│   └── README.md  # Documentation for the `data/` directory
├── experiments/
│   ├── outputs/  # Output training folders and inference/testing results (ignored)
│   ├── models/  # ColonTCN models proposed in [1] (ignored)
│   ├── temp_datasets/  # Folder where to save temp datasets to speed up training and testing (ignored)
│   └── visualizations/  # Output visualizations (ignored)
├── src/  # Main source code directory
│   ├── data_loader/
│   │   └── embeddings_dataset.py  # Data loader for embedding-based datasets
│   ├── feature_extraction/
│   │   ├── feature_extraction.py  # Feature extraction module for processing RC videos
│   │   ├── frame_classification_model.py  # Frame-wise classification model
│   │   ├── video_loader.py  # Handles video file reading and frame extraction
│   │   └── ymls/  # YAML config files for feature extraction
│   │       ├── feature_extraction_1x_RC.yml
│   │       └── feature_extraction_5x_aug_RC.yml
│   ├── inference.py  # Script for performing inference on the trained model
│   ├── inference_testing_on_folds.py  # Script for testing inference across multiple data folds
│   ├── models/
│   │   ├── colontcn.py  # Implementation of the ColonTCN model
│   │   ├── factory.py  # Model factory for loading different architectures
│   │   └── layers.py  # Custom model layers
│   ├── optimizers/
│   │   ├── builders.py  # Optimizer builder functions
│   │   └── losses.py  # Loss functions for training
│   ├── profiling.py  # Profiling script to analyze performance
│   ├── testing.py  # Unit tests for model evaluation
│   ├── training.py  # Main training script
│   └── utils/
│       └── io.py  # Utility functions for file I/O operations
├── .gitignore  # Specifies ignored files for version control
├── README.md  # Main project documentation
└── ymls/  # Folder containing Training/Testing/Profiling config files

References

If you find this repository useful, please consider citing the following in your work:

[1] Biffi, C., Roffo, G., Salvagnini, P., & Cherubini, A. (2025). A Temporal Convolutional Network-Based Approach and a Benchmark Dataset for Colonoscopy Video Temporal Segmentation. arXiv preprint arXiv:2502.03430.
[2] Biffi, C., Antonelli, G., Bernhofer, S., Hassan, C., Hirata, D., Iwatate, M., Maieron, A., Salvagnini, P., & Cherubini, A. (2024). REAL-Colon: A dataset for developing real-world AI applications in colonoscopy. Scientific Data, 11(1), 539. https://doi.org/10.1038/s41597-024-03359-0

Contact

For any inquiries, please open an issue in this repository or write to [email protected]