This repository contains an implementation of the segmentation task using DenseCLIP without relying on the mmcv module. The model is trained and evaluated on the ADE20K Challenge 2016 dataset.
- DenseCLIP Integration: Utilizes the DenseCLIP model for semantic segmentation.
- No MMCV Dependency: The implementation removes the need for mmcv, making it easier to run in environments with restricted installation permissions.
- ADE20K Dataset: Uses the ADE20K Challenge 2016 dataset for training and evaluation.
- Custom Trainer: Implements a trainer tailored for the segmentation task.
Method: Conda environment
Python Version: 3.8
conda create -n denseclip_pt17_py38 python=3.8 -y
conda activate denseclip_pt17_py38
Reason for Older Python/PyTorch: The target server system has GLIBC version 2.17. Newer PyTorch builds require GLIBC >= 2.27. PyTorch 1.7.1 with CUDA 11.0 was found to be compatible with the system's GLIBC.
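Before choosing a PyTorch build, it can help to confirm which GLIBC version the target machine actually has. The following is a minimal sketch using only the Python standard library; the helper names `parse_version` and `glibc_at_least` are illustrative and not part of this repository, and the 2.27 threshold is simply the figure quoted above.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.17' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(required, detected=None):
    """Return True if the detected glibc version is >= `required`.

    `detected` can be passed explicitly (useful for testing); otherwise the
    version is read from the running interpreter via platform.libc_ver().
    """
    if detected is None:
        lib, detected = platform.libc_ver()
        if lib != "glibc" or not detected:
            return False  # not a glibc system, or the version is unknown
    return parse_version(detected) >= parse_version(required)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    print("C library:", lib or "unknown", version or "")
    # 2.27 is the threshold quoted in this README for newer PyTorch builds.
    print("Meets GLIBC 2.27 requirement:", glibc_at_least("2.27"))
```

On a system like the one described here (GLIBC 2.17), the last line would report that the requirement is not met, which is the reason for pinning the older PyTorch release.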
- PyTorch Version: 1.7.1
- Torchvision Version: 0.8.2
- Torchaudio Version: 0.7.2
- CUDA Toolkit Version: 11.0 (compatible with GLIBC 2.17; the installed NVIDIA driver 550.x supports up to CUDA 12.4 and can therefore run CUDA 11.0 builds)
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch -c conda-forge
The remaining dependencies were installed using pip within the activated denseclip_pt17_py38 environment:
pip install pyyaml timm==0.9.12 regex ftfy fvcore Pillow scikit-image tensorboard wget numpy==1.24.4 six matplotlib opencv-python
wget http://data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip
unzip ADEChallengeData2016.zip
~/DenseCLIP/
├── segmentation/
│ ├── ADEChallengeData2016/ <-- Extracted dataset HERE
│ │ ├── annotations/
│ │ ├── images/
│ │ └── ... (other dataset files)
│ ├── configs/
│ ├── datasets/
│ ├── denseclip/
│ ├── work_dirs/
│ └── train_denseclip.py
├── detection/
└── ...
Note: Do NOT add the dataset files to your Git repository. Ensure your .gitignore file includes entries like segmentation/ADEChallengeData2016/ or data/.
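A quick way to catch a misplaced extraction is to check for the expected folders before training. The sketch below assumes the standard archive layout, in which `images/` and `annotations/` each contain `training/` and `validation/` subfolders; that layout and the helper name `missing_dataset_dirs` are assumptions, not something this repository defines.

```python
from pathlib import Path

# Assumed layout of the extracted archive (not confirmed by this repository).
EXPECTED_SUBDIRS = (
    "images/training",
    "images/validation",
    "annotations/training",
    "annotations/validation",
)


def missing_dataset_dirs(root):
    """Return the expected subdirectories that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    # Run from segmentation/, where the dataset is expected to be extracted.
    missing = missing_dataset_dirs("ADEChallengeData2016")
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```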
To ensure correct dataset usage, edit the configuration file: ~/DenseCLIP/segmentation/configs/denseclip_ade20k.yaml.
Modify the data section:
data:
  path: '.' # Point to the current directory (segmentation/)
  # ... rest of data config
Also, ensure all pretrained: keys are removed from this YAML file.
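Both edits can also be applied programmatically. The following is a minimal sketch using PyYAML (installed above as `pyyaml`); the function names are illustrative, and it assumes the config is a plain mapping with a top-level `data` section. Note that round-tripping through `yaml.safe_load` and `yaml.safe_dump` discards comments, so writing to a new file and reviewing it is the safer route.

```python
import sys


def strip_keys(node, key_to_drop="pretrained"):
    """Recursively copy `node`, omitting every mapping entry named `key_to_drop`."""
    if isinstance(node, dict):
        return {
            key: strip_keys(value, key_to_drop)
            for key, value in node.items()
            if key != key_to_drop
        }
    if isinstance(node, list):
        return [strip_keys(item, key_to_drop) for item in node]
    return node


def prepare_config(config, data_path="."):
    """Return a copy of `config` without `pretrained` keys and with data.path set."""
    cleaned = strip_keys(config)
    data = cleaned.get("data")
    if not isinstance(data, dict):
        data = {}
    data["path"] = data_path
    cleaned["data"] = data
    return cleaned


def rewrite_config_file(src, dst):
    """Load the YAML file `src`, apply prepare_config, and write the result to `dst`."""
    import yaml  # PyYAML; imported here so the helpers above work without it

    with open(src) as handle:
        config = yaml.safe_load(handle) or {}
    with open(dst, "w") as handle:
        yaml.safe_dump(prepare_config(config), handle, sort_keys=False)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python fix_config.py <input.yaml> <output.yaml>
    rewrite_config_file(sys.argv[1], sys.argv[2])
```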
Run the training script:
python train_denseclip.py configs/denseclip_ade20k.yaml "--work-dir=work_dirs/pt17_run"
Evaluate the model on the validation set:
python evaluate.py --dataset ADE20K --checkpoint path/to/checkpoint.pth
The segmentation model achieves competitive performance on ADE20K without requiring mmcv. Example segmented images are shown below:
(Add sample segmentation results here)
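Segmentation quality on ADE20K is commonly reported as mean intersection over union (mIoU). The sketch below shows how that metric can be computed from predicted and ground-truth label maps with NumPy. It is not the code in `evaluate.py`; the function names, the 150-class count, and the ignore index of 255 are assumptions based on common ADE20K conventions.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Build a (num_classes x num_classes) matrix; rows are ground truth, columns are predictions.

    Pixels whose ground-truth label equals `ignore_index` are skipped.
    Predictions are expected to lie in [0, num_classes).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index
    flat = target[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average per-class IoU over the classes that appear in prediction or ground truth."""
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0
    if not valid.any():
        return float("nan")
    return float((intersection[valid] / union[valid]).mean())


if __name__ == "__main__":
    # Toy example with random labels; real use would accumulate the
    # confusion matrix over every validation image before averaging.
    rng = np.random.default_rng(0)
    target = rng.integers(0, 150, size=(64, 64))
    pred = rng.integers(0, 150, size=(64, 64))
    print("mIoU:", mean_iou(confusion_matrix(pred, target, num_classes=150)))
```

Accumulating one confusion matrix across the whole validation set, rather than averaging per-image scores, keeps rare classes from being dropped on images where they do not appear.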
- Enhancing Performance: Experimenting with CoCoOp for improved generalization.
- Additional Datasets: Extending the implementation to other segmentation datasets.
- Custom Prompt Learning: Developing new prompt-based strategies for DenseCLIP.
If you use this repository, please cite the original DenseCLIP paper:
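@inproceedings{DenseCLIP,
title={DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting},
author={Rao, Yongming and Zhao, Wenliang and Chen, Guangyi and Tang, Yansong and Zhu, Zheng and Huang, Guan and Zhou, Jie and Lu, Jiwen},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}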
}
- The ADE20K dataset: MIT CSAIL
- DenseCLIP: GitHub Repository