Semantic Segmentation PyTorch code for our paper: Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition (https://arxiv.org/pdf/2006.11538.pdf)

iduta/pyconvsegnet

Pyramidal Convolution on semantic segmentation

This is the PyTorch implementation of our paper "Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition". (Note that this is the code for semantic image segmentation/parsing. For image recognition on ImageNet see this repository: https://github.com/iduta/pyconv)

Pyramidal Convolution Segmentation Network: PyConvSegNet

The models trained on the ADE20K dataset can be found here.

The results on the ADE20K validation set of our PyConvSegNet (using multi-scale inference):

| Backbone | mean IoU | pixel Acc. | Weights |
|---|---|---|---|
| ResNet-50 | 42.88% | 80.97% | (model) |
| PyConvResNet-50 | 43.31% | 81.18% | (model) |
| ResNet-101 | 44.39% | 81.60% | (model) |
| PyConvResNet-101 | 44.58% | 81.77% | (model) |
| ResNet-152 | 45.28% | 81.89% | (model) |
| PyConvResNet-152 | 45.64% | 82.36% | (model) |
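
For readers unfamiliar with the two metrics reported above, here is a minimal NumPy sketch of how mean IoU and pixel accuracy are commonly derived from a confusion matrix. It is not taken from this repository: the function names, the `ignore_label=255` default, and the choice to average IoU only over classes that actually occur are assumptions made for illustration.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes, ignore_label=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_label  # drop pixels without a valid label (assumed convention)
    idx = target[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou_and_pixel_acc(conf):
    tp = np.diag(conf).astype(np.float64)          # correctly labelled pixels per class
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0                            # skip classes absent from both pred and target
    mean_iou = (tp[present] / union[present]).mean()
    pixel_acc = tp.sum() / conf.sum()
    return mean_iou, pixel_acc

# Toy example: 4 pixels, 2 classes, the last pixel is ignored.
conf = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
miou, pacc = mean_iou_and_pixel_acc(conf)
```

In practice the confusion matrix would be accumulated over the whole validation set before the two ratios are taken.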

Our best single-model result on the testing set (mIoU=39.13, pAcc=73.91, score=56.52) was obtained with PyConvResNet-152 as the backbone, trained on the train+val sets for 120 epochs (model).
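
The reported score matches the average of mIoU and pixel accuracy. Assuming that is how the score is defined (the text here does not spell it out), the arithmetic can be checked with a few lines; the variable names are illustrative only.

```python
# Values reported for the testing set.
mean_iou = 39.13
pixel_acc = 73.91

# Assumed definition: the score is the mean of the two metrics.
score = (mean_iou + pixel_acc) / 2
print(round(score, 2))  # 56.52
```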

Requirements

Install PyTorch and the dependencies listed in requirements.txt:

pip install -r requirements.txt

A fast alternative (without the need to install PyTorch and other deep learning libraries) is to use NVIDIA-Docker; we used this container image.

Download the ImageNet pretrained models and add the corresponding path to the config .yaml file.

Download the ADE20K dataset. Note that this code uses label ids starting from 0, while the original ids start from 1, so you need to preprocess the original labels by subtracting 1.
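
A minimal sketch of the label shift, assuming the annotation masks are already loaded as integer arrays. The function name is hypothetical, and mapping the original id 0 (unlabeled pixels) to an ignore value of 255 is an assumption; check the ignore label set in your config file before relying on it.

```python
import numpy as np

def shift_labels(mask, ignore_label=255):
    """Shift 1-based class ids to 0-based ids.

    Assumption: original id 0 marks unlabeled pixels and is mapped to
    `ignore_label` instead of wrapping around to -1.
    """
    mask = np.asarray(mask)
    shifted = mask.astype(np.int64) - 1   # 1..N becomes 0..N-1
    shifted[mask == 0] = ignore_label     # unlabeled pixels become the ignore value
    return shifted.astype(np.uint8)

original = np.array([[0, 1, 150]], dtype=np.uint8)
print(shift_labels(original).tolist())  # [[255, 0, 149]]
```

Each shifted mask would then be saved in place of the original annotation before training.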

Training and Inference

To train a model on the ADE20K dataset, for instance with the 50-layer PyConvResNet as backbone, first update the corresponding config file (here, config/ade20k/pyconvresnet50_pyconvsegnet.yaml), then run:

./tool/train.sh ade20k pyconvresnet50_pyconvsegnet

To run inference on the validation set, update the TEST part of the config/ade20k/pyconvresnet50_pyconvsegnet.yaml file, then run:

./tool/test.sh ade20k pyconvresnet50_pyconvsegnet
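
Judging from the commands and the config path quoted above, the two script arguments appear to be the dataset name and the experiment name, which together identify the config file. The helper below is a hypothetical illustration of that mapping, not code from the repository; it can be handy for checking which file a given command will read.

```python
from pathlib import PurePosixPath

def config_path(dataset, exp_name, root="config"):
    """Build the config path implied by the two script arguments (assumed layout)."""
    return str(PurePosixPath(root) / dataset / f"{exp_name}.yaml")

print(config_path("ade20k", "pyconvresnet50_pyconvsegnet"))
# config/ade20k/pyconvresnet50_pyconvsegnet.yaml
```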

Citation

If you find our work useful, please consider citing:

@article{duta2020pyramidal,
  author  = {Ionut Cosmin Duta and Li Liu and Fan Zhu and Ling Shao},
  title   = {Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition},
  journal = {arXiv preprint arXiv:2006.11538},
  year    = {2020},
}

Acknowledgements

This code is based on this repository. We thank the authors for open-sourcing their code.