Official PyTorch implementation for LF Disparity Estimation of the IEEE TPAMI 2026 paper: "Diving into Epipolar Transformers for Light Field Super-Resolution and Disparity Estimation"
BasicLFDisp is a PyTorch-based, open-source, and easy-to-use toolbox for Light Field (LF) image disparity estimation. The toolbox provides a simple pipeline to train and test your methods, and builds a benchmark to comprehensively evaluate the performance of existing methods. BasicLFDisp helps researchers get started with LF image disparity estimation quickly and facilitates the development of novel methods. You are welcome to contribute your own methods to the benchmark.
- [2026-03] 🎉 Our paper **"Diving into Epipolar Transformers for Light Field Super-Resolution and Disparity Estimation"** has been accepted by *IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)*!
- [2026-03] 🚀 We have released the BasicLFDisp toolbox, including the pre-trained models for our EPIT-Disp mechanism.
- We provide a PyTorch-based open-source and easy-to-use toolbox for LF image disparity estimation.
- We re-implement a number of existing methods on the unified datasets, and develop a benchmark for performance evaluation.
- We share the code, models, and results of existing methods to give researchers easier access to this area.
git clone https://github.com/ZhengyuLeung/BasicLFDisp.git
We use the HCInew dataset for both training and testing. Place the dataset in the folder ./datasets/.
Run Generate_Data_for_Disp_Training.py to generate training data. The generated data will be saved in ./data_for_training/.
Run Generate_Data_for_Disp_Test.py to generate test data. The generated data will be saved in ./data_for_test/.
Modify the configs in train_Disp.py or use default arguments.
Checkpoints and logs will be saved to ./log/, which has the following structure:
log/
├── Disp_9x9
│ ├── [model_name]_Disp
│ │ ├── [model_name]_Disp.txt
│ │ ├── checkpoints
│ │ │ ├── [model_name]_Disp_9x9_Disp_epoch_01_model.pth
│ │ │ ├── [model_name]_Disp_9x9_Disp_epoch_02_model.pth
│ │ │ └── ...
│ │ ├── results
│ │ │ ├── VAL_epoch_01
│ │ │ ├── VAL_epoch_02
│ │ │ └── ...
│ │ ├── [other_model_name]
│ │ └── ...
│ ├── [other_model_name]
│ └── ...
└── ...
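To resume training or pick a model for testing, it can help to locate the most recent checkpoint in a `checkpoints` folder. The following is a minimal sketch that relies only on the `_epoch_NN_model.pth` suffix shown in the tree above. The helper name `latest_checkpoint` is hypothetical and is not part of the toolbox.

```python
import re
from pathlib import Path
from typing import Optional

# Suffix inferred from the tree above, e.g. "..._Disp_epoch_02_model.pth".
_EPOCH_RE = re.compile(r"_epoch_(\d+)_model\.pth$")


def latest_checkpoint(checkpoint_dir: str) -> Optional[Path]:
    """Return the checkpoint with the highest epoch number, or None if none match."""
    best_epoch, best_path = -1, None
    for path in Path(checkpoint_dir).glob("*.pth"):
        match = _EPOCH_RE.search(path.name)
        if match is None:
            continue  # skip files that do not follow the naming pattern
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_epoch, best_path = epoch, path
    return best_path
```

For example, `latest_checkpoint("./log/Disp_9x9/MyNet_Disp/checkpoints")` would return the path of the last saved epoch for a model named `MyNet` (a placeholder name). Epochs are compared numerically, so `epoch_10` ranks after `epoch_02`.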
Run test_Disp.py to perform network inference.
The evaluation metrics (MSE_100 and bad pixel rates) of each dataset will be saved to ./log/, which has the following structure:
log/
├── Disp_9x9
│ ├── [model_name]_Disp
│ │ ├── [model_name]_log.txt
│ │ ├── checkpoints
│ │ ├── results
│ │ │ ├── Test
│ │ │ │ ├── evaluation_Disp_[model_name]_Disp.xlsx
│ │ │ │ ├── [dataset_name]_Disp
│ │ │ │ │ ├── [scene_1_name]
│ │ │ │ │ │ ├── backgammon.bmp
│ │ │ │ │ │ ├── backgammon_BP01.bmp
│ │ │ │ │ │ ├── backgammon_BP03.bmp
│ │ │ │ │ │ ├── backgammon_BP07.bmp
│ │ │ │ │ │ ├── backgammon_error.bmp
│ │ │ │ │ │ └── backgammon.pfm
│ │ │ │ │ ├── [scene_2_name]
│ │ │ │ │ └── ...
│ │ │ │ └── ...
│ │ │ └── ...
│ │ └── ...
│ ├── [other_model_name]
│ └── ...
└── ...
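The `.pfm` files in the tree above hold the predicted disparity maps as floating-point values. A minimal reader sketch is given below, assuming the files follow the common PFM layout (a `Pf` or `PF` header line, a `width height` line, a scale line whose sign encodes endianness, then rows stored bottom to top). The function name `read_pfm` is hypothetical, and the toolbox may ship its own I/O helper.

```python
import numpy as np


def read_pfm(path):
    """Read a PFM file into a float32 array of shape (H, W) or (H, W, 3)."""
    with open(path, "rb") as f:
        header = f.readline().decode("ascii").strip()
        if header not in ("PF", "Pf"):
            raise ValueError(f"Not a PFM file: header {header!r}")
        channels = 3 if header == "PF" else 1

        width, height = map(int, f.readline().decode("ascii").split())

        # A negative scale means little-endian data, positive means big-endian.
        scale = float(f.readline().decode("ascii").strip())
        dtype = "<f4" if scale < 0 else ">f4"

        count = width * height * channels
        data = np.frombuffer(f.read(count * 4), dtype=dtype, count=count)

    shape = (height, width, 3) if channels == 3 else (height, width)
    # PFM stores rows bottom to top, so flip to the usual top-to-bottom order.
    return np.flipud(data.reshape(shape)).astype(np.float32)
```

For example, `read_pfm(".../backgammon.pfm")` would return the disparity map as a 2-D array that can be compared against the ground truth or visualized with any plotting library.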
We benchmark several methods on the above datasets. MSE_100, BP_07, BP_03, and BP_01 are used for quantitative evaluation.
For a dataset with M scenes, the score of each metric is obtained by averaging the scores of all M scenes.
The definitions and meanings of the metrics are as follows:
- MSE_100: The result of multiplying Mean Squared Error (MSE) by 100. It is used to measure the average squared difference between predicted disparity and ground-truth disparity.
- BP_07: Bad Pixel Rate (BP) with an error threshold of 0.07, i.e., the percentage of pixels where the absolute error between predicted disparity and ground-truth disparity is ≥ 0.07.
- BP_03: BP with an error threshold of 0.03, i.e., the percentage of pixels where the absolute error between predicted disparity and ground-truth disparity is ≥ 0.03.
- BP_01: BP with an error threshold of 0.01, i.e., the percentage of pixels where the absolute error between predicted disparity and ground-truth disparity is ≥ 0.01.
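The definitions above can be sketched in NumPy as follows. This is an illustrative sketch rather than the toolbox's evaluation code: the function names are hypothetical, and it assumes that every pixel is evaluated (no border cropping or masking), which may differ from the actual implementation.

```python
import numpy as np


def mse_100(pred, gt):
    """Mean squared error between predicted and ground-truth disparity, times 100."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(gt, dtype=np.float64)
    return 100.0 * float(np.mean(diff ** 2))


def bad_pixel_rate(pred, gt, threshold):
    """Percentage of pixels whose absolute disparity error is >= threshold."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(gt, dtype=np.float64)
    return 100.0 * float(np.mean(np.abs(diff) >= threshold))


def scene_scores(pred, gt):
    """All four metrics for a single scene."""
    return {
        "MSE_100": mse_100(pred, gt),
        "BP_07": bad_pixel_rate(pred, gt, 0.07),
        "BP_03": bad_pixel_rate(pred, gt, 0.03),
        "BP_01": bad_pixel_rate(pred, gt, 0.01),
    }


def dataset_scores(scenes):
    """Average each metric over the M scenes of a dataset.

    `scenes` is a non-empty list of (pred, gt) array pairs.
    """
    per_scene = [scene_scores(pred, gt) for pred, gt in scenes]
    return {k: float(np.mean([s[k] for s in per_scene])) for k in per_scene[0]}
```

Because the thresholds are nested, BP_01 ≥ BP_03 ≥ BP_07 always holds for the same prediction, which is a quick sanity check when reading the benchmark table.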
| Methods | #Params. | MSE_100 | BP_07 | BP_03 | BP_01 |
|---|---|---|---|---|---|
| AttMLFNet | 1.547M | 12.862 | 18.949 | 30.396 | 57.802 |
| DistgDisp | 7.982M | 2.412 | 7.379 | 14.506 | 37.192 |
| EPI_ORM | 5.097M | 11.573 | 34.763 | 64.211 | 87.445 |
| Epinet | 11.02M | 8.843 | 44.867 | 69.069 | 88.661 |
| LFattNet | 1.753M | 27.162 | 30.622 | 49.776 | 78.612 |
| OACC_Net | 5.018M | 4.495 | 12.124 | 20.091 | 46.594 |
| EPIT_Disp | 1.206M | 3.802 | 15.463 | 28.266 | 59.825 |
| EPIT_Disp_C128 | 1.206M | 4.335 | 15.985 | 28.776 | 58.415 |
- The pre-trained models of the aforementioned methods can be downloaded via this link.
If you find this code or our paper useful for your research, please consider citing:
@article{EPIT2026,
title = {Diving into Epipolar Transformers for Light Field Super-Resolution and Disparity Estimation},
author = {Liang, Zhengyu and Wang, Yingqian and Wang, Longguang and Yang, Jungang and Guo, Yulan and Liu, Li and Zhou, Shilin and An, Wei},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
year = {2026},
}