Monocular Depth Estimation Rankings
and 2D to 3D Video Conversion Rankings

List of Rankings

2D to 3D Video Conversion Rankings

  1. Qualitative comparison of four 2D to 3D video conversion methods: Rank (human perceptual judgment)

Monocular Depth Estimation Rankings

I. Rankings based on temporal consistency metrics

  1. ScanNet (170 frames): TAE<=2.2
  2. Bonn RGB-D Dynamic (5 video clips with 110 frames each): OPW<=0.1
  3. ScanNet++ (98 video clips with 32 frames each): TAE
  4. NYU-Depth V2: OPW<=0.37

II. Rankings based on 3D metrics

  1. Direct comparison of 9 metric depth models (each against each) on 5 datasets: F-score

III. Rankings based on 2D metrics

  1. Bonn RGB-D Dynamic (5 video clips with 110 frames each): AbsRel<=0.079
  2. NYU-Depth V2: AbsRel<=0.045 (relative depth)
  3. NYU-Depth V2: AbsRel<=0.051 (metric depth)

Appendices


Qualitative comparison of four 2D to 3D video conversion methods: Rank (human perceptual judgment)

📝 Note: There are no quantitative comparison results for StereoCrafter yet, so this ranking is based on my own perceptual judgment of the qualitative comparison results shown in Figure 7. One output frame (right view) is compared with one input frame (left view) from the video clip 22_dogskateboarder, and one output frame (right view) is compared with one input frame (left view) from the video clip scooter-black.

| RK | Model | Venue | Repository | Rank ↓ (human perceptual judgment) |
|----|-------|-------|------------|------------------------------------|
| 1 | StereoCrafter | arXiv | GitHub Stars | 1 |
| 2-3 | Immersity AI | - | - | 2-3 |
| 2-3 | Owl3D | - | - | 2-3 |
| 4 | Deep3D | ECCV | GitHub Stars | 4 |

Back to Top | Back to the List of Rankings

ScanNet (170 frames): TAE<=2.2

| RK | Model | Venue | Repository | TAE ↓ {Input fr.} (VDA, arXiv) |
|----|-------|-------|------------|--------------------------------|
| 1 | VDA-L | arXiv | GitHub Stars | 0.570 {MF} |
| 2 | DepthCrafter | arXiv | GitHub Stars | 0.639 {MF} |
| 3 | Depth Any Video | arXiv | GitHub Stars | 0.967 {MF} |
| 4 | ChronoDepth | arXiv | GitHub Stars | 1.022 {MF} |
| 5 | Depth Anything V2 Large | NeurIPS | GitHub Stars | 1.140 {1} |
| 6 | NVDS | ICCV | GitHub Stars | 2.176 {4} |

Back to Top | Back to the List of Rankings

Bonn RGB-D Dynamic (5 video clips with 110 frames each): OPW<=0.1

| RK | Model | Venue | Repository | OPW ↓ {Input fr.} (BA, arXiv) |
|----|-------|-------|------------|-------------------------------|
| 1 | Buffer Anytime (DA V2) | arXiv | - | 0.028 {MF} |
| 2 | DepthCrafter | arXiv | GitHub Stars | 0.029 {MF} |
| 3 | ChronoDepth | arXiv | GitHub Stars | 0.035 {MF} |
| 4 | Marigold + E2E FT | WACV | GitHub Stars | 0.053 {1} |
| 5 | Depth Anything V2 Large | NeurIPS | GitHub Stars | 0.059 {1} |
| 6 | NVDS | ICCV | GitHub Stars | 0.068 {4} |

Back to Top | Back to the List of Rankings

ScanNet++ (98 video clips with 32 frames each): TAE

| RK | Model | Venue | Repository | TAE ↓ {Input fr.} (DAV, arXiv) |
|----|-------|-------|------------|--------------------------------|
| 1 | Depth Any Video | arXiv | GitHub Stars | 2.1 {MF} |
| 2 | DepthCrafter | arXiv | GitHub Stars | 2.2 {MF} |
| 3 | ChronoDepth | arXiv | GitHub Stars | 2.3 {MF} |
| 4 | NVDS | ICCV | GitHub Stars | 3.7 {4} |

Back to Top | Back to the List of Rankings

NYU-Depth V2: OPW<=0.37

| RK | Model | Venue | Repository | OPW ↓ {Input fr.} (FD, ECCV) | OPW ↓ {Input fr.} (NVDS+, TPAMI) | OPW ↓ {Input fr.} (NVDS, ICCV) |
|----|-------|-------|------------|------------------------------|----------------------------------|--------------------------------|
| 1 | FutureDepth | ECCV | - | 0.303 {4} | - | - |
| 2 | NVDS+ | TPAMI | GitHub Stars | - | 0.339 {4} | - |
| 3 | NVDS | ICCV | GitHub Stars | 0.364 {4} | - | 0.364 {4} |

Back to Top | Back to the List of Rankings

Direct comparison of 9 metric depth models (each against each) on 5 datasets: F-score

📝 Note: This ranking is based on data from Table 4. The example score 3:0:2 (the first entry in the first row) means that Depth Pro has a better F-score than UniDepth-V on 3 datasets, the same F-score on no dataset, and a worse F-score than UniDepth-V on 2 datasets.
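
The sketch below is a minimal illustration of how such a win:tie:loss record can be computed from per-dataset F-scores; the two model names are taken from the table, but the dataset labels and score values are hypothetical placeholders, not the numbers from Table 4.

```python
# Hypothetical per-dataset F-scores (higher is better); the values below are
# placeholders for illustration only, not the numbers reported in Table 4.
f_scores = {
    "Depth Pro":  {"A": 0.89, "B": 0.84, "C": 0.91, "D": 0.78, "E": 0.80},
    "UniDepth-V": {"A": 0.86, "B": 0.88, "C": 0.91, "D": 0.74, "E": 0.83},
}

def pairwise_record(model_a: str, model_b: str) -> str:
    """Return the 'wins:ties:losses' record of model_a against model_b."""
    wins = ties = losses = 0
    # Compare only on datasets where both models have a score.
    for dataset in f_scores[model_a].keys() & f_scores[model_b].keys():
        a, b = f_scores[model_a][dataset], f_scores[model_b][dataset]
        if a > b:
            wins += 1
        elif a == b:
            ties += 1
        else:
            losses += 1
    return f"{wins}:{ties}:{losses}"

print(pairwise_record("Depth Pro", "UniDepth-V"))  # -> "2:1:2" for these placeholder scores
```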

| RK | Model | Venue | Repository | DP | UD | M3D v2 | DA V2 | DA | ZoeD | M3D | PF | ZD |
|----|-------|-------|------------|----|----|--------|-------|----|------|-----|----|----|
| 1 | Depth Pro | arXiv | GitHub Stars | - | 3:0:2 | 3:1:1 | 5:0:0 | 5:0:0 | 5:0:0 | 5:0:0 | 5:0:0 | 3:0:0 |
| 2 | UniDepth-V | CVPR | GitHub Stars | 2:0:3 | - | 4:0:1 | 5:0:0 | 5:0:0 | 5:0:0 | 5:0:0 | 5:0:0 | 3:0:0 |
| 3 | Metric3D v2 ViT-giant | TPAMI | GitHub Stars | 1:1:3 | 1:0:4 | - | 4:1:0 | 5:0:0 | 5:0:0 | 5:0:0 | 5:0:0 | 3:0:0 |
| 4 | Depth Anything V2 | NeurIPS | GitHub Stars | 0:0:5 | 0:0:5 | 0:1:4 | - | 4:1:0 | 4:0:1 | 5:0:0 | 4:0:1 | 3:0:0 |
| 5 | Depth Anything | CVPR | GitHub Stars | 0:0:5 | 0:0:5 | 0:0:5 | 0:1:4 | - | 3:0:2 | 3:1:1 | 3:0:2 | 2:1:0 |
| 6 | ZoeD-M12-NK | arXiv | GitHub Stars | 0:0:5 | 0:0:5 | 0:0:5 | 1:0:4 | 2:0:3 | - | 3:0:2 | 3:1:1 | 2:0:1 |
| 7 | Metric3D | ICCV | GitHub Stars | 0:0:5 | 0:0:5 | 0:0:5 | 0:0:5 | 1:1:3 | 2:0:3 | - | 3:0:2 | 2:1:0 |
| 8 | PatchFusion | CVPR | GitHub Stars | 0:0:5 | 0:0:5 | 0:0:5 | 1:0:4 | 2:0:3 | 1:1:3 | 2:0:3 | - | 2:0:1 |
| 9 | ZeroDepth | ICCV | GitHub Stars | 0:0:3 | 0:0:3 | 0:0:3 | 0:0:3 | 0:1:2 | 1:0:2 | 0:1:2 | 1:0:2 | - |

Back to Top | Back to the List of Rankings

Bonn RGB-D Dynamic (5 video clips with 110 frames each): AbsRel<=0.079

📝 Note: 1) See Figure 4. 2) The ranking order is determined primarily by a direct comparison of the two models' scores within the same paper. If no paper provides such a direct comparison, or if different papers disagree, the order is determined by the best score each of the two models achieves across all papers listed as data sources in the columns. The DepthCrafter rank is based on the latest version, 1.0.1.
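
The sketch below is a minimal illustration of this ordering rule, assuming each data-source column is represented as a dictionary of reported AbsRel scores (lower is better); all paper names, model names, and values are hypothetical placeholders.

```python
# Each data-source column (a paper) maps model names to the AbsRel it reports
# (lower is better). All paper/model names and numbers are hypothetical placeholders.
sources = {
    "paper_1": {"Model X": 0.067, "Model Y": 0.070},
    "paper_2": {"Model Y": 0.066},
}

def better(model_a, model_b):
    """Return the model that should be ranked higher, or None if undecided."""
    # 1) Prefer direct comparisons: papers that report both models.
    verdicts = set()
    for scores in sources.values():
        if model_a in scores and model_b in scores:
            if scores[model_a] < scores[model_b]:
                verdicts.add(model_a)
            elif scores[model_b] < scores[model_a]:
                verdicts.add(model_b)
    if len(verdicts) == 1:  # all direct comparisons agree
        return verdicts.pop()
    # 2) No direct comparison, or the papers disagree: fall back to each
    #    model's best (lowest) score across all data-source columns.
    best_a = min(s[model_a] for s in sources.values() if model_a in s)
    best_b = min(s[model_b] for s in sources.values() if model_b in s)
    if best_a == best_b:
        return None
    return model_a if best_a < best_b else model_b

print(better("Model X", "Model Y"))  # -> "Model X": the direct comparison in paper_1 decides
```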

| RK | Model | Venue | Repository | AbsRel ↓ {Input fr.} (VDA, arXiv) | AbsRel ↓ {Input fr.} (Align3R, arXiv) | AbsRel ↓ {Input fr.} (MonST3R, arXiv) | AbsRel ↓ {Input fr.} (DC, arXiv) | AbsRel ↓ {Input fr.} (CUT3R, arXiv) | AbsRel ↓ {Input fr.} (RD, arXiv) |
|----|-------|-------|------------|-----------------------------------|----------------------------------------|----------------------------------------|-----------------------------------|--------------------------------------|-----------------------------------|
| 1 | Depth Any Video | arXiv | GitHub Stars | 0.051 {MF} | - | - | - | - | - |
| 2 | VDA-L | arXiv | GitHub Stars | 0.053 {MF} | - | - | - | - | - |
| 3 | Depth Pro | arXiv | GitHub Stars | - | 0.067 {1} | - | - | - | - |
| 4 | Align3R (Depth Pro) | arXiv | GitHub Stars | - | 0.068 {2} | - | - | - | - |
| 5 | MonST3R | arXiv | GitHub Stars | - | 0.082 {2} | 0.063 {2} | - | 0.066 {2} | - |
| 6 | DepthCrafter v1.0.1 | arXiv | GitHub Stars | 0.066 {MF} (DC v1.0.0) | 0.075 {MF} (DC v1.0.0) | 0.075 {MF} (DC v1.0.0) | 0.071 {MF} | 0.075 {MF} (DC v1.0.0) | 0.066 {MF} (DC v1.0.0) |
| 7 | CUT3R | arXiv | GitHub Stars | - | - | - | - | 0.074 {MF} | - |
| 8 | RollingDepth | arXiv | GitHub Stars | - | - | - | - | - | 0.079 {MF} |
| 9 | Depth Anything | CVPR | GitHub Stars | - | - | - | 0.078 {1} | - | 0.099 {1} |

Back to Top | Back to the List of Rankings

NYU-Depth V2: AbsRel<=0.045 (relative depth)

| RK | Model | Venue | Repository | AbsRel ↓ {Input fr.} (MoGe, arXiv) | AbsRel ↓ {Input fr.} (BD, arXiv) | AbsRel ↓ {Input fr.} (M3D v2, TPAMI) | AbsRel ↓ {Input fr.} (DA, CVPR) | AbsRel ↓ {Input fr.} (DA V2, NeurIPS) |
|----|-------|-------|------------|-------------------------------------|-----------------------------------|----------------------------------------|----------------------------------|-----------------------------------------|
| 1 | MoGe | arXiv | GitHub Stars | 0.0341 {1} | - | - | - | - |
| 2 | UniDepth | CVPR | GitHub Stars | 0.0380 {1} | - | - | - | - |
| 3-4 | BetterDepth | arXiv | - | - | 0.042 {1} | - | - | - |
| 3-4 | Metric3D v2 ViT-Large | TPAMI | GitHub Stars | 0.134 {1} | - | 0.042 {1} | - | - |
| 5 | Depth Anything Large | CVPR | GitHub Stars | 0.0424 {1} | 0.043 {1} | 0.043 {1} | 0.043 {1} | 0.043 {1} |
| 6 | Depth Anything V2 Large | NeurIPS | GitHub Stars | 0.0420 {1} | - | - | - | 0.045 {1} |

Back to Top | Back to the List of Rankings

NYU-Depth V2: AbsRel<=0.051 (metric depth)

| RK | Model | Venue | Repository | AbsRel ↓ {Input fr.} (M3D v2, TPAMI) | AbsRel ↓ {Input fr.} (GRIN, arXiv) |
|----|-------|-------|------------|----------------------------------------|--------------------------------------|
| 1 | Metric3D v2 ViT-giant | TPAMI | GitHub Stars | 0.045 {1} | - |
| 2 | GRIN_FT_NI | arXiv | - | - | 0.051 {1} |

Back to Top | Back to the List of Rankings

Appendix 3: List of all research papers from the above rankings

| Method | Abbr. | Paper | Venue (Alt link) | Official repository |
|--------|-------|-------|------------------|---------------------|
| Align3R | - | Align3R: Aligned Monocular Depth Estimation for Dynamic Videos | arXiv | GitHub Stars |
| BetterDepth | BD | BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation | arXiv | - |
| Buffer Anytime | BA | Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors | arXiv | - |
| ChronoDepth | - | Learning Temporally Consistent Video Depth from Video Diffusion Priors | arXiv | GitHub Stars |
| CUT3R | - | Continuous 3D Perception Model with Persistent State | arXiv | GitHub Stars |
| Deep3D | - | Deep3D: Fully Automatic 2D-to-3D Video Conversion with Deep Convolutional Neural Networks | ECCV | GitHub Stars |
| Depth Any Video | DAV | Depth Any Video with Scalable Synthetic Data | arXiv | GitHub Stars |
| Depth Anything | DA | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | CVPR | GitHub Stars |
| Depth Anything V2 | DA V2 | Depth Anything V2 | NeurIPS | GitHub Stars |
| Depth Pro | DP | Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | arXiv | GitHub Stars |
| DepthCrafter | DC | DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos | arXiv | GitHub Stars |
| Diffusion E2E FT | E2E FT | Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | WACV | GitHub Stars |
| FutureDepth | FD | FutureDepth: Learning to Predict the Future Improves Video Depth Estimation | ECCV | - |
| GRIN | - | GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion | arXiv | - |
| Metric3D | M3D | Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image | ICCV | GitHub Stars |
| Metric3D v2 | M3D v2 | Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation | TPAMI (Alt link) | GitHub Stars |
| MoGe | - | MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision | arXiv | GitHub Stars |
| MonST3R | - | MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion | arXiv | GitHub Stars |
| NVDS | - | Neural Video Depth Stabilizer | ICCV | GitHub Stars |
| NVDS+ | - | NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation | TPAMI (Alt link) | GitHub Stars |
| PatchFusion | PF | PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation | CVPR | GitHub Stars |
| RollingDepth | RD | Video Depth without Video Models | arXiv | GitHub Stars |
| StereoCrafter | - | StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos | arXiv | GitHub Stars |
| UniDepth | UD | UniDepth: Universal Monocular Metric Depth Estimation | CVPR | GitHub Stars |
| Video Depth Anything | VDA | Video Depth Anything: Consistent Depth Estimation for Super-Long Videos | arXiv | GitHub Stars |
| ZeroDepth | ZD | Towards Zero-Shot Scale-Aware Monocular Depth Estimation | ICCV | GitHub Stars |
| ZoeDepth | ZoeD | ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth | arXiv | GitHub Stars |

Back to Top | Back to the List of Rankings
