Evaluating Robustness of Monocular Depth Estimation with Procedural Scene Perturbations
- URL: http://arxiv.org/abs/2507.00981v2
- Date: Wed, 02 Jul 2025 18:22:59 GMT
- Title: Evaluating Robustness of Monocular Depth Estimation with Procedural Scene Perturbations
- Authors: Jack Nugent, Siyang Wu, Zeyu Ma, Beining Han, Meenal Parakh, Abhishek Joshi, Lingjie Mei, Alexander Raistrick, Xinyuan Li, Jia Deng,
- Abstract summary: We introduce PDE, a new benchmark which enables systematic robustness evaluation. PDE uses procedural generation to create 3D scenes that test robustness to various controlled perturbations. Our analysis yields interesting findings on what perturbations are challenging for state-of-the-art depth models.
- Score: 55.4735586739093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have witnessed substantial progress on monocular depth estimation, particularly as measured by the success of large models on standard benchmarks. However, performance on standard benchmarks does not offer a complete assessment, because most evaluate accuracy but not robustness. In this work, we introduce PDE (Procedural Depth Evaluation), a new benchmark which enables systematic robustness evaluation. PDE uses procedural generation to create 3D scenes that test robustness to various controlled perturbations, including object, camera, material and lighting changes. Our analysis yields interesting findings on what perturbations are challenging for state-of-the-art depth models, which we hope will inform further research. Code and data are available at https://github.com/princeton-vl/proc-depth-eval.
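The kind of robustness measurement the abstract describes can be illustrated with a short sketch: given paired renders of a scene before and after a controlled perturbation, compute a standard accuracy metric (here the common δ < 1.25 threshold) on both sets and report the relative drop. The helper names and the choice of metric are illustrative assumptions, not PDE's actual evaluation protocol.

```python
import numpy as np

def delta1_accuracy(pred, gt):
    """Fraction of pixels where max(pred/gt, gt/pred) < 1.25 (the standard delta-1 metric)."""
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < 1.25))

def robustness_drop(model, clean_pairs, perturbed_pairs):
    """Relative accuracy drop when moving from clean to perturbed scenes.

    model: callable image -> predicted depth map
    clean_pairs / perturbed_pairs: lists of (image, ground_truth_depth) tuples
    """
    clean_acc = np.mean([delta1_accuracy(model(img), gt) for img, gt in clean_pairs])
    pert_acc = np.mean([delta1_accuracy(model(img), gt) for img, gt in perturbed_pairs])
    return (clean_acc - pert_acc) / clean_acc
```

A drop near zero indicates the model is robust to that perturbation family; large drops flag the challenging perturbations the paper analyzes.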
Related papers
- BenchDepth: Are We on the Right Way to Evaluate Depth Foundation Models? [87.83483720539071]
Deep learning has led to powerful depth foundation models (DFMs). Traditional benchmarks rely on alignment-based metrics that introduce biases, favor certain depth representations, and complicate fair comparisons. We propose BenchDepth, a new benchmark that evaluates DFMs through five carefully selected downstream proxy tasks.
arXiv Detail & Related papers (2025-07-21T07:23:14Z)
- Relative Pose Estimation through Affine Corrections of Monocular Depth Priors [69.59216331861437]
We develop three solvers for relative pose estimation that explicitly account for independent affine (scale and shift) ambiguities. We propose a hybrid estimation pipeline that combines our proposed solvers with classic point-based solvers and epipolar constraints.
arXiv Detail & Related papers (2025-01-09T18:58:30Z)
- A Simple yet Effective Test-Time Adaptation for Zero-Shot Monocular Metric Depth Estimation [46.037640130193566]
We propose a new method to rescale Depth Anything predictions using 3D points provided by sensors or techniques such as low-resolution LiDAR or structure-from-motion with poses given by an IMU. Our experiments highlight enhancements relative to zero-shot monocular metric depth estimation methods, competitive results compared to fine-tuned approaches, and better robustness than depth completion approaches.
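Rescaling a relative depth prediction against sparse metric points, as this abstract describes, typically reduces to fitting a global scale and shift by least squares. The sketch below is a minimal illustration of that idea, assuming a pixel mask marking where sparse metric measurements (e.g. low-resolution LiDAR returns) are available; the function name and interface are hypothetical, not the paper's API.

```python
import numpy as np

def fit_scale_shift(pred_depth, sparse_depth, mask):
    """Least-squares fit of scale s and shift t so that s * pred + t
    matches the sparse metric depth at the masked (valid) pixels."""
    x = pred_depth[mask].ravel()
    y = sparse_depth[mask].ravel()
    # Solve the overdetermined system [x, 1] @ [s, t]^T = y.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t
```

Applying `s * pred_depth + t` then converts the relative prediction into metric depth wherever the affine model holds.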
arXiv Detail & Related papers (2024-12-18T17:50:15Z)
- RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions [7.359657743276515]
We introduce a comprehensive robustness test suite, RoboDepth, comprising 18 corruptions across three categories.
We benchmark 42 depth estimation models across indoor and outdoor scenes to assess their resilience to these corruptions.
Our findings underscore that, in the absence of a dedicated robustness evaluation framework, many leading depth estimation models may be susceptible to typical corruptions.
arXiv Detail & Related papers (2023-10-23T17:59:59Z)
- SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
Training relies on the multi-view consistency assumption, however, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z)
- From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction [80.67873933010783]
We argue that MDP is currently witnessing benchmark over-fitting and relying on metrics that are only partially helpful to gauge the usefulness of the predictions for 3D applications.
This limits the design and development of novel methods that are truly aware of - and improving towards estimating - the 3D structure of the scene rather than optimizing 2D-based distances.
We propose a set of metrics well suited to evaluate the 3D geometry of MDP approaches and a novel indoor benchmark, RIO-D3D, crucial for the proposed evaluation methodology.
arXiv Detail & Related papers (2022-03-15T17:50:54Z)
- Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth [90.33296913575818]
In some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift in per-frame predictions can cause depth inconsistency.
We propose a locally weighted linear regression method to recover the scale and shift with very sparse anchor points.
Our method can boost the performance of existing state-of-the-art approaches by up to 50% over several zero-shot benchmarks.
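The locally weighted recovery this abstract describes can be sketched as follows: for each query pixel, fit a scale and shift by linear regression over the sparse anchors, weighting each anchor by its spatial distance to the query. The Gaussian weighting, the array interface, and the function name are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def local_scale_shift(pred, anchors_xy, anchors_d, coords, bandwidth=50.0):
    """Correct a relative depth map at the given query pixels.

    pred:       HxW predicted (relative) depth map
    anchors_xy: Nx2 integer (x, y) positions of sparse anchor points
    anchors_d:  N metric depths at those anchors
    coords:     Mx2 integer (x, y) query positions
    """
    corrected = np.empty(len(coords))
    for i, c in enumerate(coords):
        # Gaussian weights: nearby anchors dominate the local fit.
        w = np.exp(-np.sum((anchors_xy - c) ** 2, axis=1) / (2 * bandwidth ** 2))
        x = pred[anchors_xy[:, 1], anchors_xy[:, 0]]
        A = np.stack([x, np.ones_like(x)], axis=1)
        W = np.diag(w)
        # Weighted least squares for the local scale s and shift t.
        s, t = np.linalg.solve(A.T @ W @ A, A.T @ W @ anchors_d)
        corrected[i] = s * pred[c[1], c[0]] + t
    return corrected
```

Because the fit is local, the recovered scale and shift can vary smoothly across the image rather than being a single global affine correction.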
arXiv Detail & Related papers (2022-02-03T08:52:54Z)
- Variational Monocular Depth Estimation for Reliability Prediction [12.951621755732544]
Self-supervised learning for monocular depth estimation is widely investigated as an alternative to supervised learning approaches.
Previous works have successfully improved the accuracy of depth estimation by modifying the model structure.
In this paper, we theoretically formulate a variational model for monocular depth estimation to predict the reliability of the estimated depth image.
arXiv Detail & Related papers (2020-11-24T06:23:51Z)
- RGB-D Salient Object Detection: A Survey [195.83586883670358]
We provide a comprehensive survey of RGB-D based SOD models from various perspectives.
We also review SOD models and popular benchmark datasets from this domain.
We discuss several challenges and open directions of RGB-D based SOD for future research.
arXiv Detail & Related papers (2020-08-01T10:01:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.