NDDepth: Normal-Distance Assisted Monocular Depth Estimation
- URL: http://arxiv.org/abs/2309.10592v2
- Date: Sun, 24 Sep 2023 14:30:04 GMT
- Title: NDDepth: Normal-Distance Assisted Monocular Depth Estimation
- Authors: Shuwei Shao, Zhongcai Pei, Weihai Chen, Xingming Wu and Zhengguo Li
- Abstract summary: We propose a novel physics (geometry)-driven deep learning framework for monocular depth estimation.
We introduce a new normal-distance head that outputs pixel-level surface normal and plane-to-origin distance for deriving depth at each position.
We develop an effective contrastive iterative refinement module that refines depth in a complementary manner according to the depth uncertainty.
- Score: 22.37113584192617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular depth estimation has drawn widespread attention from the vision
community due to its broad applications. In this paper, we propose a novel
physics (geometry)-driven deep learning framework for monocular depth
estimation by assuming that 3D scenes are constituted by piece-wise planes.
Particularly, we introduce a new normal-distance head that outputs pixel-level
surface normal and plane-to-origin distance for deriving depth at each
position. Meanwhile, the normal and distance are regularized by a developed
plane-aware consistency constraint. We further integrate an additional depth
head to improve the robustness of the proposed framework. To fully exploit the
strengths of these two heads, we develop an effective contrastive iterative
refinement module that refines depth in a complementary manner according to the
depth uncertainty. Extensive experiments indicate that the proposed method
exceeds previous state-of-the-art competitors on the NYU-Depth-v2, KITTI and
SUN RGB-D datasets. Notably, it ranks 1st among all submissions on the KITTI
depth prediction online benchmark at the submission time.
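The relation the normal-distance head relies on is plain plane geometry. If a pixel lies on a plane with unit normal n and plane-to-origin distance D in the camera frame, its depth is Z = D / (n · K⁻¹[u, v, 1]ᵀ). The sketch below illustrates that conversion for a single pixel. It is not the authors' implementation: it assumes a pinhole camera, and the function name and the intrinsics parameters (`fx`, `fy`, `cx`, `cy`) are chosen here for illustration.

```python
import math


def depth_from_normal_distance(u, v, normal, distance, fx, fy, cx, cy, eps=1e-8):
    """Depth Z at pixel (u, v) for the plane n . X = distance (camera frame).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these parameter names are illustrative.
    """
    # Back-project the pixel to a ray with unit z-component: K^{-1} [u, v, 1]^T.
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Normalise the normal so that `distance` is a Euclidean plane-to-origin distance.
    norm = math.sqrt(sum(c * c for c in normal))
    n = tuple(c / norm for c in normal)
    # A point at depth Z along the ray is X = Z * ray, so n . X = Z * (n . ray).
    denom = sum(a * b for a, b in zip(n, ray))
    if abs(denom) < eps:
        raise ValueError("viewing ray is (nearly) parallel to the plane")
    return distance / denom
```

For a fronto-parallel plane (normal along the optical axis) the denominator is 1 at every pixel, so the depth equals the plane distance everywhere. For tilted planes the depth varies smoothly across the pixels that share the same normal and distance.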
Related papers
- DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
We present DepthSplat to connect Gaussian splatting and depth estimation.
We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features.
We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z)
- DCPI-Depth: Explicitly Infusing Dense Correspondence Prior to Unsupervised Monocular Depth Estimation [17.99904937160487]
DCPI-Depth is a framework that incorporates all these innovative components and couples two bidirectional and collaborative streams.
It achieves state-of-the-art performance and generalizability across multiple public datasets, outperforming existing prior art.
arXiv Detail & Related papers (2024-05-27T08:55:17Z)
- Self-Supervised Depth Completion Guided by 3D Perception and Geometry Consistency [17.68427514090938]
This paper explores the utilization of 3D perceptual features and multi-view geometry consistency to devise a high-precision self-supervised depth completion method.
Experiments on the NYU-Depthv2 and VOID benchmark datasets demonstrate that the proposed model achieves state-of-the-art depth completion performance.
arXiv Detail & Related papers (2023-12-23T14:19:56Z)
- NDDepth: Normal-Distance Assisted Monocular Depth Estimation and Completion [18.974297347310287]
We introduce novel physics (geometry)-driven deep learning frameworks for monocular depth estimation and completion.
Our method outperforms prior state-of-the-art competitors in monocular depth estimation and completion.
arXiv Detail & Related papers (2023-11-13T09:01:50Z)
- GEDepth: Ground Embedding for Monocular Depth Estimation [4.95394574147086]
This paper proposes a novel ground embedding module to decouple camera parameters from pictorial cues.
A ground attention is designed in the module to optimally combine ground depth with residual depth.
Experiments show that our approach achieves state-of-the-art results on popular benchmarks.
arXiv Detail & Related papers (2023-09-18T17:56:06Z)
- P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior [133.76192155312182]
We propose a method that learns to selectively leverage information from coplanar pixels to improve the predicted depth.
An extensive evaluation of our method shows that we set the new state of the art in supervised monocular depth estimation.
arXiv Detail & Related papers (2022-04-05T10:03:52Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Deep Two-View Structure-from-Motion Revisited [83.93809929963969]
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.
We propose to revisit the problem of deep two-view SfM by leveraging the well-posedness of the classic pipeline.
Our method consists of 1) an optical flow estimation network that predicts dense correspondences between two frames; 2) a normalized pose estimation module that computes relative camera poses from the 2D optical flow correspondences, and 3) a scale-invariant depth estimation network that leverages epipolar geometry to reduce the search space, refine the dense correspondences, and estimate relative depth maps.
arXiv Detail & Related papers (2021-04-01T15:31:20Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction [72.30870535815258]
Classical monocular SLAM and CNNs for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment.
We propose a joint narrow and wide baseline based self-improving framework, where on the one hand the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM.
On the other hand, the bundle-adjusted 3D scene structures and camera poses from the more principled geometric SLAM are injected back into the depth network through novel wide baseline losses.
arXiv Detail & Related papers (2020-04-22T16:31:59Z)
- Guiding Monocular Depth Estimation Using Depth-Attention Volume [38.92495189498365]
We propose guiding depth estimation to favor planar structures that are ubiquitous especially in indoor environments.
Experiments on two popular indoor datasets, NYU-Depth-v2 and ScanNet, show that our method achieves state-of-the-art depth estimation results.
arXiv Detail & Related papers (2020-04-06T15:45:52Z)