DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range
- URL: http://arxiv.org/abs/2103.14275v1
- Date: Fri, 26 Mar 2021 05:52:38 GMT
- Title: DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range
- Authors: Puyuan Yi, Shengkun Tang and Jian Yao
- Abstract summary: We propose a Dynamic Depth Range Network (DDR-Net) to determine the depth range hypotheses dynamically.
In our DDR-Net, we first build an initial depth map at the coarsest resolution of an image across the entire depth range.
We develop a novel loss strategy, which utilizes learned dynamic depth ranges to generate refined depth maps.
- Score: 2.081393321765571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To obtain high-resolution depth maps, some previous learning-based multi-view
stereo methods build a cost volume pyramid in a coarse-to-fine manner. These
approaches leverage fixed depth range hypotheses to construct cascaded plane
sweep volumes. However, it is inappropriate to set identical range hypotheses
for each pixel since the uncertainties of previous per-pixel depth predictions
are spatially varying. Distinct from these approaches, we propose a Dynamic
Depth Range Network (DDR-Net) to determine the depth range hypotheses
dynamically by applying a range estimation module (REM) to learn the
uncertainties of range hypotheses in the former stages. Specifically, in our
DDR-Net, we first build an initial depth map at the coarsest resolution of an
image across the entire depth range. Then the range estimation module (REM)
leverages the probability distribution information of the initial depth to
estimate the depth range hypotheses dynamically for the following stages.
Moreover, we develop a novel loss strategy, which utilizes learned dynamic
depth ranges to generate refined depth maps, to keep the ground truth value of
each pixel covered in the range hypotheses of the next stage. Extensive
experimental results show that our method achieves superior performance over
other state-of-the-art methods on the DTU benchmark and obtains comparable
results on the Tanks and Temples benchmark. The code is available at
https://github.com/Tangshengku/DDR-Net.
Related papers
- ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation [62.600382533322325]
We propose a novel monocular depth estimation method called ScaleDepth.
Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction module.
Our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework.
arXiv Detail & Related papers (2024-07-11T05:11:56Z)
- ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval [19.28042366225802]
Multi-View Stereo(MVS) is a fundamental problem in geometric computer vision.
We present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval.
Our model achieves state-of-the-art performance and yields competitive generalization ability.
arXiv Detail & Related papers (2023-08-17T14:52:11Z)
- Single Image Depth Prediction Made Better: A Multivariate Gaussian Take [163.14849753700682]
We introduce an approach that performs continuous modeling of per-pixel depth.
Our method (named MG) ranks among the top entries on the KITTI depth-prediction benchmark leaderboard.
arXiv Detail & Related papers (2023-03-31T16:01:03Z)
- Towards Domain-agnostic Depth Completion [28.25756709062647]
Existing depth completion methods are often targeted at a specific sparse depth type and generalize poorly across task domains.
We present a method to complete sparse/semi-dense, noisy, and potentially low-resolution depth maps obtained by various range sensors.
Our method shows superior cross-domain generalization ability against state-of-the-art depth completion methods.
arXiv Detail & Related papers (2022-07-29T04:10:22Z)
- Non-parametric Depth Distribution Modelling based Depth Inference for Multi-view Stereo [43.415242967722804]
Recent cost-volume-pyramid-based deep neural networks have unlocked the potential of efficiently leveraging high-resolution images for depth inference from multi-view stereo.
In general, those approaches assume that the depth of each pixel follows a unimodal distribution.
We propose constructing the cost volume by non-parametric depth distribution modeling to handle pixels with unimodal and multi-modal distributions.
arXiv Detail & Related papers (2022-05-08T05:13:04Z)
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
- IB-MVS: An Iterative Algorithm for Deep Multi-View Stereo based on Binary Decisions [0.0]
We present a novel deep-learning-based method for Multi-View Stereo.
Our method iteratively estimates high-resolution, highly precise depth maps by traversing the continuous space of feasible depth values at each pixel in a binary decision fashion.
We compare our method with state-of-the-art Multi-View Stereo methods on the DTU, Tanks and Temples, and the challenging ETH3D benchmarks and show competitive results.
arXiv Detail & Related papers (2021-11-29T10:04:24Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction [12.728154351588053]
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multi-view images.
We introduce a coarse-to-fine depth inference strategy to achieve high-resolution depth.
arXiv Detail & Related papers (2020-11-25T13:34:11Z)
- DiverseDepth: Affine-invariant Depth Prediction Using Diverse Data [110.29043712400912]
We present a method for depth estimation with monocular images, which can predict high-quality depth on diverse scenes up to an affine transformation.
Experiments show that our method outperforms previous methods on 8 datasets by a large margin in the zero-shot test setting.
arXiv Detail & Related papers (2020-02-03T05:38:33Z)