Learning Inverse Depth Regression for Multi-View Stereo with Correlation
Cost Volume
- URL: http://arxiv.org/abs/1912.11746v1
- Date: Thu, 26 Dec 2019 01:40:44 GMT
- Title: Learning Inverse Depth Regression for Multi-View Stereo with Correlation
Cost Volume
- Authors: Qingshan Xu and Wenbing Tao
- Abstract summary: Deep learning has shown to be effective for depth inference in multi-view stereo (MVS)
However, the scalability and accuracy still remain an open problem in this domain.
Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume.
- Score: 32.41293572426403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has shown to be effective for depth inference in multi-view
stereo (MVS). However, the scalability and accuracy still remain an open
problem in this domain. This can be attributed to the memory-consuming cost
volume representation and inappropriate depth inference. Inspired by the
group-wise correlation in stereo matching, we propose an average group-wise
correlation similarity measure to construct a lightweight cost volume. This can
not only reduce the memory consumption but also reduce the computational burden
in the cost volume filtering. Based on our effective cost volume
representation, we propose a cascade 3D U-Net module to regularize the cost
volume to further boost the performance. Unlike the previous methods that treat
multi-view depth inference as a depth regression problem or an inverse depth
classification problem, we recast multi-view depth inference as an inverse
depth regression task. This allows our network to achieve sub-pixel estimation
and be applicable to large-scale scenes. Through extensive experiments on DTU
dataset and Tanks and Temples dataset, we show that our proposed network with
Correlation cost volume and Inverse DEpth Regression (CIDER), achieves
state-of-the-art results, demonstrating its superior performance on scalability
and accuracy.
Related papers
- NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth
Supervision for Indoor Multi-View 3D Detection [72.0098999512727]
NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by utilizing NeRF to enhance representation learning.
We present three corresponding solutions, including semantic enhancement, perspective-aware sampling, and ordinal depth supervision.
The resulting algorithm, NeRF-Det++, has exhibited appealing performance in the ScanNetV2 and AR KITScenes datasets.
arXiv Detail & Related papers (2024-02-22T11:48:06Z) - AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously-infeasible geometric augmentations for unsupervised depth computation and estimation.
This is achieved by reversing, or undo''-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
arXiv Detail & Related papers (2023-10-15T05:15:45Z) - Non-parametric Depth Distribution Modelling based Depth Inference for
Multi-view Stereo [43.415242967722804]
Recent cost volume pyramid based deep neural networks have unlocked the potential of efficiently leveraging high-resolution images for depth inference from multi-view stereo.
In general, those approaches assume that the depth of each pixel follows a unimodal distribution.
We propose constructing the cost volume by non-parametric depth distribution modeling to handle pixels with unimodal and multi-modal distributions.
arXiv Detail & Related papers (2022-05-08T05:13:04Z) - Depth Refinement for Improved Stereo Reconstruction [13.941756438712382]
Current techniques for depth estimation from stereoscopic images still suffer from a built-in drawback.
A simple analysis reveals that the depth error is quadratically proportional to the object's distance.
We propose a simple but effective method that uses a refinement network for depth estimation.
arXiv Detail & Related papers (2021-12-15T12:21:08Z) - Multi-View Stereo Network with attention thin volume [0.0]
We propose an efficient multi-view stereo (MVS) network for infering depth value from multiple RGB images.
We introduce the self-attention mechanism to fully aggregate the dominant information from input images.
We also introduce the group-wise correlation to feature aggregation, which greatly reduces the memory and calculation burden.
arXiv Detail & Related papers (2021-10-16T11:51:23Z) - Non-local Recurrent Regularization Networks for Multi-view Stereo [108.17325696835542]
In deep multi-view stereo networks, cost regularization is crucial to achieve accurate depth estimation.
We propose a novel non-local recurrent regularization network for multi-view stereo, named NR2-Net.
Our method achieves state-of-the-art reconstruction results on both DTU and Tanks and Temples datasets.
arXiv Detail & Related papers (2021-10-13T01:43:54Z) - Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume
Excitation [65.83008812026635]
We construct Guided Cost volume Excitation (GCE) and show that simple channel excitation of cost volume guided by image can improve performance considerably.
We present an end-to-end network that we call Correlate-and-Excite (CoEx)
arXiv Detail & Related papers (2021-08-12T14:32:26Z) - CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching [27.313740022587442]
We propose CFNet, a Cascade and Fused cost volume based network to improve the robustness of the stereo matching network.
We employ a variance-based uncertainty estimation to adaptively adjust the next stage disparity search space.
Our proposed method achieves the state-of-the-art overall performance and obtains the 1st place on the stereo task of Robust Vision Challenge 2020.
arXiv Detail & Related papers (2021-04-09T11:38:59Z) - Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z) - Direct Depth Learning Network for Stereo Matching [79.3665881702387]
A novel Direct Depth Learning Network (DDL-Net) is designed for stereo matching.
DDL-Net consists of two stages: the Coarse Depth Estimation stage and the Adaptive-Grained Depth Refinement stage.
We show that DDL-Net achieves an average improvement of 25% on the SceneFlow dataset and $12%$ on the DrivingStereo dataset.
arXiv Detail & Related papers (2020-12-10T10:33:57Z) - Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for
3D Reconstruction [12.728154351588053]
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multiview images.
We introduce a coarseto-fine depth inference strategy to achieve high resolution depth.
arXiv Detail & Related papers (2020-11-25T13:34:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.