Non-local Recurrent Regularization Networks for Multi-view Stereo
- URL: http://arxiv.org/abs/2110.06436v1
- Date: Wed, 13 Oct 2021 01:43:54 GMT
- Title: Non-local Recurrent Regularization Networks for Multi-view Stereo
- Authors: Qingshan Xu, Martin R. Oswald, Wenbing Tao, Marc Pollefeys, Zhaopeng
Cui
- Abstract summary: In deep multi-view stereo networks, cost regularization is crucial for achieving accurate depth estimation.
We propose a novel non-local recurrent regularization network for multi-view stereo, named NR2-Net.
Our method achieves state-of-the-art reconstruction results on both DTU and Tanks and Temples datasets.
- Score: 108.17325696835542
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In deep multi-view stereo networks, cost regularization is crucial for achieving accurate depth estimation. Since 3D cost volume filtering is usually memory-consuming, recurrent 2D cost map regularization has recently become popular and has shown great potential in reconstructing 3D models of different scales. However, existing recurrent methods only model local dependencies in the depth domain, which greatly limits the capability of capturing the global scene context along the depth dimension. To tackle this limitation, we propose a novel non-local recurrent regularization network for multi-view stereo, named NR2-Net. Specifically, we design a depth attention module to capture non-local depth interactions within a sliding depth block. Then, the global scene context between different blocks is modeled in a gated recurrent manner. In this way, long-range dependencies along the depth dimension are captured to facilitate cost regularization. Moreover, we design a dynamic depth map fusion strategy to improve the robustness of the algorithm. Our method achieves state-of-the-art reconstruction results on both the DTU and Tanks and Temples datasets.
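The paper's exact architecture is not reproduced here, but the abstract describes the mechanism clearly enough for a rough illustration. Below is a minimal PyTorch sketch under stated assumptions: the channel count, block size, per-pixel GRU cell, and all module names are hypothetical choices, not the paper's. Cost features are grouped into sliding depth blocks, attention models non-local interactions among the hypotheses inside a block, and a gated recurrent state carries scene context from block to block.

```python
import torch
import torch.nn as nn

class DepthBlockAttention(nn.Module):
    """Non-local attention over the depth hypotheses inside one sliding depth block.
    Each pixel attends across its own D_block cost features (illustrative design)."""
    def __init__(self, channels=32, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, block):                              # block: [B, D_block, C, H, W]
        b, d, c, h, w = block.shape
        # flatten pixels into the batch so attention runs along the depth axis only
        x = block.permute(0, 3, 4, 1, 2).reshape(b * h * w, d, c)
        attn_out, _ = self.attn(x, x, x)
        x = self.norm(x + attn_out)
        return x.reshape(b, h, w, d, c).permute(0, 3, 4, 1, 2)

class NonLocalRecurrentRegularizer(nn.Module):
    """Sketch of block-wise cost regularization with a gated recurrent state that
    propagates global scene context between successive depth blocks."""
    def __init__(self, channels=32, block_size=8):
        super().__init__()
        self.block_size = block_size
        self.block_attn = DepthBlockAttention(channels)
        self.gru = nn.GRUCell(channels, channels)          # per-pixel gated update (simplification)
        self.to_prob = nn.Conv2d(channels, 1, 1)

    def forward(self, cost_maps):                          # cost_maps: [B, D, C, H, W]
        b, d, c, h, w = cost_maps.shape
        state = cost_maps.new_zeros(b * h * w, c)
        regularized = []
        for start in range(0, d, self.block_size):
            block = cost_maps[:, start:start + self.block_size]
            block = self.block_attn(block)                 # non-local interactions within the block
            for i in range(block.shape[1]):
                step = block[:, i].permute(0, 2, 3, 1).reshape(b * h * w, c)
                state = self.gru(step, state)              # gated recurrence across depth
                out = state.reshape(b, h, w, c).permute(0, 3, 1, 2)
                regularized.append(self.to_prob(out))      # [B, 1, H, W] per hypothesis
        return torch.softmax(torch.cat(regularized, dim=1), dim=1)  # depth probability volume

# usage sketch with hypothetical sizes: 32 depth hypotheses, 16x20 cost maps
reg = NonLocalRecurrentRegularizer(channels=32, block_size=8)
prob = reg(torch.randn(1, 32, 32, 16, 20))                 # -> [1, 32, 16, 20]
```

Running attention only inside a sliding block while letting the recurrent state travel across blocks is what keeps the memory footprint close to that of ordinary recurrent 2D regularization while still exposing long-range depth context; the per-pixel GRUCell above stands in for whatever gated update the paper actually uses.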
Related papers
- Scale Propagation Network for Generalizable Depth Completion [16.733495588009184]
We propose a novel scale propagation normalization (SP-Norm) method to propagate scales from input to output.
We also develop a new network architecture based on SP-Norm and the ConvNeXt V2 backbone.
Our model consistently achieves the best accuracy with faster speed and lower memory when compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-10-24T03:53:06Z)
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) the local computation of depth maps with a deep MVS technique, and 2) the fusion of the depth maps and image features into a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
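Neither the deep MVS stage nor the PosedConv kernel is detailed here; as a reference point for the second step, the sketch below shows generic weighted-average fusion of posed depth maps into a single TSDF volume. The function name, parameters, and pinhole projection model are illustrative assumptions, and VolumeFusion additionally fuses learned image features, which is omitted.

```python
import numpy as np

def fuse_depth_maps_into_tsdf(depth_maps, intrinsics, cam_to_world,
                              origin, voxel_size, dims, trunc=0.04):
    """Weighted-average TSDF fusion of posed depth maps (generic sketch only).

    depth_maps: list of [H, W] arrays; intrinsics: list of 3x3 K matrices;
    cam_to_world: list of 4x4 poses; origin: [3] world corner of the volume;
    dims: (X, Y, Z) voxel counts. All names are illustrative."""
    grid = np.stack(np.meshgrid(*[np.arange(d) for d in dims], indexing="ij"), axis=-1)
    points = (np.asarray(origin) + (grid + 0.5) * voxel_size).reshape(-1, 3)  # voxel centers
    tsdf = np.ones(points.shape[0], dtype=np.float32)
    weight = np.zeros(points.shape[0], dtype=np.float32)

    for depth, K, pose in zip(depth_maps, intrinsics, cam_to_world):
        h, w = depth.shape
        w2c = np.linalg.inv(pose)
        cam = points @ w2c[:3, :3].T + w2c[:3, 3]            # voxel centers in camera frame
        z = cam[:, 2]
        pix = cam @ K.T
        u = np.round(pix[:, 0] / np.clip(z, 1e-6, None)).astype(int)
        v = np.round(pix[:, 1] / np.clip(z, 1e-6, None)).astype(int)
        valid = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        d = np.zeros_like(z)
        d[valid] = depth[v[valid], u[valid]]
        sdf = d - z                                          # distance to the observed surface
        valid &= (d > 0) & (sdf > -trunc)                    # drop voxels far behind the surface
        obs = np.clip(sdf / trunc, -1.0, 1.0)
        tsdf[valid] = (weight[valid] * tsdf[valid] + obs[valid]) / (weight[valid] + 1.0)
        weight[valid] += 1.0
    return tsdf.reshape(dims), weight.reshape(dims)
```

A marching-cubes pass over the fused TSDF would then yield the final surface; the learned feature fusion and rotation-invariant matching that distinguish VolumeFusion are outside the scope of this sketch.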
arXiv Detail & Related papers (2021-08-19T11:33:58Z)
- SRH-Net: Stacked Recurrent Hourglass Network for Stereo Matching [33.66537830990198]
We decouple the 4D cubic cost volume used by 3D convolutional filters into sequential cost maps along the direction of disparity.
A novel recurrent module, Stacked Recurrent Hourglass (SRH), is proposed to process each cost map.
The proposed architecture is implemented in an end-to-end pipeline and evaluated on public datasets.
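The Stacked Recurrent Hourglass module itself is not described in enough detail here to reproduce; the sketch below only illustrates the decoupling idea, with a plain convolutional GRU standing in for the SRH block and all shapes and names chosen for illustration.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Simple convolutional GRU used as a stand-in for the Stacked Recurrent
    Hourglass module (the real SRH block is more elaborate)."""
    def __init__(self, channels):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 2 * channels, 3, padding=1)
        self.cand = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

def regularize_decoupled_cost_volume(cost_volume, cell, head):
    """cost_volume: [B, C, D, H, W] -> regularized per-disparity cost maps [B, D, H, W].

    Instead of filtering the full 4D volume with 3D convolutions, the volume is
    sliced into D sequential 2D cost maps and processed recurrently along disparity."""
    b, c, d, h, w = cost_volume.shape
    state = cost_volume.new_zeros(b, c, h, w)
    outputs = []
    for i in range(d):
        state = cell(cost_volume[:, :, i], state)     # recurrent 2D regularization
        outputs.append(head(state))                   # head: Conv2d(c, 1, 1) -> [B, 1, H, W]
    return torch.cat(outputs, dim=1)                  # [B, D, H, W]

# usage sketch with hypothetical sizes: 16 disparities, 32x32 cost maps
cell, head = ConvGRUCell(8), nn.Conv2d(8, 1, 1)
costs = torch.randn(1, 8, 16, 32, 32)
print(regularize_decoupled_cost_volume(costs, cell, head).shape)  # torch.Size([1, 16, 32, 32])
```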
arXiv Detail & Related papers (2021-05-25T00:10:56Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction [12.728154351588053]
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multi-view images.
We introduce a coarse-to-fine depth inference strategy to achieve high-resolution depth.
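The paper's attention-aware cost volume pyramid is not reproduced here; the following is a minimal sketch of a generic coarse-to-fine depth inference step of the kind described, in which the coarse depth map is upsampled and a narrow band of hypotheses is sampled around it at the finer level. The band width, sample count, and function name are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def coarse_to_fine_hypotheses(coarse_depth, num_samples=8, band=0.1):
    """Upsample the coarse depth map and sample a narrow band of depth hypotheses
    around it for the next, finer pyramid level.

    coarse_depth: [B, 1, H, W]; band is the half-width of the search range as a
    fraction of the coarse depth (illustrative values, not the paper's).
    Returns hypotheses of shape [B, num_samples, 2H, 2W]."""
    up = F.interpolate(coarse_depth, scale_factor=2, mode="bilinear", align_corners=False)
    offsets = torch.linspace(-1.0, 1.0, num_samples, device=up.device).view(1, -1, 1, 1)
    return up * (1.0 + band * offsets)   # per-pixel hypotheses centered on the coarse estimate

# usage: a multi-level pyramid would repeat this, building a thin cost volume per level
coarse = torch.rand(1, 1, 60, 80) * 5 + 1     # hypothetical coarse depths in [1, 6]
hyps = coarse_to_fine_hypotheses(coarse)      # [1, 8, 120, 160]
```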
arXiv Detail & Related papers (2020-11-25T13:34:11Z)
- Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking [54.58791377183574]
Our novel hybrid recurrent multi-view stereo net consists of two core modules: 1) a light DRENet (Dense Reception Expanded) module to extract dense feature maps of original size with multi-scale context information, and 2) a HU-LSTM (Hybrid U-LSTM) to regularize the 3D matching volume into a predicted depth map.
Our method exhibits performance competitive with the state of the art while dramatically reducing memory consumption, requiring only 19.4% of the memory of R-MVSNet.
arXiv Detail & Related papers (2020-07-21T14:59:59Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multi-view stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.