Multi-View Depth Estimation by Fusing Single-View Depth Probability with
Multi-View Geometry
- URL: http://arxiv.org/abs/2112.08177v1
- Date: Wed, 15 Dec 2021 14:56:53 GMT
- Title: Multi-View Depth Estimation by Fusing Single-View Depth Probability with
Multi-View Geometry
- Authors: Gwangbin Bae, Ignas Budvytis, Roberto Cipolla
- Abstract summary: We propose MaGNet, a framework for fusing single-view depth probability with multi-view geometry.
MaGNet achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI.
- Score: 25.003116148843525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-view depth estimation methods typically require the computation of a
multi-view cost-volume, which leads to huge memory consumption and slow
inference. Furthermore, multi-view matching can fail for texture-less surfaces,
reflective surfaces and moving objects. For such failure modes, single-view
depth estimation methods are often more reliable. To this end, we propose
MaGNet, a novel framework for fusing single-view depth probability with
multi-view geometry, to improve the accuracy, robustness and efficiency of
multi-view depth estimation. For each frame, MaGNet estimates a single-view
depth probability distribution, parameterized as a pixel-wise Gaussian. The
distribution estimated for the reference frame is then used to sample per-pixel
depth candidates. Such probabilistic sampling enables the network to achieve
higher accuracy while evaluating fewer depth candidates. We also propose depth
consistency weighting for the multi-view matching score, to ensure that the
multi-view depth is consistent with the single-view predictions. The proposed
method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI.
Qualitative evaluation demonstrates that our method is more robust against
challenging artifacts such as texture-less/reflective surfaces and moving
objects.
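The two ideas named in the abstract can be illustrated with a short sketch. The snippet below is a minimal NumPy rendition of (1) drawing per-pixel depth candidates from a single-view Gaussian and (2) re-weighting multi-view matching scores by how consistent each candidate is with a single-view prediction. The function names, the evenly spaced quantile sampling, and the Gaussian-pdf weighting are illustrative assumptions, not the exact MaGNet formulation (which, for example, also handles reprojection between views).
```python
# Hypothetical sketch of probabilistic depth-candidate sampling and
# single-view consistency weighting, loosely following the abstract.
import numpy as np
from scipy.stats import norm


def sample_depth_candidates(mu, sigma, num_candidates=5):
    """Draw per-pixel depth candidates at evenly spaced quantiles of N(mu, sigma^2).

    mu, sigma: (H, W) arrays of single-view Gaussian depth parameters.
    Returns:   (H, W, K) array of candidate depths.
    """
    quantiles = (np.arange(num_candidates) + 0.5) / num_candidates  # e.g. 0.1 .. 0.9
    z = norm.ppf(quantiles)                       # standard-normal offsets, shape (K,)
    return mu[..., None] + sigma[..., None] * z   # broadcast to (H, W, K)


def consistency_weighted_scores(matching_scores, candidates, mu, sigma):
    """Down-weight matching scores for candidates that disagree with the
    single-view Gaussian (one plausible form of consistency weighting;
    reprojection between views is ignored here for brevity).

    matching_scores: (H, W, K) raw multi-view matching scores per candidate.
    candidates:      (H, W, K) candidate depths.
    mu, sigma:       (H, W) single-view Gaussian parameters.
    """
    w = norm.pdf(candidates, loc=mu[..., None], scale=sigma[..., None])
    w = w / (w.max(axis=-1, keepdims=True) + 1e-8)  # normalise per pixel
    return matching_scores * w


if __name__ == "__main__":
    H, W = 4, 6
    mu = np.full((H, W), 2.0)      # toy reference-view depth means (metres)
    sigma = np.full((H, W), 0.25)  # toy per-pixel standard deviations
    cands = sample_depth_candidates(mu, sigma)
    scores = np.random.rand(H, W, cands.shape[-1])
    weighted = consistency_weighted_scores(scores, cands, mu, sigma)
    print(cands.shape, weighted.shape)  # (4, 6, 5) (4, 6, 5)
```
Sampling at quantiles of the predicted Gaussian concentrates candidates where the single-view network already places probability mass, which is why, per the abstract, fewer candidates suffice compared with a dense cost-volume sweep.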
Related papers
- Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving [22.58849429006898]
Current multi-view depth estimation methods, as well as single-view and multi-view fusion methods, fail when given noisy camera poses.
We propose a single-view and multi-view fused depth estimation system that adaptively integrates high-confidence multi-view and single-view results.
Our method outperforms state-of-the-art multi-view and fusion methods under robustness testing.
arXiv Detail & Related papers (2024-03-12T11:18:35Z)
- Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss that steers a confidence predictor network to yield a confidence map identifying regions where reliable depth can be recovered.
With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth map in an end-to-end manner.
arXiv Detail & Related papers (2024-02-19T04:39:16Z)
- Single Image Depth Prediction Made Better: A Multivariate Gaussian Take [163.14849753700682]
We introduce an approach that performs continuous modeling of per-pixel depth.
Our method, named MG, ranks among the top entries on the KITTI depth-prediction benchmark leaderboard.
arXiv Detail & Related papers (2023-03-31T16:01:03Z)
- Improving Monocular Visual Odometry Using Learned Depth [84.05081552443693]
We propose a framework that exploits monocular depth estimation to improve visual odometry (VO).
The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.
Compared with current learning-based VO methods, our method demonstrates a stronger generalization ability to diverse scenes.
arXiv Detail & Related papers (2022-04-04T06:26:46Z)
- Uncertainty-Aware Deep Multi-View Photometric Stereo [100.97116470055273]
Photometric stereo (PS) is excellent at recovering high-frequency surface details, whereas multi-view stereo (MVS) can help remove the low-frequency distortion due to PS and retain the global shape.
This paper proposes an approach that can effectively utilize such complementary strengths of PS and MVS.
We estimate per-pixel surface normals and depth using an uncertainty-aware deep-PS network and deep-MVS network, respectively.
arXiv Detail & Related papers (2022-02-26T05:45:52Z)
- A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo [41.527018997251744]
We introduce a deep multi-view stereo (MVS) system that jointly predicts depths, surface normals and per-view confidence maps.
The key to our approach is a novel solver that iteratively solves for the per-view depth and normal maps.
Our proposed solver consistently improves the depth quality over both conventional and deep learning based MVS pipelines.
arXiv Detail & Related papers (2022-01-19T14:08:45Z)
- Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study of this problem and observes that the current monocular 3D detection problem can be simplified to an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z)
- Monocular Depth Parameterizing Networks [15.791732557395552]
We propose a network structure that provides a parameterization of a set of depth maps with feasible shapes.
This allows us to search over these shapes for a photo-consistent solution with respect to other images.
Our experimental evaluation shows that our method generates more accurate depth maps and generalizes better than competing state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-21T13:02:41Z)
- Variational Monocular Depth Estimation for Reliability Prediction [12.951621755732544]
Self-supervised learning for monocular depth estimation is widely investigated as an alternative to the supervised learning approach.
Previous works have successfully improved the accuracy of depth estimation by modifying the model structure.
In this paper, we theoretically formulate a variational model for monocular depth estimation that predicts the reliability of the estimated depth image.
arXiv Detail & Related papers (2020-11-24T06:23:51Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.