A Benchmark and a Baseline for Robust Multi-view Depth Estimation
- URL: http://arxiv.org/abs/2209.06681v1
- Date: Tue, 13 Sep 2022 17:44:16 GMT
- Title: A Benchmark and a Baseline for Robust Multi-view Depth Estimation
- Authors: Philipp Schröppel, Jan Bechtold, Artemij Amiranashvili, and Thomas Brox
- Abstract summary: Recent deep learning approaches for multi-view depth estimation are employed either in a depth-from-video or a multi-view stereo setting.
We introduce the Robust Multi-View Depth Benchmark that is built upon a set of public datasets.
We show that recent approaches do not generalize across datasets in this setting.
We present the Robust MVD Baseline model for multi-view depth estimation, which is built upon existing components but employs a novel scale augmentation procedure.
- Score: 36.02034260946296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep learning approaches for multi-view depth estimation are employed
either in a depth-from-video or a multi-view stereo setting. Despite different
settings, these approaches are technically similar: they correlate multiple
source views with a keyview to estimate a depth map for the keyview. In this
work, we introduce the Robust Multi-View Depth Benchmark that is built upon a
set of public datasets and allows evaluation in both settings on data from
different domains. We evaluate recent approaches and find imbalanced
performances across domains. Further, we consider a third setting, where camera
poses are available and the objective is to estimate the corresponding depth
maps with their correct scale. We show that recent approaches do not generalize
across datasets in this setting. This is because their cost volume output runs
out of distribution. To resolve this, we present the Robust MVD Baseline model
for multi-view depth estimation, which is built upon existing components but
employs a novel scale augmentation procedure. It can be applied for robust
multi-view depth estimation, independent of the target data. We provide code
for the proposed benchmark and baseline model at
https://github.com/lmb-freiburg/robustmvd.
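The abstract names a "scale augmentation procedure" but does not specify it; as a rough, hypothetical sketch (the function name, signature, and log-uniform sampling range are assumptions, not the paper's actual method), one way to vary absolute scale while keeping relative geometry intact is to rescale the ground-truth depth and the camera translations by a common random factor:

```python
import numpy as np

def scale_augment(depth, poses, rng, s_range=(0.1, 10.0)):
    """Hypothetical scale augmentation: rescale GT depth and camera
    translations by one random factor so the network sees varied
    absolute scales while the relative geometry is unchanged."""
    # Sample the factor log-uniformly so small and large scales are
    # equally likely (an assumption, not taken from the paper).
    s = np.exp(rng.uniform(np.log(s_range[0]), np.log(s_range[1])))
    depth_aug = depth * s
    poses_aug = []
    for pose in poses:  # 4x4 camera pose matrices
        p = pose.copy()
        p[:3, 3] *= s  # translation scales with the scene; rotation does not
        poses_aug.append(p)
    return depth_aug, poses_aug, s
```

Because depth and translations are scaled together, the reprojection geometry stays consistent, which is what lets a model trained this way handle targets of arbitrary absolute scale.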
Related papers
- ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation [62.600382533322325]
We propose a novel monocular depth estimation method called ScaleDepth.
Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction module.
Our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework.
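The decomposition ScaleDepth describes can be illustrated with a toy sketch (a hypothetical helper, not the paper's actual modules): metric depth is recovered as a scalar scene scale times a relative depth map normalized to [0, 1]:

```python
import numpy as np

def compose_metric_depth(relative_depth, scene_scale):
    """Toy illustration of the ScaleDepth idea: metric depth as the
    product of a predicted scene scale (scalar) and a relative depth
    map normalized to [0, 1]."""
    rel = (relative_depth - relative_depth.min()) / (
        relative_depth.max() - relative_depth.min() + 1e-8)
    return scene_scale * rel
```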
arXiv Detail & Related papers (2024-07-11T05:11:56Z)
- ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval [19.28042366225802]
Multi-View Stereo(MVS) is a fundamental problem in geometric computer vision.
We present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval.
Our model achieves state-of-the-art performance and yields competitive generalization ability.
arXiv Detail & Related papers (2023-08-17T14:52:11Z)
- FS-Depth: Focal-and-Scale Depth Estimation from a Single Image in Unseen Indoor Scene [57.26600120397529]
It has long been an ill-posed problem to predict absolute depth maps from single images in real (unseen) indoor scenes.
We develop a focal-and-scale depth estimation model to well learn absolute depth maps from single images in unseen indoor scenes.
arXiv Detail & Related papers (2023-07-27T04:49:36Z)
- NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation [58.21817572577012]
Video depth estimation aims to infer temporally consistent depth.
We introduce NVDS+ that stabilizes inconsistent depth estimated by various single-image models in a plug-and-play manner.
We also elaborate a large-scale Video Depth in the Wild dataset, which contains 14,203 videos with over two million frames.
arXiv Detail & Related papers (2023-07-17T17:57:01Z)
- Monocular Visual-Inertial Depth Estimation [66.71452943981558]
We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry.
Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment.
We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in RMSE with dense scale alignment.
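The "global scale and shift alignment against sparse metric depth" mentioned above is, in its simplest form, a least-squares fit; a minimal sketch (assuming a closed-form fit over valid pixels, not the paper's exact pipeline):

```python
import numpy as np

def align_scale_shift(pred, sparse_metric, mask):
    """Hypothetical global alignment: fit scale s and shift t by least
    squares so that s * pred + t best matches the sparse metric depth
    at the valid pixels indicated by mask."""
    x = pred[mask].ravel()
    y = sparse_metric[mask].ravel()
    # Solve min_{s,t} ||s*x + t - y||^2 via the normal equations.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * pred + t, s, t
```

The learning-based dense alignment the abstract mentions would then refine this globally aligned prediction per pixel.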
arXiv Detail & Related papers (2023-03-21T18:47:34Z)
- Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation [75.99435808648784]
We propose a novel multi-camera collaborative depth prediction method.
It does not require large overlapping areas while maintaining structure consistency between cameras.
Experimental results on DDAD and NuScenes datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2022-10-05T03:44:34Z)
- Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry [25.003116148843525]
We propose MaGNet, a framework for fusing single-view depth probability with multi-view geometry.
MaGNet achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI.
arXiv Detail & Related papers (2021-12-15T14:56:53Z)
- Does it work outside this benchmark? Introducing the Rigid Depth Constructor tool, depth validation dataset construction in rigid scenes for the masses [1.294486861344922]
We present a protocol to construct your own depth validation dataset for navigation.
RDC, short for Rigid Depth Constructor, aims to be more accessible and cheaper than existing techniques.
We also develop a test suite to get insightful information from the evaluated algorithm.
arXiv Detail & Related papers (2021-03-29T22:01:24Z)
- Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction [12.728154351588053]
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multi-view images.
We introduce a coarse-to-fine depth inference strategy to achieve high-resolution depth.
arXiv Detail & Related papers (2020-11-25T13:34:11Z)
- Don't Forget The Past: Recurrent Depth Estimation from Monocular Video [92.84498980104424]
We put three different types of depth estimation into a common framework.
Our method produces a time series of depth maps.
It can be applied to monocular videos only or be combined with different types of sparse depth patterns.
arXiv Detail & Related papers (2020-01-08T16:50:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.