Monocular Visual-Inertial Depth Estimation
- URL: http://arxiv.org/abs/2303.12134v1
- Date: Tue, 21 Mar 2023 18:47:34 GMT
- Title: Monocular Visual-Inertial Depth Estimation
- Authors: Diana Wofk, René Ranftl, Matthias Müller, and Vladlen Koltun
- Abstract summary: We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry.
Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment.
We evaluate on the TartanAir and VOID datasets, observing up to a 30% reduction in inverse RMSE with dense scale alignment.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a visual-inertial depth estimation pipeline that integrates
monocular depth estimation and visual-inertial odometry to produce dense depth
estimates with metric scale. Our approach performs global scale and shift
alignment against sparse metric depth, followed by learning-based dense
alignment. We evaluate on the TartanAir and VOID datasets, observing up to a
30% reduction in inverse RMSE with dense scale alignment relative to global
alignment alone. Our approach is especially competitive at low density: with
just 150 sparse metric depth points, our dense-to-dense depth alignment method
achieves over 50% lower iRMSE than sparse-to-dense depth completion with
KBNet, currently the state of the art on VOID. We demonstrate
successful zero-shot transfer from synthetic TartanAir to real-world VOID data
and perform generalization tests on NYUv2 and VCU-RVI. Our approach is modular
and is compatible with a variety of monocular depth estimation models.
Video: https://youtu.be/IMwiKwSpshQ
Code: https://github.com/isl-org/VI-Depth
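A minimal sketch of the global alignment stage and the iRMSE metric described above, assuming a MiDaS-style network that predicts affine-invariant inverse depth and sparse metric depth triangulated by VIO; the closed-form least-squares fit below is illustrative, not necessarily the authors' exact implementation.

```python
import numpy as np

def align_scale_shift(pred_invdepth, sparse_depth):
    """Fit a scale s and shift t so that s * pred + t matches metric
    inverse depth at the sparse points (linear least squares)."""
    mask = sparse_depth > 0                     # pixels with a sparse VIO depth point
    x = pred_invdepth[mask]                     # up-to-affine inverse depth
    y = 1.0 / sparse_depth[mask]                # metric inverse depth targets
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * pred_invdepth + t                # globally aligned inverse depth

def irmse(pred_invdepth, gt_depth):
    """RMSE in inverse depth space (iRMSE), the metric quoted above."""
    valid = gt_depth > 0
    err = pred_invdepth[valid] - 1.0 / gt_depth[valid]
    return np.sqrt(np.mean(err ** 2))
```

In the full pipeline, a learned dense alignment stage then refines this single global (s, t) fit with spatially varying scale; per the abstract, that refinement is where the reported reduction in inverse RMSE over global alignment alone comes from.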
Related papers
- MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation (arXiv, 2024-11-16)
  MetricGold harnesses the rich priors of generative diffusion models to improve metric depth estimation.
  Experiments demonstrate robust generalization across diverse datasets, producing sharper and higher-quality metric depth estimates.
- ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation (arXiv, 2024-07-11)
  ScaleDepth decomposes metric depth into scene scale and relative depth, predicting them through a semantic-aware scale prediction module.
  The method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework; a sketch of this scale/relative-depth decomposition appears after this list.
- Depth-aware Volume Attention for Texture-less Stereo Matching (arXiv, 2024-02-14)
  A lightweight volume refinement scheme tackles texture deterioration in practical outdoor scenarios.
  A depth volume supervised by the ground-truth depth map captures the relative hierarchy of image texture; local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
- Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots (arXiv, 2023-10-25)
  A deep learning model fuses sparse depth measurements from triangulated features to improve depth predictions.
  The network is trained in a supervised fashion on FLSea, a forward-looking underwater dataset, and achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
- DesNet: Decomposed Scale-Consistent Network for Unsupervised Depth Completion (arXiv, 2022-11-20)
  Unsupervised depth completion aims to recover dense depth from sparse measurements without ground-truth annotation.
  The decomposed scale-consistent learning (DSCL) strategy splits absolute depth into relative depth prediction and global scale estimation, achieving state-of-the-art performance on the indoor NYUv2 dataset.
- Improving Monocular Visual Odometry Using Learned Depth (arXiv, 2022-04-04)
  A framework exploits monocular depth estimation to improve visual odometry (VO).
  Its core is a monocular depth estimation module that generalizes well to diverse scenes, giving stronger generalization than current learning-based VO methods.
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning (arXiv, 2022-03-10)
  A self-supervised learning method enables pre-trained supervised monocular depth networks to produce metrically scaled depth estimates.
  The approach is useful for applications such as mobile robot navigation and is applicable to diverse environments.
- CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth (arXiv, 2020-12-18)
  CodeVIO tightly couples a lightweight deep depth network with a visual-inertial odometry system, feeding previously marginalized sparse features from VIO to the network to increase the accuracy of initial depth prediction.
  The system runs in real time with single-thread execution, using GPU acceleration only for the network and the code Jacobian.
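Several of the related papers (ScaleDepth, DesNet, SelfTune) rely on the same factorization: a network predicts scale-free relative depth, and a single global scale is recovered separately. Below is a minimal sketch of that idea using a median-ratio estimator against a few metric anchor points; the estimator and variable names are illustrative assumptions, not the method of any specific paper above.

```python
import numpy as np

def recover_global_scale(rel_depth, anchor_depth, mask):
    """Estimate one global scale s so that s * rel_depth matches the
    metric anchor measurements; the median of per-point ratios is a
    robust common heuristic (not any specific paper's estimator)."""
    ratios = anchor_depth[mask] / rel_depth[mask]
    return np.median(ratios)

# Usage: rel_depth is a scale-free prediction from a monocular network;
# anchor_depth holds metric depths at a few pixels, mask marks them valid.
rel_depth = np.random.rand(480, 640) + 0.5       # placeholder prediction
anchor_depth = np.zeros_like(rel_depth)
mask = np.zeros_like(rel_depth, dtype=bool)
mask[100, 100] = mask[200, 300] = True
anchor_depth[mask] = [2.5, 4.0]                  # metric depths in meters

metric_depth = recover_global_scale(rel_depth, anchor_depth, mask) * rel_depth
```

Against this single-scale baseline, the dense alignment of the main paper additionally corrects spatially varying scale error, which is why it helps most when the monocular prediction's error is not uniform across the image.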
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.