Improving Monocular Visual Odometry Using Learned Depth
- URL: http://arxiv.org/abs/2204.01268v1
- Date: Mon, 4 Apr 2022 06:26:46 GMT
- Title: Improving Monocular Visual Odometry Using Learned Depth
- Authors: Libo Sun, Wei Yin, Enze Xie, Zhengrong Li, Changming Sun, Chunhua Shen
- Abstract summary: We propose a framework to exploit monocular depth estimation for improving visual odometry (VO).
The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.
Compared with current learning-based VO methods, our method demonstrates a stronger generalization ability to diverse scenes.
- Score: 84.05081552443693
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Monocular visual odometry (VO) is an important task in robotics and computer
vision. Thus far, how to build accurate and robust monocular VO systems that
can work well in diverse scenarios remains largely unsolved. In this paper, we
propose a framework to exploit monocular depth estimation for improving VO. The
core of our framework is a monocular depth estimation module with a strong
generalization capability for diverse scenes. It consists of two separate
working modes that assist localization and dense mapping. With a single
monocular image as input, the depth estimation module predicts relative depth
to help the localization module improve its accuracy. With a sparse depth map and an
RGB image input, the depth estimation module can generate accurate
scale-consistent depth for dense mapping. Compared with current learning-based
VO methods, our method demonstrates a stronger generalization ability to
diverse scenes. More significantly, our framework can boost the performance of
existing geometry-based VO methods by a large margin.
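To make the two working modes concrete, below is a minimal sketch of the scale-and-shift alignment recipe that frameworks of this kind commonly build on: a relative depth prediction is fitted to sparse metric depth (e.g., triangulated VO landmarks) in closed form to obtain scale-consistent dense depth. The `align_scale_shift` helper and the synthetic data are illustrative assumptions, not the authors' actual depth estimation module.

```python
# Illustrative sketch only: aligning a predicted relative depth map to sparse
# metric depth (e.g., triangulated VO landmarks) with a closed-form
# least-squares scale-and-shift fit. This is a common recipe for recovering
# scale-consistent depth from a relative prediction; it is NOT the paper's
# actual depth estimation module.
import numpy as np

def align_scale_shift(rel_depth, sparse_depth):
    """Fit metric = s * rel + t over pixels where sparse depth is valid.

    rel_depth:    (H, W) relative depth predicted from a single image.
    sparse_depth: (H, W) metric depth, 0 where no measurement exists.
    Returns a dense, scale-consistent depth map.
    """
    mask = sparse_depth > 0
    x = rel_depth[mask]
    y = sparse_depth[mask]
    # Solve min_{s,t} ||s * x + t - y||^2 in closed form.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * rel_depth + t

# Example: a synthetic relative map aligned against 100 sparse measurements.
rng = np.random.default_rng(0)
rel = rng.uniform(0.1, 1.0, size=(48, 64))
sparse = np.zeros_like(rel)
idx = rng.choice(rel.size, size=100, replace=False)
sparse.flat[idx] = 5.0 * rel.flat[idx] + 0.3   # ground truth: s=5, t=0.3
dense = align_scale_shift(rel, sparse)
print(np.allclose(dense, 5.0 * rel + 0.3))     # True
```

In practice a robust variant (e.g., RANSAC or a trimmed least-squares fit) would be used, since triangulated landmarks contain outliers.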
Related papers
- ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation [62.600382533322325]
We propose a novel monocular depth estimation method called ScaleDepth.
Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction module.
Our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework.
arXiv Detail & Related papers (2024-07-11T05:11:56Z)
- Self-Supervised Learning based Depth Estimation from Monocular Images [0.0]
The goal of Monocular Depth Estimation is to predict the depth map, given a 2D monocular RGB image as input.
We plan to incorporate intrinsic camera parameters during training and apply weather augmentations to further generalize our model.
arXiv Detail & Related papers (2023-04-14T07:14:08Z)
- Sparse Depth-Guided Attention for Accurate Depth Completion: A Stereo-Assisted Monitored Distillation Approach [7.902840502973506]
We introduce a stereo-based model as a teacher model to improve the accuracy of the student model for depth completion.
To provide self-supervised information, we also employ multi-view depth consistency and multi-scale minimum reprojection.
arXiv Detail & Related papers (2023-03-28T09:23:19Z)
- Monocular Visual-Inertial Depth Estimation [66.71452943981558]
We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry.
Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment.
We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in RMSE with dense scale alignment.
arXiv Detail & Related papers (2023-03-21T18:47:34Z)
- Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning [22.828829870704006]
Self-supervised monocular methods can efficiently learn depth information of weakly textured surfaces or reflective objects.
In contrast, multi-frame depth estimation methods improve the depth accuracy thanks to the success of Multi-View Stereo.
We propose MOVEDepth, which exploits MOnocular cues and VElocity guidance to improve multi-frame depth learning.
arXiv Detail & Related papers (2022-08-19T06:32:06Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method for the pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input the RGB image and its corresponding sparse depth image, and outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- EdgeConv with Attention Module for Monocular Depth Estimation [4.239147046986999]
To generate accurate depth maps, it is important for the model to learn structural information about the scene.
We propose a novel Patch-Wise EdgeConv Module (PEM) and EdgeConv Attention Module (EAM) to solve the difficulty of monocular depth estimation.
Our method is evaluated on two popular datasets, the NYU Depth V2 and the KITTI split, achieving state-of-the-art performance.
arXiv Detail & Related papers (2021-06-16T08:15:20Z)
- Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging [14.279471205248534]
We show how a consistent scene structure and high-frequency details affect depth estimation performance.
We present a double estimation method that improves the whole-image depth estimation and a patch selection method that adds local details.
We demonstrate that by merging estimations at different resolutions with changing context, we can generate multi-megapixel depth maps with a high level of detail.
arXiv Detail & Related papers (2021-05-28T17:55:15Z)
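As a rough illustration of the double-estimation idea in the entry above, the sketch below runs a depth network at two resolutions, aligns the detailed high-resolution estimate to the structurally consistent low-resolution base, and blends the two. `estimate_depth` is a hypothetical placeholder for any monocular depth model, and the fixed blend weight is an assumption; the paper's actual pipeline uses content-adaptive merging and patch selection.

```python
# Rough sketch of the "double estimation" idea from the entry above: a
# low-resolution pass gives a structurally consistent base estimate, a
# high-resolution pass adds fine detail, and the two are merged after a
# least-squares scale-and-shift alignment. `estimate_depth` is a
# hypothetical stand-in for any monocular depth network (assumed to return
# depth resampled to the full image resolution); this is not the paper's
# actual content-adaptive pipeline.
import numpy as np

def estimate_depth(image: np.ndarray, input_size: int) -> np.ndarray:
    """Hypothetical depth network evaluated at a given input size."""
    raise NotImplementedError("plug in a real monocular depth model here")

def double_estimate(image, low=384, high=1024, detail_weight=0.5):
    base = estimate_depth(image, low)    # stable overall scene structure
    fine = estimate_depth(image, high)   # sharper detail, less consistent
    # Align `fine` to `base` so both estimates share one scale and shift.
    A = np.stack([fine.ravel(), np.ones(fine.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, base.ravel(), rcond=None)
    # Blend: keep the base layout, mix in the aligned high-frequency detail.
    return (1 - detail_weight) * base + detail_weight * (s * fine + t)
```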
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.