An Online Adaptation Method for Robust Depth Estimation and Visual Odometry in the Open World
- URL: http://arxiv.org/abs/2504.11698v1
- Date: Wed, 16 Apr 2025 01:48:10 GMT
- Title: An Online Adaptation Method for Robust Depth Estimation and Visual Odometry in the Open World
- Authors: Xingwu Ji, Haochen Niu, Dexin Duan, Rendong Ying, Fei Wen, Peilin Liu
- Abstract summary: We develop a visual odometry system that can adapt to diverse novel environments in an online manner. We construct an objective for self-supervised learning of the depth estimation module based on the output of the visual odometry system. We demonstrate the robustness and generalization capability of the proposed method in comparison with state-of-the-art learning-based approaches on urban and in-house datasets and on a robot platform.
- Score: 16.387434563802532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, learning-based robotic navigation systems have gained extensive research attention and made significant progress. However, the diversity of open-world scenarios poses a major challenge for the generalization of such systems to practical deployments. Specifically, learned systems for scene measurement and state estimation tend to degrade when the application scenarios deviate from the training data, resulting in unreliable depth and pose estimation. Toward addressing this problem, this work aims to develop a visual odometry system that can quickly adapt to diverse novel environments in an online manner. To this end, we construct a self-supervised online adaptation framework for monocular visual odometry aided by an online-updated depth estimation module. Firstly, we design a monocular depth estimation network with lightweight refiner modules, enabling efficient online adaptation. Then, we construct an objective for self-supervised learning of the depth estimation module based on the output of the visual odometry system and the contextual semantic information of the scene. Specifically, a sparse depth densification module and a dynamic consistency enhancement module are proposed to leverage camera poses and contextual semantics to generate pseudo-depths and valid masks for the online adaptation. Finally, we demonstrate the robustness and generalization capability of the proposed method in comparison with state-of-the-art learning-based approaches on urban and in-house datasets and on a robot platform. Code is publicly available at: https://github.com/jixingwu/SOL-SLAM.
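To make the pipeline concrete, here is a minimal sketch of what one online adaptation step could look like. This is our own illustration, not code from the SOL-SLAM repository: `densify` is a crude stand-in for the paper's sparse depth densification module, `static_mask` stands in for the output of the dynamic consistency enhancement module, and the actual objective also incorporates contextual semantics that are omitted here.

```python
import torch
import torch.nn.functional as F

def densify(sparse_depth, kernel=9):
    # Crude stand-in for the sparse depth densification module: spread the
    # sparse VO depths with a local average so that every pixel near a
    # triangulated landmark receives a pseudo-depth.
    # sparse_depth: (1, 1, H, W), zeros where no VO landmark projects.
    mask = (sparse_depth > 0).float()
    k = torch.ones(1, 1, kernel, kernel, device=sparse_depth.device)
    num = F.conv2d(sparse_depth * mask, k, padding=kernel // 2)
    den = F.conv2d(mask, k, padding=kernel // 2)
    pseudo = num / den.clamp(min=1e-6)
    valid = (den > 0).float()  # pixels supported by at least one VO point
    return pseudo, valid

def adaptation_step(depth_net, optimizer, image, sparse_depth, static_mask):
    # One self-supervised update on the current frame. `optimizer` should be
    # built over the lightweight refiner parameters only, so the backbone
    # stays frozen and the update remains cheap enough to run online.
    pseudo, valid = densify(sparse_depth)
    valid = valid * static_mask  # stand-in for the dynamic consistency mask
    pred = depth_net(image)      # (1, 1, H, W) dense depth prediction
    loss = (valid * (pred - pseudo).abs()).sum() / valid.sum().clamp(min=1.0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```

The key design point the sketch tries to capture is the division of labor: classical VO supplies sparse but reliable geometry, while the online update only touches a small set of refiner parameters, which is what keeps adaptation fast in novel scenes.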
Related papers
- Multi-view Reconstruction via SfM-guided Monocular Depth Estimation [92.89227629434316]
We present a new method for multi-view geometric reconstruction.
We incorporate SfM information, a strong multi-view prior, into the depth estimation process.
Our method significantly improves the quality of depth estimation compared to previous monocular depth estimation works.
arXiv Detail & Related papers (2025-03-18T17:54:06Z)
- ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation [62.600382533322325]
We propose a novel monocular depth estimation method called ScaleDepth.
Our method decomposes metric depth into scene scale and relative depth (see the equation after this entry), and predicts them through a semantic-aware scale prediction module.
Our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework.
arXiv Detail & Related papers (2024-07-11T05:11:56Z)
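Written as an equation, the decomposition in the ScaleDepth entry above is simply (our shorthand, not notation taken from the paper):

$$ D_{\text{metric}} = s \cdot D_{\text{rel}} $$

where $s$ is the scene scale produced by the semantic-aware scale prediction module and $D_{\text{rel}}$ is the normalized relative depth map; predicting the two quantities separately is what lets a single framework handle both indoor and outdoor scenes.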
- Texture-guided Saliency Distilling for Unsupervised Salient Object Detection [67.10779270290305]
We propose a novel USOD method to mine rich and accurate saliency knowledge from both easy and hard samples.
Our method achieves state-of-the-art USOD performance on RGB, RGB-D, RGB-T, and video SOD benchmarks.
arXiv Detail & Related papers (2022-07-13T02:01:07Z)
- Improving Monocular Visual Odometry Using Learned Depth [84.05081552443693]
We propose a framework to exploit monocular depth estimation for improving visual odometry (VO).
The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.
Compared with current learning-based VO methods, our method demonstrates a stronger generalization ability to diverse scenes.
arXiv Detail & Related papers (2022-04-04T06:26:46Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method for pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- Robust Visual Odometry Using Position-Aware Flow and Geometric Bundle Adjustment [16.04240592057438]
A novel optical flow network (PANet) built on a position-aware mechanism is proposed first.
Then, a novel system is proposed that jointly estimates depth, optical flow, and ego-motion without a dedicated network for learning ego-motion.
Experiments show that the proposed system outperforms other state-of-the-art methods in terms of depth, flow, and VO estimation.
arXiv Detail & Related papers (2021-11-22T12:05:27Z)
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
- Generalizing to the Open World: Deep Visual Odometry with Online Adaptation [27.22639812204019]
We propose an online adaptation framework for deep VO with the assistance of scene-agnostic geometric computations and Bayesian inference.
Our method achieves state-of-the-art generalization ability among self-supervised VO methods.
arXiv Detail & Related papers (2021-03-29T02:13:56Z)
- Continual Adaptation for Deep Stereo [52.181067640300014]
We propose a continual adaptation paradigm for deep stereo networks designed to deal with challenging and ever-changing environments.
In our paradigm, the learning signals needed to continuously adapt models online can be sourced from self-supervision via right-to-left image warping (see the sketch after this entry) or from traditional stereo algorithms.
Our network architecture and adaptation algorithms realize the first real-time self-adaptive deep stereo system.
arXiv Detail & Related papers (2020-07-10T08:15:58Z)
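As a concrete picture of the right-to-left warping signal mentioned in the stereo entry above, here is a minimal sketch under the usual rectified-stereo assumptions; the function names are ours, and the paper's actual adaptation losses may differ:

```python
import torch
import torch.nn.functional as F

def right_to_left_warp(right_img, disparity):
    # Reconstruct the left view by sampling the right image at x - d,
    # the standard correspondence for a rectified stereo pair.
    # right_img: (B, 3, H, W); disparity: (B, 1, H, W), left-view, in pixels.
    b, _, h, w = right_img.shape
    xs = torch.arange(w, dtype=torch.float32).view(1, 1, w).expand(b, h, w)
    ys = torch.arange(h, dtype=torch.float32).view(1, h, 1).expand(b, h, w)
    grid_x = (xs - disparity.squeeze(1)) / (w - 1) * 2 - 1  # normalize to [-1, 1]
    grid_y = ys / (h - 1) * 2 - 1
    grid = torch.stack((grid_x, grid_y), dim=-1)            # (B, H, W, 2)
    return F.grid_sample(right_img, grid, align_corners=True)

def photometric_adaptation_loss(left_img, right_img, disparity):
    # Self-supervised signal: if the predicted left-view disparity is correct,
    # the warped right image should look like the real left image.
    recon = right_to_left_warp(right_img, disparity)
    return (left_img - recon).abs().mean()
```

A loss of this form needs no ground-truth depth, which is what makes it usable as an online learning signal in ever-changing environments.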
- Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [36.414471128890284]
We tackle the essential problem of scale inconsistency for self-supervised joint depth-pose learning.
Most existing methods assume that a consistent scale of depth and pose can be learned across all input samples.
We propose a novel system that explicitly disentangles scale from the network estimation.
arXiv Detail & Related papers (2020-04-03T00:28:09Z)