Salient Sparse Visual Odometry With Pose-Only Supervision
- URL: http://arxiv.org/abs/2404.04677v1
- Date: Sat, 6 Apr 2024 16:48:08 GMT
- Title: Salient Sparse Visual Odometry With Pose-Only Supervision
- Authors: Siyu Chen, Kangcheng Liu, Chen Wang, Shenghai Yuan, Jianfei Yang, Lihua Xie
- Abstract summary: Visual odometry (VO) is vital for navigation of autonomous systems.
Traditional VO methods struggle with challenges like variable lighting and motion blur.
Deep learning-based VO, though more adaptable, can face generalization problems in new environments.
This paper presents a novel hybrid visual odometry (VO) framework that leverages pose-only supervision.
- Score: 45.450357610621985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Odometry (VO) is vital for the navigation of autonomous systems, providing accurate position and orientation estimates at reasonable costs. While traditional VO methods excel in some conditions, they struggle with challenges like variable lighting and motion blur. Deep learning-based VO, though more adaptable, can face generalization problems in new environments. Addressing these drawbacks, this paper presents a novel hybrid visual odometry (VO) framework that leverages pose-only supervision, offering a balanced solution between robustness and the need for extensive labeling. We propose two cost-effective and innovative designs: a self-supervised homographic pre-training for enhancing optical flow learning from pose-only labels and a random patch-based salient point detection strategy for more accurate optical flow patch extraction. These designs eliminate the need for dense optical flow labels for training and significantly improve the generalization capability of the system in diverse and challenging environments. Our pose-only supervised method achieves competitive performance on standard datasets and greater robustness and generalization ability in extreme and unseen scenarios, even compared to dense optical flow-supervised state-of-the-art methods.
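The self-supervised homographic pre-training described in the abstract rests on a simple observation: warping an image with a known random homography produces a dense, exact optical flow field for free, so a flow network can be pre-trained without any manual flow labels. The sketch below illustrates that observation in NumPy; the perturbation scale and grid conventions are illustrative assumptions, not details from the paper.

```python
import numpy as np

def random_homography(scale=0.05):
    """Sample a small random perturbation of the identity homography.
    The magnitude `scale` is an illustrative choice, not from the paper."""
    H = np.eye(3) + np.random.uniform(-scale, scale, (3, 3))
    return H / H[2, 2]

def homography_flow(H, height, width):
    """Dense optical flow induced by homography H on a pixel grid.
    Each pixel (x, y) maps to H @ (x, y, 1); flow is the displacement."""
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    warped = H @ pts.astype(float)
    warped = warped[:2] / warped[2]          # perspective divide
    flow = (warped - pts[:2]).T.reshape(height, width, 2)
    return flow
```

Pairing the original image with its warp and this flow field gives a fully labeled training sample at zero annotation cost.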
Related papers
- From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning [65.94580484237737]
Low-light enhancement improves image quality for downstream tasks, but existing methods rely on physical or geometric priors. We build a generalized bridge between low-light enhancement and low-light understanding, which we term Generalized Enhancement For Understanding (GEFU). To address the diverse causes of low-light degradation, we leverage pretrained generative diffusion models to optimize images, achieving zero-shot generalization performance.
arXiv Detail & Related papers (2025-07-11T07:51:26Z)
- Adaptive Contextual Embedding for Robust Far-View Borehole Detection [2.206623168926072]
In blasting operations, accurately detecting densely distributed tiny boreholes from far-view imagery is critical for operational safety and efficiency. We propose an adaptive detection approach that builds upon existing architectures (e.g., YOLO) by explicitly leveraging consistent embedding representations derived through exponential moving average (EMA)-based statistical updates. Our method introduces three synergistic components: (1) adaptive augmentation utilizing dynamically updated image statistics to robustly handle illumination and texture variations; (2) embedding stabilization to ensure consistent and reliable feature extraction; and (3) contextual refinement leveraging spatial context for improved detection accuracy.
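The EMA-based statistical update behind the adaptive augmentation can be sketched in a few lines: running image statistics are blended with each new batch's statistics, then used to normalize inputs. This is a minimal illustration of the general EMA mechanism, not the paper's implementation; the momentum value and the choice of per-image mean/std are assumptions.

```python
import numpy as np

class EMAStats:
    """Track running image mean/std with an exponential moving average."""
    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.mean = None
        self.std = None

    def update(self, image):
        m, s = float(image.mean()), float(image.std())
        if self.mean is None:           # first observation initializes the state
            self.mean, self.std = m, s
        else:                           # blend old state with the new statistics
            self.mean = self.momentum * self.mean + (1 - self.momentum) * m
            self.std = self.momentum * self.std + (1 - self.momentum) * s

    def normalize(self, image):
        """Standardize an image using the smoothed statistics."""
        return (image - self.mean) / (self.std + 1e-6)
```

Because the statistics evolve slowly, normalization stays consistent across batches even under abrupt illumination changes.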
arXiv Detail & Related papers (2025-05-08T07:25:42Z)
- BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-modality Refinement Module [11.898515581215708]
Visual odometry (VO) plays a crucial role in autonomous driving, robotic navigation, and other related tasks.
We introduce BrightVO, a novel VO model based on Transformer architecture, which performs front-end visual feature extraction.
Using pose graph optimization, this module iteratively refines pose estimates to reduce errors and improve both accuracy and robustness.
arXiv Detail & Related papers (2025-01-15T08:50:52Z)
- Generalizable Non-Line-of-Sight Imaging with Learnable Physical Priors [52.195637608631955]
Non-line-of-sight (NLOS) imaging has attracted increasing attention due to its potential applications.
Existing NLOS reconstruction approaches are constrained by the reliance on empirical physical priors.
We introduce a novel learning-based solution, comprising two key designs: Learnable Path Compensation (LPC) and Adaptive Phasor Field (APF).
arXiv Detail & Related papers (2024-09-21T04:39:45Z)
- Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior to the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
- Optical Flow for Autonomous Driving: Applications, Challenges and Improvements [0.9023847175654602]
We propose and evaluate training strategies to improve a learning-based optical flow algorithm.
While trained with synthetic data, the model demonstrates strong capabilities to generalize to real world fisheye data.
We propose a novel, generic semi-supervised framework that significantly boosts the performance of existing methods in low light.
arXiv Detail & Related papers (2023-01-11T12:01:42Z)
- Self-Aligned Concave Curve: Illumination Enhancement for Unsupervised Adaptation [36.050270650417325]
We propose a learnable illumination enhancement model for high-level vision.
Inspired by real camera response functions, we assume that the illumination enhancement function should be a concave curve.
Our model architecture and training designs mutually benefit each other, forming a powerful unsupervised normal-to-low light adaptation framework.
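The concave-curve assumption above can be illustrated with a fixed gamma curve: for inputs in [0, 1], x**gamma with gamma < 1 is monotone and concave, mimicking a camera response function that brightens dark regions more than bright ones. The cited paper learns its curve; the fixed gamma here is purely an illustrative stand-in.

```python
import numpy as np

def concave_enhance(image, gamma=0.5):
    """Brighten an image with a concave curve: x**gamma on [0, 1].
    gamma < 1 makes the curve concave and monotonically increasing,
    so dark pixels are lifted more than bright ones."""
    return np.clip(image, 0.0, 1.0) ** gamma
```

A quick check confirms the two properties the paper requires: the curve never darkens a pixel, and its second finite differences are non-positive (concavity).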
arXiv Detail & Related papers (2022-10-07T19:32:55Z)
- Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
- Low-light Image Enhancement by Retinex Based Algorithm Unrolling and Adjustment [50.13230641857892]
We propose a new deep learning framework for the low-light image enhancement (LIE) problem.
The proposed framework contains a decomposition network inspired by algorithm unrolling, and adjustment networks considering both global brightness and local brightness sensitivity.
Experiments on a series of typical LIE datasets demonstrated the effectiveness of the proposed method, both quantitatively and visually, as compared with existing methods.
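The Retinex model underlying this decomposition network splits an image into reflectance and illumination, image = reflectance * illumination. A naive closed-form split, using the per-pixel channel maximum as illumination, gives the flavor of what the learned decomposition network produces; it is a toy stand-in, not the paper's network.

```python
import numpy as np

def retinex_decompose(image, eps=1e-3):
    """Naive Retinex split for an HxWx3 image in [0, 1]:
    illumination = per-pixel max over channels,
    reflectance  = image / illumination (eps avoids division by zero)."""
    illumination = image.max(axis=-1, keepdims=True)
    reflectance = image / (illumination + eps)
    return reflectance, illumination
```

Enhancement methods in this family then brighten the illumination map and recombine it with the (roughly lighting-invariant) reflectance.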
arXiv Detail & Related papers (2022-02-12T03:59:38Z)
- Generalizing to the Open World: Deep Visual Odometry with Online Adaptation [27.22639812204019]
We propose an online adaptation framework for deep VO with the assistance of scene-agnostic geometric computations and Bayesian inference.
Our method achieves state-of-the-art generalization ability among self-supervised VO methods.
arXiv Detail & Related papers (2021-03-29T02:13:56Z)
- Joint Unsupervised Learning of Optical Flow and Egomotion with Bi-Level Optimization [59.9673626329892]
We exploit the global relationship between optical flow and camera motion using epipolar geometry.
We use implicit differentiation to enable back-propagation through the lower-level geometric optimization layer independent of its implementation.
arXiv Detail & Related papers (2020-02-26T22:28:00Z)
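The epipolar relationship exploited above can be stated compactly: a flow correspondence (x1, x2) in normalized camera coordinates that is consistent with camera motion (R, t) satisfies x2^T [t]_x R x1 = 0. The sketch below checks that constraint in NumPy; the specific point and motion values are made up for illustration.

```python
import numpy as np

def essential_from_motion(R, t):
    """Essential matrix E = [t]_x R from rotation R and translation t."""
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])
    return tx @ R

def epipolar_residual(x1, x2, E):
    """Algebraic epipolar residual x2^T E x1 for homogeneous normalized
    points; zero when the correspondence agrees with the camera motion."""
    return float(x2 @ E @ x1)
```

In a joint flow/egomotion pipeline, this residual (or a robust variant of it) is what couples the dense flow field to the global camera motion estimate.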