U-ViLAR: Uncertainty-Aware Visual Localization for Autonomous Driving via Differentiable Association and Registration
- URL: http://arxiv.org/abs/2507.04503v1
- Date: Sun, 06 Jul 2025 18:40:42 GMT
- Title: U-ViLAR: Uncertainty-Aware Visual Localization for Autonomous Driving via Differentiable Association and Registration
- Authors: Xiaofan Li, Zhihao Xu, Chenming Wu, Zhao Yang, Yumeng Zhang, Jiang-Jiang Liu, Haibao Yu, Fan Duan, Xiaoqing Ye, Yuan Wang, Shirui Li, Xun Sun, Ji Wan, Jun Wang
- Abstract summary: U-ViLAR is a novel uncertainty-aware visual localization framework. It enables adaptive localization using high-definition (HD) maps or navigation maps. Our model has undergone rigorous testing on large-scale autonomous driving fleets.
- Score: 25.74646789843283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate localization using visual information is a critical yet challenging task, especially in urban environments where nearby buildings and construction sites significantly degrade GNSS (Global Navigation Satellite System) signal quality. This issue underscores the importance of visual localization techniques in scenarios where GNSS signals are unreliable. This paper proposes U-ViLAR, a novel uncertainty-aware visual localization framework designed to address these challenges while enabling adaptive localization using high-definition (HD) maps or navigation maps. Specifically, our method first extracts features from the input visual data and maps them into Bird's-Eye-View (BEV) space to enhance spatial consistency with the map input. Subsequently, we introduce: a) Perceptual Uncertainty-guided Association, which mitigates errors caused by perception uncertainty, and b) Localization Uncertainty-guided Registration, which reduces errors introduced by localization uncertainty. By effectively balancing the coarse-grained large-scale localization capability of association with the fine-grained precise localization capability of registration, our approach achieves robust and accurate localization. Experimental results demonstrate that our method achieves state-of-the-art performance across multiple localization tasks. Furthermore, our model has undergone rigorous testing on large-scale autonomous driving fleets and has demonstrated stable performance in various challenging urban scenarios.
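To make the association-plus-registration idea concrete, here is a minimal, hypothetical Python sketch. The function names, the temperature-scaled soft association, and the Gaussian fusion used for registration are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: coarse association over map cells, softened by
# perceptual uncertainty, followed by uncertainty-weighted fine registration.
import numpy as np

def associate(bev_desc, map_feats, perception_sigma):
    """Coarse association: match a BEV descriptor against candidate map cells.
    Higher perceptual uncertainty flattens the match distribution."""
    sims = map_feats @ bev_desc                   # (N,) similarity scores
    logits = sims / max(perception_sigma, 1e-6)   # uncertainty as temperature
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()                    # soft assignment over cells

def register(coarse_xy, measured_xy, localization_sigma, prior_sigma=1.0):
    """Fine registration: fuse the coarse estimate with a local measurement,
    weighting each term by its (assumed Gaussian) uncertainty."""
    w_meas = 1.0 / localization_sigma ** 2
    w_prior = 1.0 / prior_sigma ** 2
    return (w_meas * measured_xy + w_prior * coarse_xy) / (w_meas + w_prior)

# Toy usage: four map cells at known positions, one BEV observation.
cells = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
map_feats = np.eye(4)                             # one descriptor per cell
probs = associate(np.array([0.1, 0.8, 0.05, 0.05]), map_feats,
                  perception_sigma=0.5)
coarse = probs @ cells                            # expected position
pose = register(coarse, measured_xy=np.array([9.2, 0.4]),
                localization_sigma=0.3)
print(coarse, pose)
```

The property mirrored here is the paper's stated balance: high perceptual uncertainty spreads the association over more candidates, while the registration step trusts the local measurement in proportion to its confidence.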
Related papers
- NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation. Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame. We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
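As a rough illustration of the target-frame formulation, here is a toy 2-D transform; the planar simplification and all names are assumptions for exposition, not NOVA's code.

```python
# Working in the target's reference frame: every quantity is expressed
# relative to the tracked object, so no global map or GPS fix is needed.
import numpy as np

def to_target_frame(p_world, target_pos, target_yaw):
    """Express a world-frame point relative to the target's planar pose."""
    c, s = np.cos(-target_yaw), np.sin(-target_yaw)
    rot = np.array([[c, -s], [s, c]])
    return rot @ (p_world - target_pos)

# The robot at (2, 1), expressed relative to a target at (5, 4) facing 90 deg;
# estimation and control would consume this relative state directly.
print(to_target_frame(np.array([2.0, 1.0]), np.array([5.0, 4.0]), np.pi / 2))
```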
arXiv Detail & Related papers (2025-06-23T14:28:30Z) - SegLocNet: Multimodal Localization Network for Autonomous Driving via Bird's-Eye-View Segmentation [0.0]
SegLocNet is a GNSS-free multimodal localization network that achieves precise localization using semantic segmentation. Our method can accurately estimate the ego pose in urban environments without relying on GNSS. Our code and pre-trained model will be released publicly.
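A toy sketch of the underlying idea, under the assumption that localization is scored by agreement between a predicted BEV semantic mask and the map; the brute-force offset search below stands in for whatever learned matching the paper actually uses.

```python
# Score candidate map offsets by IoU between the BEV semantic mask and the
# corresponding map crop (toy illustration, not SegLocNet's code).
import numpy as np

def iou_at(bev_seg, map_grid, dx, dy):
    h, w = bev_seg.shape
    crop = map_grid[dy:dy + h, dx:dx + w]
    inter = np.logical_and(bev_seg, crop).sum()
    union = np.logical_or(bev_seg, crop).sum()
    return inter / max(union, 1)

def localize(bev_seg, map_grid, search=8):
    """Brute-force search over (dx, dy) offsets; returns (score, dx, dy)."""
    return max((iou_at(bev_seg, map_grid, dx, dy), dx, dy)
               for dx in range(search) for dy in range(search))

map_grid = np.zeros((20, 20), dtype=bool)
map_grid[8:12, :] = True                 # a horizontal "road"
map_grid[:, 5] = True                    # a vertical "lane divider"
bev_seg = map_grid[6:16, 2:12].copy()    # observation captured at offset (2, 6)
print(localize(bev_seg, map_grid))       # best IoU at dx=2, dy=6
```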
arXiv Detail & Related papers (2025-02-27T13:34:55Z) - MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps [8.373285397029884]
Traditional localization approaches rely on high-definition (HD) maps, which consist of precisely annotated landmarks.
We propose a novel transformer-based neural re-localization method, inspired by image registration.
Our method significantly outperforms the current state-of-the-art OrienterNet on both the nuScenes and Argoverse datasets.
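A minimal sketch of coarse-to-fine registration in the spirit of this method (structure assumed; the actual approach is a learned, transformer-based matcher): correlate downsampled features to get a rough offset, then refine around it at full resolution.

```python
# Coarse-to-fine translation search by feature correlation (toy example).
import numpy as np

def best_shift(a, b, center=(0, 0), radius=2):
    """Return the (dy, dx) roll of b that best correlates with a, searched
    in a square window around `center`."""
    best, arg = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = (a * np.roll(np.roll(b, dy, axis=0), dx, axis=1)).sum()
            if score > best:
                best, arg = score, (dy, dx)
    return arg

ys, xs = np.mgrid[0:32, 0:32]
feat = np.exp(-((ys - 16) ** 2 + (xs - 16) ** 2) / 20.0)  # smooth feature map
moved = np.roll(np.roll(feat, -3, axis=0), 5, axis=1)     # undone by (3, -5)
coarse = best_shift(feat[::4, ::4], moved[::4, ::4], radius=3)  # coarse pass
fine = best_shift(feat, moved, center=(coarse[0] * 4, coarse[1] * 4))
print(coarse, fine)   # fine pass recovers (dy, dx) = (3, -5)
```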
arXiv Detail & Related papers (2024-07-11T14:51:18Z) - Monocular Localization with Semantics Map for Autonomous Vehicles [8.242967098897408]
We propose a novel visual semantic localization algorithm that employs stable semantic features instead of low-level texture features.
First, semantic maps are constructed offline by detecting semantic objects, such as ground markers, lane lines, and poles, using cameras or LiDAR sensors.
Online visual localization is performed through data association of semantic features and map objects.
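A simplified, assumed illustration of the association step: match detected semantic landmarks to map objects of the same class by nearest neighbor, then solve a rigid 2-D alignment (Kabsch/Procrustes) from the matched pairs.

```python
# Class-aware nearest-neighbor association plus 2-D rigid alignment (toy).
import numpy as np

def associate(dets, det_cls, map_pts, map_cls):
    pairs = []
    for i, (p, c) in enumerate(zip(dets, det_cls)):
        same = [j for j, mc in enumerate(map_cls) if mc == c]
        if same:
            j = min(same, key=lambda j: np.linalg.norm(map_pts[j] - p))
            pairs.append((i, j))
    return pairs

def fit_pose(src, dst):
    """Least-squares rotation and translation mapping src onto dst."""
    sc, dc = src.mean(0), dst.mean(0)
    u, _, vt = np.linalg.svd((src - sc).T @ (dst - dc))
    rot = (u @ vt).T
    if np.linalg.det(rot) < 0:            # guard against reflections
        vt[-1] *= -1
        rot = (u @ vt).T
    return rot, dc - rot @ sc

map_pts = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
classes = ["pole", "marker", "pole"]
th = 0.1                                  # the vehicle's (unknown) heading error
rg = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
dets = map_pts @ rg.T + np.array([0.5, -0.2])   # landmarks as seen by the car
pairs = associate(dets, classes, map_pts, classes)
rot, t = fit_pose(dets[[i for i, _ in pairs]], map_pts[[j for _, j in pairs]])
print(np.round(t, 3))                     # the pose correction for the vehicle
```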
arXiv Detail & Related papers (2024-06-06T08:12:38Z) - View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z) - Spatial-Aware Token for Weakly Supervised Object Localization [137.0570026552845]
We propose a task-specific spatial-aware token to condition localization in a weakly supervised manner.
Experiments show that the proposed SAT achieves state-of-the-art performance on both CUB-200 and ImageNet, with 98.45% and 73.13% GT-known Loc accuracy, respectively.
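A schematic sketch of the spatial-token idea (shapes and names are assumptions, not the paper's code): a learnable token attends over patch tokens, and its attention weights are read out as a coarse localization map.

```python
import torch
import torch.nn as nn

class SpatialTokenHead(nn.Module):
    """Learnable token whose attention over ViT patches yields a location map."""
    def __init__(self, dim=64, grid=14):
        super().__init__()
        self.token = nn.Parameter(torch.zeros(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.grid = grid

    def forward(self, patch_tokens):                  # (B, grid*grid, dim)
        b = patch_tokens.shape[0]
        query = self.token.expand(b, -1, -1)
        _, weights = self.attn(query, patch_tokens, patch_tokens,
                               need_weights=True)     # (B, 1, grid*grid)
        return weights.view(b, self.grid, self.grid)  # localization map

head = SpatialTokenHead()
loc_map = head(torch.randn(2, 14 * 14, 64))
print(loc_map.shape)                                  # torch.Size([2, 14, 14])
```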
arXiv Detail & Related papers (2023-03-18T15:38:17Z) - Consistency-Aware Anchor Pyramid Network for Crowd Localization [167.93943981468348]
Crowd localization aims to predict the spatial position of humans in a crowd scenario.
We propose an anchor pyramid scheme to adaptively determine the anchor density in each image region.
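A toy sketch of adaptive anchor density, under the assumption that regions with denser predicted crowds receive a finer anchor grid; the strides and thresholds are made up for illustration.

```python
import numpy as np

def anchors_for_region(x0, y0, size, density, strides=(16, 8, 4)):
    """Pick a stride from a small pyramid based on predicted crowd density
    (0..1), then lay anchor centers on that grid."""
    stride = strides[min(int(density * len(strides)), len(strides) - 1)]
    xs = np.arange(x0 + stride / 2, x0 + size, stride)
    ys = np.arange(y0 + stride / 2, y0 + size, stride)
    return [(x, y) for y in ys for x in xs]

print(len(anchors_for_region(0, 0, 64, density=0.05)))  # sparse -> 16 anchors
print(len(anchors_for_region(0, 0, 64, density=0.90)))  # dense -> 256 anchors
```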
arXiv Detail & Related papers (2022-12-08T04:32:01Z) - Robust Monocular Localization in Sparse HD Maps Leveraging Multi-Task Uncertainty Estimation [28.35592701148056]
We present a novel monocular localization approach based on a sliding-window pose graph.
We propose an efficient multi-task uncertainty-aware perception module.
Our approach enables robust and accurate 6D localization in challenging urban scenarios.
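A one-dimensional toy of an uncertainty-weighted sliding-window pose graph (heavily simplified; names and structure assumed): odometry links consecutive poses, map observations anchor individual poses, and each residual is weighted by its inverse variance.

```python
import numpy as np

def solve_window(odom, odom_sigma, obs, n):
    """Minimize sum_i w_i * r_i^2 over n scalar poses via normal equations."""
    rows, rhs, wts = [], [], []
    for i, (d, s) in enumerate(zip(odom, odom_sigma)):  # x[i+1] - x[i] = d
        row = np.zeros(n); row[i + 1], row[i] = 1.0, -1.0
        rows.append(row); rhs.append(d); wts.append(1 / s ** 2)
    for i, z, s in obs:                                 # x[i] = z (map fix)
        row = np.zeros(n); row[i] = 1.0
        rows.append(row); rhs.append(z); wts.append(1 / s ** 2)
    a, b, w = np.array(rows), np.array(rhs), np.diag(wts)
    return np.linalg.solve(a.T @ w @ a, a.T @ w @ b)

# Four poses: noisy odometry steps of ~1.0, confident map fixes at both ends.
x = solve_window(odom=[1.1, 0.9, 1.2], odom_sigma=[0.3, 0.3, 0.3],
                 obs=[(0, 0.0, 0.05), (3, 3.0, 0.05)], n=4)
print(np.round(x, 3))
```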
arXiv Detail & Related papers (2021-10-20T13:46:15Z) - Real-time Outdoor Localization Using Radio Maps: A Deep Learning Approach [59.17191114000146]
LocUNet is a convolutional, end-to-end trained neural network (NN) for the localization task.
We show that LocUNet can localize users with state-of-the-art accuracy and enjoys high robustness to inaccuracies in the estimations of radio maps.
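A miniature stand-in for this idea (the architecture below is assumed and far smaller than the real network): stack the city's radio maps with the user's measured signal strengths and regress a heatmap over candidate locations.

```python
import torch
import torch.nn as nn

class TinyLocNet(nn.Module):
    def __init__(self, n_stations=3):
        super().__init__()
        # Inputs: one radio-map channel per base station plus one channel per
        # station holding the user's measurement, broadcast over the grid.
        self.net = nn.Sequential(
            nn.Conv2d(2 * n_stations, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),
        )

    def forward(self, radio_maps, measurements):       # (B,S,H,W), (B,S)
        b, s, h, w = radio_maps.shape
        meas = measurements.view(b, s, 1, 1).expand(b, s, h, w)
        logits = self.net(torch.cat([radio_maps, meas], dim=1))
        return logits.flatten(1).softmax(dim=1).view(b, 1, h, w)

heat = TinyLocNet()(torch.randn(2, 3, 32, 32), torch.randn(2, 3))
print(heat.shape)   # per-cell probability of the user's location
```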
arXiv Detail & Related papers (2021-06-23T17:27:04Z) - Deep Multi-Task Learning for Joint Localization, Perception, and Prediction [68.50217234419922]
This paper investigates the issues that arise in state-of-the-art autonomy stacks under localization error.
We design a system that jointly performs perception, prediction, and localization.
Our architecture reuses computation across these tasks and can thus correct localization errors efficiently.
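A schematic of the computation-reuse idea (layout assumed, not the paper's architecture): one shared backbone feeds separate perception, prediction, and localization heads, so features are extracted once per frame.

```python
import torch
import torch.nn as nn

class JointStack(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        self.perception = nn.Conv2d(feat, 10, 1)   # e.g. semantic classes
        self.prediction = nn.Conv2d(feat, 2, 1)    # e.g. future motion field
        self.localization = nn.Linear(feat, 3)     # e.g. (dx, dy, dyaw)

    def forward(self, x):
        f = self.backbone(x)                       # computed once, reused
        return (self.perception(f), self.prediction(f),
                self.localization(f.mean(dim=(2, 3))))

sem, motion, pose = JointStack()(torch.randn(1, 3, 64, 64))
print(sem.shape, motion.shape, pose.shape)
```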
arXiv Detail & Related papers (2021-01-17T17:20:31Z) - DA4AD: End-to-End Deep Attention-based Visual Localization for Autonomous Driving [19.02445537167235]
We present a visual localization framework based on novel deep attention aware features for autonomous driving.
Our method achieves competitive localization accuracy compared to LiDAR-based localization solutions.
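A hedged sketch of attention-guided feature selection (mechanics assumed): keep only the most strongly attended locations as landmarks to match against the map.

```python
import numpy as np

def select_landmarks(feat_map, attn_map, k=4):
    """Return descriptors and positions of the k highest-attention cells."""
    h, w, _ = feat_map.shape
    idx = np.argsort(attn_map.ravel())[-k:]          # top-k attended cells
    ys, xs = np.unravel_index(idx, (h, w))
    descs = np.stack([feat_map[y, x] for y, x in zip(ys, xs)])
    return descs, list(zip(ys, xs))

rng = np.random.default_rng(1)
descs, locs = select_landmarks(rng.standard_normal((8, 8, 16)),
                               rng.random((8, 8)))
print(locs)   # attention-selected positions to match against the map
```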
arXiv Detail & Related papers (2020-03-06T04:34:39Z)