Jointly Optimized Global-Local Visual Localization of UAVs
- URL: http://arxiv.org/abs/2310.08082v1
- Date: Thu, 12 Oct 2023 07:12:20 GMT
- Title: Jointly Optimized Global-Local Visual Localization of UAVs
- Authors: Haoling Li, Jiuniu Wang, Zhiwei Wei, Wenjia Xu
- Abstract summary: Navigation and localization of UAVs present a challenge when global navigation satellite systems (GNSS) are disrupted and unreliable.
Existing visual localization methods achieve autonomous visual localization without error accumulation by matching with ortho satellite images.
We propose a novel Global-Local Visual Localization (GLVL) network, combining a large-scale retrieval module that finds regions similar to the UAV flight scene, and a fine-grained matching module that localizes the precise UAV coordinates.
Our method achieves a localization error of only 2.39 meters in 0.48 seconds in a village scene with sparse texture features.
- Score: 17.83193033936859
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Navigation and localization of UAVs present a challenge when global
navigation satellite systems (GNSS) are disrupted and unreliable. Traditional
techniques, such as simultaneous localization and mapping (SLAM) and visual
odometry (VO), exhibit certain limitations in furnishing absolute coordinates
and mitigating error accumulation. Existing visual localization methods achieve
autonomous visual localization without error accumulation by matching with
ortho satellite images. However, doing so cannot guarantee real-time
performance due to the complex matching process. To address these challenges,
we propose a novel Global-Local Visual Localization (GLVL) network. Our GLVL
network is a two-stage visual localization approach, combining a large-scale
retrieval module that finds regions similar to the UAV flight scene with a
fine-grained matching module that localizes the precise UAV coordinates,
enabling real-time and precise localization. The training process is jointly
optimized in an end-to-end manner to further enhance the model's capability.
Experiments on six UAV flight scenes encompassing both texture-rich and
texture-sparse regions demonstrate the ability of our model to achieve the
real-time precise localization requirements of UAVs. Particularly, our method
achieves a localization error of only 2.39 meters in 0.48 seconds in a village
scene with sparse texture features.
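For illustration, the two-stage idea described above (coarse global retrieval over satellite tiles, then fine-grained matching for the precise coordinate) can be sketched as follows. This is a minimal, hypothetical sketch, not the authors' GLVL implementation: it assumes precomputed global descriptors for the satellite tiles and already-matched keypoint pairs for the fine stage.

```python
import numpy as np

def global_retrieve(query_desc, tile_descs, k=3):
    """Stage 1 (coarse): rank satellite tiles by cosine similarity
    between the UAV view's global descriptor and each tile's descriptor."""
    q = query_desc / np.linalg.norm(query_desc)
    t = tile_descs / np.linalg.norm(tile_descs, axis=1, keepdims=True)
    sims = t @ q                        # cosine similarity per tile
    return np.argsort(-sims)[:k]        # indices of the k best candidate tiles

def local_match(query_kpts, tile_kpts, tile_origin, meters_per_px):
    """Stage 2 (fine): estimate the UAV coordinate from matched keypoint
    pairs as a robust (median) pixel shift, scaled to meters."""
    offsets = tile_kpts - query_kpts    # pixel displacement per match
    shift = np.median(offsets, axis=0)  # consensus shift, robust to outliers
    return tile_origin + shift * meters_per_px

# Toy example with random descriptors (hypothetical data, not real features).
rng = np.random.default_rng(0)
tiles = rng.normal(size=(100, 128))
query = tiles[42] + 0.05 * rng.normal(size=128)   # query resembles tile 42
candidates = global_retrieve(query, tiles, k=3)
```

In a real system the descriptors and keypoints would come from learned networks trained jointly, as the abstract describes; the sketch only shows how the coarse ranking narrows the search so the expensive fine matching runs on a few candidate tiles.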
Related papers
- Beyond Ground: Map-Free LiDAR Relocalization for UAVs [33.32926994694318]
Map-free LiDAR relocalization is an effective solution for achieving high-precision positioning in environments with weak or unavailable GNSS signals. We propose MAILS, a novel map-free LiDAR relocalization framework for UAVs. Our method achieves satisfactory localization precision and consistently outperforms existing techniques by a significant margin.
arXiv Detail & Related papers (2026-02-04T05:36:14Z) - VVLoc: Prior-free 3-DoF Vehicle Visual Localization [6.151313455860856]
We propose a unified pipeline that employs a single neural network to concurrently achieve topological and metric vehicle localization using a multi-camera system. VVLoc first evaluates the geo-proximity between visual observations, then estimates their relative metric poses using a matching strategy, while also providing a confidence measure. We evaluate VVLoc not only on publicly available datasets, but also on a more challenging self-collected dataset.
arXiv Detail & Related papers (2026-01-31T16:37:30Z) - Object Detection as an Optional Basis: A Graph Matching Network for Cross-View UAV Localization [17.908597896653045]
This paper presents a cross-view UAV localization framework that performs map matching via object detection. In typical pipelines, UAV visual localization is formulated as an image-retrieval problem. Our method achieves strong retrieval and localization performance using a fine-grained, graph-based node-similarity metric.
arXiv Detail & Related papers (2025-11-04T11:25:31Z) - Loc$^2$: Interpretable Cross-View Localization via Depth-Lifted Local Feature Matching [80.57282092735991]
We propose an accurate and interpretable fine-grained cross-view localization method. It estimates the 3 Degrees of Freedom (DoF) pose of a ground-level image by matching its local features with a reference aerial image. Experiments show state-of-the-art accuracy in challenging scenarios such as cross-area testing and unknown orientation.
arXiv Detail & Related papers (2025-09-11T18:52:16Z) - U-ViLAR: Uncertainty-Aware Visual Localization for Autonomous Driving via Differentiable Association and Registration [25.74646789843283]
U-ViLAR is a novel uncertainty-aware visual localization framework. It enables adaptive localization using high-definition (HD) maps or navigation maps. Our model has undergone rigorous testing on large-scale autonomous driving fleets.
arXiv Detail & Related papers (2025-07-06T18:40:42Z) - Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints [10.639191465547517]
Absolute localization is crucial for unmanned aerial vehicles (UAVs) in various applications, but it becomes challenging when global navigation satellite system (GNSS) signals are unavailable. Vision-based absolute localization methods, which locate the current view of the UAV in a reference satellite map to estimate its position, have become popular in GNSS-denied scenarios. Existing methods mostly rely on traditional, low-level image matching and suffer from the significant appearance differences introduced by cross-source discrepancies and temporal variations. We introduce a hierarchical cross-source image matching method designed for UAV absolute localization, which integrates a semantic-aware and
arXiv Detail & Related papers (2025-06-11T13:53:03Z) - SF-Loc: A Visual Mapping and Geo-Localization System based on Sparse Visual Structure Frames [3.5047603107971397]
SF-Loc is a lightweight visual mapping and map-aided localization system.
In the mapping phase, multi-sensor dense bundle adjustment (MS-DBA) is applied to construct geo-referenced visual structure frames.
In the localization phase, coarse-to-fine vision-based localization is performed, in which multi-frame information and the map distribution are fully integrated.
arXiv Detail & Related papers (2024-12-02T13:51:58Z) - Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - GOMAA-Geo: GOal Modality Agnostic Active Geo-localization [49.599465495973654]
We consider the task of active geo-localization (AGL) in which an agent uses a sequence of visual cues observed during aerial navigation to find a target specified through multiple possible modalities.
GOMAA-Geo is a goal-modality-agnostic active geo-localization agent for zero-shot generalization across different goal modalities.
arXiv Detail & Related papers (2024-06-04T02:59:36Z) - AGL-NET: Aerial-Ground Cross-Modal Global Localization with Varying Scales [45.315661330785275]
We present AGL-NET, a novel learning-based method for global localization using LiDAR point clouds and satellite maps.
We tackle two critical challenges: bridging the representation gap between the image and point-cloud modalities for robust feature matching, and handling the inherent scale discrepancies between the global and local views.
arXiv Detail & Related papers (2024-04-04T04:12:30Z) - UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization [14.87295056434887]
We introduce a large-scale 6-DoF UAV dataset for localization (UAVD4L).
We develop a two-stage 6-DoF localization pipeline (UAVLoc), which consists of offline synthetic data generation and online visual localization.
Results on the new dataset demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-01-11T15:19:21Z) - Spatial-Aware Token for Weakly Supervised Object Localization [137.0570026552845]
We propose a task-specific spatial-aware token to condition localization in a weakly supervised manner.
Experiments show that the proposed SAT achieves state-of-the-art performance on both CUB-200 and ImageNet, with 98.45% and 73.13% GT-known Loc accuracy, respectively.
arXiv Detail & Related papers (2023-03-18T15:38:17Z) - BEVBert: Multimodal Map Pre-training for Language-guided Navigation [75.23388288113817]
We propose a new spatial-aware, map-based pre-training paradigm for vision-and-language navigation (VLN).
We build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map.
Based on the hybrid map, we devise a pre-training framework to learn a multimodal map representation, which enhances spatial-aware cross-modal reasoning thereby facilitating the language-guided navigation goal.
arXiv Detail & Related papers (2022-12-08T16:27:54Z) - Visual Cross-View Metric Localization with Dense Uncertainty Estimates [11.76638109321532]
This work addresses visual cross-view metric localization for outdoor robotics.
Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch.
We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck, and a dense spatial distribution as output to capture multi-modal localization ambiguities.
arXiv Detail & Related papers (2022-08-17T20:12:23Z) - Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
arXiv Detail & Related papers (2022-04-10T19:16:58Z) - Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation [87.03299519917019]
We propose a dual-scale graph transformer (DUET) for joint long-term action planning and fine-grained cross-modal understanding.
We build a topological map on-the-fly to enable efficient exploration in global action space.
The proposed approach, DUET, significantly outperforms state-of-the-art methods on goal-oriented vision-and-language navigation benchmarks.
arXiv Detail & Related papers (2022-02-23T19:06:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.