Without Paired Labeled Data: An End-to-End Self-Supervised Paradigm for UAV-View Geo-Localization
- URL: http://arxiv.org/abs/2502.11381v1
- Date: Mon, 17 Feb 2025 02:53:08 GMT
- Title: Without Paired Labeled Data: An End-to-End Self-Supervised Paradigm for UAV-View Geo-Localization
- Authors: Zhongwei Chen, Zhao-Xu Yang, Hai-Jun Rong,
- Abstract summary: UAV-View Geo-Localization aims to ascertain the precise location of a UAV by retrieving the most similar GPS-tagged satellite image.
Existing methods rely on supervised learning paradigms that necessitate annotated paired data for training.
We propose the Dynamic Memory-Driven and Neighborhood Information Learning network, a lightweight end-to-end self-supervised framework for UAV-view geo-localization.
- Score: 2.733505168507872
- License:
- Abstract: UAV-View Geo-Localization (UVGL) aims to ascertain the precise location of a UAV by retrieving the most similar GPS-tagged satellite image. However, existing methods predominantly rely on supervised learning paradigms that necessitate annotated paired data for training, which incurs substantial annotation costs and impedes large-scale deployment. To overcome this limitation, we propose the Dynamic Memory-Driven and Neighborhood Information Learning (DMNIL) network, a lightweight end-to-end self-supervised framework for UAV-view geo-localization. The DMNIL framework utilizes a dual-path clustering-based contrastive learning architecture as its baseline to model intra-view structural relationships, enhancing feature consistency and discriminability. Additionally, a dynamic memory-driven hierarchical learning module is proposed to progressively mine local and global information, reinforcing multi-level feature associations to improve model robustness. To bridge the domain gap between UAV and satellite views, we design an information-consistent evolutionary learning mechanism that systematically explores latent correlations within intra-view neighborhoods and across cross-view domains, ultimately constructing a unified cross-view feature representation space. Extensive experiments on three benchmarks (University-1652, SUES-200, and DenseUAV) demonstrate that DMNIL achieves competitive performance against state-of-the-art supervised methods while maintaining computational efficiency. Notably, this superiority is attained without relying on paired training data, underscoring the framework's practicality for real-world deployment. Codes will be released soon.
Related papers
- Precise GPS-Denied UAV Self-Positioning via Context-Enhanced Cross-View Geo-Localization [10.429391988135345]
We propose the Context-Enhanced method for precise UAV Self-Positioning (CEUSP), specifically designed for UAV self-positioning tasks.
CEUSP integrates a Dynamic Sampling Strategy (DSS) to efficiently select optimal negative samples, while the Rubik's Cube Attention (RCA) module, combined with the Context-Aware Channel Integration (CACI) module, enhances feature representation and discrimination.
Our approach achieves state-of-the-art performance on the DenseUAV dataset, which is specifically designed for dense urban contexts.
arXiv Detail & Related papers (2025-02-17T03:49:18Z) - SaliencyI2PLoc: saliency-guided image-point cloud localization using contrastive learning [17.29563451509921]
SaliencyI2PLoc is a contrastive learning architecture that fuses the saliency map into feature aggregation.
Our method achieves a Recall@1 of 78.92% and a Recall@20 of 97.59% on the urban scenario evaluation dataset.
arXiv Detail & Related papers (2024-12-20T05:20:10Z) - Into the Unknown: Generating Geospatial Descriptions for New Environments [18.736071151303726]
Rendezvous task requires reasoning over allocentric spatial relationships.
Using opensource descriptions paired with coordinates (e.g., Wikipedia) provides training data but suffers from limited spatially-oriented text.
We propose a large-scale augmentation method for generating high-quality synthetic data for new environments.
arXiv Detail & Related papers (2024-06-28T14:56:21Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - BEVBert: Multimodal Map Pre-training for Language-guided Navigation [75.23388288113817]
We propose a new map-based pre-training paradigm that is spatial-aware for use in vision-and-language navigation (VLN)
We build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map.
Based on the hybrid map, we devise a pre-training framework to learn a multimodal map representation, which enhances spatial-aware cross-modal reasoning thereby facilitating the language-guided navigation goal.
arXiv Detail & Related papers (2022-12-08T16:27:54Z) - Co-visual pattern augmented generative transformer learning for
automobile geo-localization [12.449657263683337]
Cross-view geo-localization (CVGL) aims to estimate the geographical location of the ground-level camera by matching against enormous geo-tagged aerial images.
We present a novel approach using cross-view knowledge generative techniques in combination with transformers, namely mutual generative transformer learning (MGTL) for CVGL.
arXiv Detail & Related papers (2022-03-17T07:29:02Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the
Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - Trajectory Design for UAV-Based Internet-of-Things Data Collection: A
Deep Reinforcement Learning Approach [93.67588414950656]
In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a 3D environment.
We present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm.
Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning based baseline methods.
arXiv Detail & Related papers (2021-07-23T03:33:29Z) - Multiple Object Tracking with Correlation Learning [16.959379957515974]
We propose to exploit the local correlation module to model the topological relationship between targets and their surrounding environment.
Specifically, we establish dense correspondences of each spatial location and its context, and explicitly constrain the correlation volumes through self-supervised learning.
Our approach demonstrates the effectiveness of correlation learning with the superior performance and obtains state-of-the-art MOTA of 76.5% and IDF1 of 73.6% on MOT17.
arXiv Detail & Related papers (2021-04-08T06:48:02Z) - Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z) - Pairwise Similarity Knowledge Transfer for Weakly Supervised Object
Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.