View Distribution Alignment with Progressive Adversarial Learning for
UAV Visual Geo-Localization
- URL: http://arxiv.org/abs/2401.01573v1
- Date: Wed, 3 Jan 2024 06:58:09 GMT
- Title: View Distribution Alignment with Progressive Adversarial Learning for
UAV Visual Geo-Localization
- Authors: Cuiwei Liu, Jiahao Liu, Huaijun Qiu, Zhaokui Li and Xiangbin Shi
- Abstract summary: Unmanned Aerial Vehicle (UAV) visual geo-localization aims to match images of the same geographic target captured from different views, i.e., the UAV view and the satellite view.
Previous works map images captured by UAVs and satellites to a shared feature space and employ a classification framework to learn location-dependent features.
This paper introduces distribution alignment of the two views to shorten their distance in a common space.
- Score: 10.442998017077795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unmanned Aerial Vehicle (UAV) visual geo-localization aims to match images of
the same geographic target captured from different views, i.e., the UAV view
and the satellite view. It is very challenging due to the large appearance
differences in UAV-satellite image pairs. Previous works map images captured by
UAVs and satellites to a shared feature space and employ a classification
framework to learn location-dependent features while neglecting the overall
distribution shift between the UAV view and the satellite view. In this paper,
we address these limitations by introducing distribution alignment of the two
views to shorten their distance in a common space. Specifically, we propose an
end-to-end network, called PVDA (Progressive View Distribution Alignment).
During training, feature encoder, location classifier, and view discriminator
are jointly optimized by a novel progressive adversarial learning strategy.
Competition between feature encoder and view discriminator prompts both of them
to be stronger. It turns out that the adversarial learning is progressively
emphasized until UAV-view images are indistinguishable from satellite-view
images. As a result, the proposed PVDA becomes powerful in learning
location-dependent yet view-invariant features with good scalability towards
unseen images of new locations. Compared to the state-of-the-art methods, the
proposed PVDA requires less inference time but has achieved superior
performance on the University-1652 dataset.
Related papers
- Unsupervised Multi-view UAV Image Geo-localization via Iterative Rendering [31.716967688739036]
Unmanned Aerial Vehicle (UAV) Cross-View Geo-Localization (CVGL) presents significant challenges.
Existing methods rely on the supervision of labeled datasets to extract viewpoint-invariant features for cross-view retrieval.
We propose an unsupervised solution that lifts the scene representation to 3d space from UAV observations for satellite image generation.
arXiv Detail & Related papers (2024-11-22T09:22:39Z) - Game4Loc: A UAV Geo-Localization Benchmark from Game Data [0.0]
We introduce a more practical UAV geo-localization task including partial matches of cross-view paired data.
Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization.
arXiv Detail & Related papers (2024-09-25T13:33:28Z) - Style Alignment based Dynamic Observation Method for UAV-View Geo-localization [7.185123213523453]
We propose a style alignment based dynamic observation method for UAV-view geo-localization.
Specifically, we introduce a style alignment strategy to transfrom the diverse visual style of drone-view images into a unified satellite images visual style.
A dynamic observation module is designed to evaluate the spatial distribution of images by mimicking human observation habits.
arXiv Detail & Related papers (2024-07-03T06:19:42Z) - Breaking the Frame: Visual Place Recognition by Overlap Prediction [53.17564423756082]
We propose a novel visual place recognition approach based on overlap prediction, called VOP.
VOP proceeds co-visible image sections by obtaining patch-level embeddings using a Vision Transformer backbone.
Our approach uses a voting mechanism to assess overlap scores for potential database images.
arXiv Detail & Related papers (2024-06-23T20:00:20Z) - UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization [20.37586403749362]
We present a large-scale dataset, UAV-VisLoc, to facilitate the UAV visual localization task.
Our dataset includes 6,742 drone images and 11 satellite maps, with metadata such as latitude, longitude, altitude, and capture date.
arXiv Detail & Related papers (2024-05-20T10:24:10Z) - CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z) - Remote Sensing Vision-Language Foundation Models without Annotations via
Ground Remote Alignment [61.769441954135246]
We introduce a method to train vision-language models for remote-sensing images without using any textual annotations.
Our key insight is to use co-located internet imagery taken on the ground as an intermediary for connecting remote-sensing images and language.
arXiv Detail & Related papers (2023-12-12T03:39:07Z) - Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve
Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z) - Orientation-Guided Contrastive Learning for UAV-View Geo-Localisation [0.0]
We present an orientation-guided training framework for UAV-view geo-localisation.
We experimentally demonstrate that this prediction supports the training and outperforms previous approaches.
We achieve state-of-the-art results on both the University-1652 and University-160k datasets.
arXiv Detail & Related papers (2023-08-02T07:32:32Z) - Geography-Aware Self-Supervised Learning [79.4009241781968]
We show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks.
We propose novel training methods that exploit the spatially aligned structure of remote sensing data.
Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing.
arXiv Detail & Related papers (2020-11-19T17:29:13Z) - Where am I looking at? Joint Location and Orientation Estimation by
Cross-View Matching [95.64702426906466]
Cross-view geo-localization is a problem given a large-scale database of geo-tagged aerial images.
Knowing orientation between ground and aerial images can significantly reduce matching ambiguity between these two views.
We design a Dynamic Similarity Matching network to estimate cross-view orientation alignment during localization.
arXiv Detail & Related papers (2020-05-08T05:21:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.