View Distribution Alignment with Progressive Adversarial Learning for
UAV Visual Geo-Localization
- URL: http://arxiv.org/abs/2401.01573v1
- Date: Wed, 3 Jan 2024 06:58:09 GMT
- Title: View Distribution Alignment with Progressive Adversarial Learning for
UAV Visual Geo-Localization
- Authors: Cuiwei Liu, Jiahao Liu, Huaijun Qiu, Zhaokui Li and Xiangbin Shi
- Abstract summary: Unmanned Aerial Vehicle (UAV) visual geo-localization aims to match images of the same geographic target captured from different views, i.e., the UAV view and the satellite view.
Previous works map images captured by UAVs and satellites to a shared feature space and employ a classification framework to learn location-dependent features.
This paper introduces distribution alignment of the two views to shorten their distance in a common space.
- Score: 10.442998017077795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unmanned Aerial Vehicle (UAV) visual geo-localization aims to match images of
the same geographic target captured from different views, i.e., the UAV view
and the satellite view. It is very challenging due to the large appearance
differences in UAV-satellite image pairs. Previous works map images captured by
UAVs and satellites to a shared feature space and employ a classification
framework to learn location-dependent features while neglecting the overall
distribution shift between the UAV view and the satellite view. In this paper,
we address these limitations by introducing distribution alignment of the two
views to shorten their distance in a common space. Specifically, we propose an
end-to-end network, called PVDA (Progressive View Distribution Alignment).
During training, the feature encoder, location classifier, and view discriminator
are jointly optimized by a novel progressive adversarial learning strategy.
Competition between the feature encoder and the view discriminator pushes both
to become stronger. The adversarial learning is progressively
emphasized until UAV-view images are indistinguishable from satellite-view
images. As a result, the proposed PVDA becomes powerful in learning
location-dependent yet view-invariant features with good scalability towards
unseen images of new locations. Compared to the state-of-the-art methods, the
proposed PVDA requires less inference time but has achieved superior
performance on the University-1652 dataset.
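The abstract does not say how the adversarial term is "progressively emphasized", so the following is only a minimal sketch under stated assumptions. It borrows the ramp-up schedule commonly used in domain-adversarial training (DANN) as a stand-in; it is not the PVDA schedule. All function names and the `gamma` parameter are hypothetical and chosen for illustration.

```python
import math


def adversarial_weight(step: int, total_steps: int, gamma: float = 10.0) -> float:
    """Assumed DANN-style ramp (not taken from the PVDA paper).

    Returns 0.0 at the start of training and approaches 1.0 toward the end,
    so the adversarial term is emphasized progressively.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


def view_discriminator_loss(prob_uav: float, is_uav: bool) -> float:
    """Binary cross-entropy for one image; prob_uav is the predicted P(view = UAV)."""
    eps = 1e-12
    p = min(max(prob_uav, eps), 1.0 - eps)
    return -math.log(p) if is_uav else -math.log(1.0 - p)


def encoder_objective(location_loss: float, discriminator_loss: float, weight: float) -> float:
    """Hypothetical encoder objective: minimize the location classification loss
    while maximizing the view discriminator's loss, scaled by the current weight."""
    return location_loss - weight * discriminator_loss


if __name__ == "__main__":
    total = 100
    for step in (0, 25, 50, 100):
        print(step, round(adversarial_weight(step, total), 4))
```

When the discriminator outputs 0.5 for every image, its loss is log 2 regardless of the true view, which corresponds to the state the abstract describes as UAV-view images being indistinguishable from satellite-view images.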
Related papers
- Style Alignment based Dynamic Observation Method for UAV-View Geo-localization [7.185123213523453]
We propose a style alignment based dynamic observation method for UAV-view geo-localization.
Specifically, we introduce a style alignment strategy to transform the diverse visual styles of drone-view images into a unified satellite-image visual style.
A dynamic observation module is designed to evaluate the spatial distribution of images by mimicking human observation habits.
arXiv Detail & Related papers (2024-07-03T06:19:42Z)
- UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization [20.37586403749362]
We present a large-scale dataset, UAV-VisLoc, to facilitate the UAV visual localization task.
Our dataset includes 6,742 drone images and 11 satellite maps, with metadata such as latitude, longitude, altitude, and capture date.
arXiv Detail & Related papers (2024-05-20T10:24:10Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment [61.769441954135246]
We introduce a method to train vision-language models for remote-sensing images without using any textual annotations.
Our key insight is to use co-located internet imagery taken on the ground as an intermediary for connecting remote-sensing images and language.
arXiv Detail & Related papers (2023-12-12T03:39:07Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- Orientation-Guided Contrastive Learning for UAV-View Geo-Localisation [0.0]
We present an orientation-guided training framework for UAV-view geo-localisation.
We experimentally demonstrate that this prediction supports the training and outperforms previous approaches.
We achieve state-of-the-art results on both the University-1652 and University-160k datasets.
arXiv Detail & Related papers (2023-08-02T07:32:32Z)
- CROVIA: Seeing Drone Scenes from Car Perspective via Cross-View Adaptation [20.476683921252867]
We propose a novel Cross-View Adaptation (CROVIA) approach to adapt the knowledge learned from on-road vehicle views to UAV views.
First, a novel geometry-based constraint to cross-view adaptation is introduced based on the geometry correlation between views.
Second, cross-view correlations from image space are effectively transferred to segmentation space without any requirement of paired on-road and UAV view data.
arXiv Detail & Related papers (2023-04-14T15:20:40Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
- Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments [20.69412701553767]
Unmanned Aerial Vehicles (UAVs) rely on satellite systems for stable positioning.
In such situations, vision-based techniques can serve as an alternative, ensuring the self-positioning capability of UAVs.
This paper presents a new dataset, DenseUAV, which is the first publicly available dataset designed for the UAV self-positioning task.
arXiv Detail & Related papers (2022-01-23T07:18:55Z)
- Geography-Aware Self-Supervised Learning [79.4009241781968]
We show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks.
We propose novel training methods that exploit the spatially aligned structure of remote sensing data.
Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing.
arXiv Detail & Related papers (2020-11-19T17:29:13Z)
- Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching [95.64702426906466]
Cross-view geo-localization is the problem of estimating the position and orientation of a ground-level camera given a large-scale database of geo-tagged aerial images.
Knowing orientation between ground and aerial images can significantly reduce matching ambiguity between these two views.
We design a Dynamic Similarity Matching network to estimate cross-view orientation alignment during localization.
arXiv Detail & Related papers (2020-05-08T05:21:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.