Relative Distance Guided Dynamic Partition Learning for Scale-Invariant UAV-View Geo-Localization
- URL: http://arxiv.org/abs/2412.11535v2
- Date: Mon, 23 Dec 2024 14:17:30 GMT
- Title: Relative Distance Guided Dynamic Partition Learning for Scale-Invariant UAV-View Geo-Localization
- Authors: Quan Chen, Tingyu Wang, Rongfeng Lu, Bolun Zheng, Zhedong Zheng, Chenggang Yan,
- Abstract summary: UAV-view Geo-Localization(UVGL) presents substantial challenges, particularly due to the disparity in visual appearance between drone-captured imagery and satellite perspectives.
We propose a partition learning framework based on relative distance, which alleviates the dependence on scale consistency while mining fine-grained features.
Our approach achieves superior geo-localization accuracy across various scale-inconsistent scenarios, and exhibits remarkable robustness against scale variations.
- Score: 37.30243235827088
- License:
- Abstract: UAV-view Geo-Localization~(UVGL) presents substantial challenges, particularly due to the disparity in visual appearance between drone-captured imagery and satellite perspectives. Existing methods usually assume consistent scaling factor across different views. Therefore, they adopt predefined partition alignment and extract viewpoint-invariant representation by constructing a variety of part-level features. However, the scaling assumption is not always hold in the real-world scenarios that variations of UAV flight state leads to the scale mismatch of cross-views, resulting in serious performance degradation. To overcome this issue, we propose a partition learning framework based on relative distance, which alleviates the dependence on scale consistency while mining fine-grained features. Specifically, we propose a distance guided dynamic partition learning strategy~(DGDPL), consisting of a square partition strategy and a distance-guided adjustment strategy. The former is utilized to extract fine-grained features and global features in a simple manner. The latter calculates the relative distance ratio between drone- and satellite-view to adjust the partition size, thereby explicitly aligning the semantic information between partition pairs. Furthermore, we propose a saliency-guided refinement strategy to refine part-level features, so as to further improve the retrieval accuracy. Extensive experiments show that our approach achieves superior geo-localization accuracy across various scale-inconsistent scenarios, and exhibits remarkable robustness against scale variations. The code will be released.
Related papers
- Multi-Level Embedding and Alignment Network with Consistency and Invariance Learning for Cross-View Geo-Localization [2.733505168507872]
Cross-View Geo-Localization (CVGL) involves determining the localization of drone images by retrieving the most similar GPS-tagged satellite images.
Existing methods often overlook the problem of increased computational and storage requirements when improving model performance.
We propose a lightweight enhanced alignment network, called the Multi-Level Embedding and Alignment Network (MEAN)
arXiv Detail & Related papers (2024-12-19T13:10:38Z) - ConGeo: Robust Cross-view Geo-localization across Ground View Variations [34.192775134189965]
Cross-view geo-localization aims at localizing a ground-level query image by matching it to its corresponding geo-referenced aerial view.
Existing learning pipelines are orientation-specific or FoV-specific, demanding separate model training for different ground view variations.
We propose ConGeo, a single- and cross-view Contrastive method for Geo-localization.
arXiv Detail & Related papers (2024-03-20T20:37:13Z) - Adaptive Spot-Guided Transformer for Consistent Local Feature Matching [64.30749838423922]
We propose Adaptive Spot-Guided Transformer (ASTR) for local feature matching.
ASTR models the local consistency and scale variations in a unified coarse-to-fine architecture.
arXiv Detail & Related papers (2023-03-29T12:28:01Z) - Adaptive Local-Component-aware Graph Convolutional Network for One-shot
Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z) - G$^2$DA: Geometry-Guided Dual-Alignment Learning for RGB-Infrared Person
Re-Identification [3.909938091041451]
RGB-IR person re-identification aims to retrieve person-of-interest between heterogeneous modalities.
This paper presents a Geometry-Guided Dual-Alignment learning framework (G$2$DA) to tackle sample-level modality difference.
arXiv Detail & Related papers (2021-06-15T03:14:31Z) - Foreground-Aware Relation Network for Geospatial Object Segmentation in
High Spatial Resolution Remote Sensing Imagery [6.4901484665257545]
Geospatial object segmentation always faces with larger-scale variation, larger intra-class variance of background, and foreground-background imbalance.
We propose a foreground-aware relation network (FarSeg) from the perspectives of relation-based and optimization-based foreground modeling.
Experiments show that FarSeg is superior to the state-of-the-art general semantic segmentation methods and achieves a better trade-off between speed and accuracy.
arXiv Detail & Related papers (2020-11-19T10:57:43Z) - Multi-view Drone-based Geo-localization via Style and Spatial Alignment [47.95626612936813]
Multi-view multi-source geo-localization serves as an important auxiliary method of GPS positioning by matching drone-view image and satellite-view image with pre-annotated GPS tag.
We propose an elegant orientation-based method to align the patterns and introduce a new branch to extract aligned partial feature.
arXiv Detail & Related papers (2020-06-23T15:44:02Z) - Deep Semantic Matching with Foreground Detection and Cycle-Consistency [103.22976097225457]
We address weakly supervised semantic matching based on a deep network.
We explicitly estimate the foreground regions to suppress the effect of background clutter.
We develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent.
arXiv Detail & Related papers (2020-03-31T22:38:09Z) - Self-Guided Adaptation: Progressive Representation Alignment for Domain
Adaptive Object Detection [86.69077525494106]
Unsupervised domain adaptation (UDA) has achieved unprecedented success in improving the cross-domain robustness of object detection models.
Existing UDA methods largely ignore the instantaneous data distribution during model learning, which could deteriorate the feature representation given large domain shift.
We propose a Self-Guided Adaptation (SGA) model, target at aligning feature representation and transferring object detection models across domains.
arXiv Detail & Related papers (2020-03-19T13:30:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.