Cross-View Geo-Localization with Street-View and VHR Satellite Imagery in Decentrality Settings
- URL: http://arxiv.org/abs/2412.11529v2
- Date: Fri, 03 Jan 2025 03:48:03 GMT
- Title: Cross-View Geo-Localization with Street-View and VHR Satellite Imagery in Decentrality Settings
- Authors: Panwang Xia, Lei Yu, Yi Wan, Qiong Wu, Peiqi Chen, Liheng Zhong, Yongxiang Yao, Dong Wei, Xinyi Liu, Lixiang Ru, Yingying Zhang, Jiangwei Lao, Jingdong Chen, Ming Yang, Yongjun Zhang,
- Abstract summary: Cross-View Geo-Localization matches street-view query images with geo-tagged aerial-view reference images.
Decentrality is a critical factor warranting deeper investigation, as larger decentrality can substantially improve localization efficiency but comes at the cost of declines in localization accuracy.
We introduce DReSS, a novel dataset designed to evaluate cross-view geo-localization with a large geographic scope and diverse landscapes.
- Score: 39.252555758596706
- License:
- Abstract: Cross-View Geo-Localization tackles the challenge of image geo-localization in GNSS-denied environments, including disaster response scenarios, urban canyons, and dense forests, by matching street-view query images with geo-tagged aerial-view reference images. However, current research often relies on benchmarks and methods that assume center-aligned settings or account for only limited decentrality, which we define as the offset of the query image relative to the reference image center. Such assumptions fail to reflect real-world scenarios, where reference databases are typically pre-established without the possibility of ensuring perfect alignment for each query image. Moreover, decentrality is a critical factor warranting deeper investigation, as larger decentrality can substantially improve localization efficiency but comes at the cost of declines in localization accuracy. To address this limitation, we introduce DReSS (Decentrality Related Street-view and Satellite-view dataset), a novel dataset designed to evaluate cross-view geo-localization with a large geographic scope and diverse landscapes, emphasizing the decentrality issue. Meanwhile, we propose AuxGeo (Auxiliary Enhanced Geo-Localization) to further study the decentrality issue, which leverages a multi-metric optimization strategy with two novel modules: the Bird's-eye view Intermediary Module (BIM) and the Position Constraint Module (PCM). These modules improve the localization accuracy despite the decentrality problem. Extensive experiments demonstrate that AuxGeo outperforms previous methods on our proposed DReSS dataset, mitigating the issue of large decentrality, and also achieves state-of-the-art performance on existing public datasets such as CVUSA, CVACT, and VIGOR.
Related papers
- CV-Cities: Advancing Cross-View Geo-Localization in Global Cities [3.074201632920997]
Cross-view geo-localization (CVGL) involves matching and retrieving satellite images to determine the geographic location of a ground image.
This task faces significant challenges due to substantial viewpoint discrepancies, the complexity of localization scenarios, and the need for global localization.
We propose a novel CVGL framework that integrates the foundational model DINOv2 with an advanced feature mixer.
arXiv Detail & Related papers (2024-11-19T11:41:22Z) - GeoCLIP: Clip-Inspired Alignment between Locations and Images for
Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z) - Cross-View Visual Geo-Localization for Outdoor Augmented Reality [11.214903134756888]
We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database.
We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation.
Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-28T01:58:03Z) - Visual Cross-View Metric Localization with Dense Uncertainty Estimates [11.76638109321532]
This work addresses visual cross-view metric localization for outdoor robotics.
Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch.
We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck, and a dense spatial distribution as output to capture multi-modal localization ambiguities.
arXiv Detail & Related papers (2022-08-17T20:12:23Z) - Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization
Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
arXiv Detail & Related papers (2022-04-10T19:16:58Z) - Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image
Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method is able to achieve the fine-grained location of a query image, up to pixel size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z) - City-wide Street-to-Satellite Image Geolocalization of a Mobile Ground
Agent [38.140216125792755]
Cross-view image geolocalization provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image without the need for GPS.
Our approach, called Wide-Area Geolocalization (WAG), combines a neural network with a particle filter to achieve global position estimates for agents moving in GPS-denied environments.
WAG achieves position estimation accuracies on the order of 20 meters, a 98% reduction compared to a baseline training and weighting approach.
arXiv Detail & Related papers (2022-03-10T19:54:12Z) - Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization [54.00111565818903]
Cross-view geo-localization is to spot images of the same geographic target from different platforms.
Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center.
We introduce a simple and effective deep neural network, called Local Pattern Network (LPN), to take advantage of contextual information.
arXiv Detail & Related papers (2020-08-26T16:06:11Z) - Zero-Shot Multi-View Indoor Localization via Graph Location Networks [66.05980368549928]
indoor localization is a fundamental problem in location-based applications.
We propose a novel neural network based architecture Graph Location Networks (GLN) to perform infrastructure-free, multi-view image based indoor localization.
GLN makes location predictions based on robust location representations extracted from images through message-passing networks.
We introduce a novel zero-shot indoor localization setting and tackle it by extending the proposed GLN to a dedicated zero-shot version.
arXiv Detail & Related papers (2020-08-06T07:36:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.