GeoDTR+: Toward generic cross-view geolocalization via geometric
disentanglement
- URL: http://arxiv.org/abs/2308.09624v1
- Date: Fri, 18 Aug 2023 15:32:01 GMT
- Title: GeoDTR+: Toward generic cross-view geolocalization via geometric
disentanglement
- Authors: Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, and Safwan Wshah
- Abstract summary: Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it to a geo-tagged aerial image in a database.
Recent works achieve outstanding progress on CVGL benchmarks.
Existing methods still suffer from poor performance in cross-area evaluation.
We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details.
- Score: 20.346145927174373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-View Geo-Localization (CVGL) estimates the location of a ground image
by matching it to a geo-tagged aerial image in a database. Recent works achieve
outstanding progress on CVGL benchmarks. However, existing methods still suffer
from poor performance in cross-area evaluation, in which the training and
testing data are captured from completely distinct areas. We attribute this
deficiency to the lack of ability to extract the geometric layout of visual
features and models' overfitting to low-level details. Our preliminary work
introduced a Geometric Layout Extractor (GLE) to capture the geometric layout
from input features. However, the previous GLE does not fully exploit
information in the input feature. In this work, we propose GeoDTR+ with an
enhanced GLE module that better models the correlations among visual features.
To fully explore the LS techniques from our preliminary work, we further
propose Contrastive Hard Samples Generation (CHSG) to facilitate model
training. Extensive experiments show that GeoDTR+ achieves state-of-the-art
(SOTA) results in cross-area evaluation on CVUSA, CVACT, and VIGOR by a large
margin ($16.44\%$, $22.71\%$, and $17.02\%$ without polar transformation) while
keeping the same-area performance comparable to existing SOTA. Moreover, we
provide detailed analyses of GeoDTR+.
Related papers
- CurriculumLoc: Enhancing Cross-Domain Geolocalization through
Multi-Stage Refinement [11.108860387261508]
Visual geolocalization is a cost-effective and scalable task that involves matching one or more query images taken at some unknown location, to a set of geo-tagged reference images.
We develop CurriculumLoc, a novel keypoint detection and description with global semantic awareness and a local geometric verification.
We achieve new high recall@1 scores of 62.6% and 94.5% on ALTO, with two different distances metrics, respectively.
arXiv Detail & Related papers (2023-11-20T08:40:01Z) - GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z) - GeoCLIP: Clip-Inspired Alignment between Locations and Images for
Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z) - Assessment of a new GeoAI foundation model for flood inundation mapping [4.312965283062856]
This paper evaluates the performance of the first-of-its-kind geospatial foundation model, IBM-NASA's Prithvi, to support a crucial geospatial analysis task: flood inundation mapping.
A benchmark dataset, Sen1Floods11, is used in the experiments, and the models' predictability, generalizability, and transferability are evaluated.
Results show the good transferability of the Prithvi model, highlighting its performance advantages in segmenting flooded areas in previously unseen regions.
arXiv Detail & Related papers (2023-09-25T19:50:47Z) - Cross-View Visual Geo-Localization for Outdoor Augmented Reality [11.214903134756888]
We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database.
We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation.
Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-28T01:58:03Z) - Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation [2.3020018305241337]
We present a simplified but effective architecture based on contrastive learning with symmetric InfoNCE loss.
Our framework consists of a narrow training pipeline that eliminates the need of using aggregation modules.
Our work shows excellent performance on common cross-view datasets like CVUSA, CVACT, University-1652 and VIGOR.
arXiv Detail & Related papers (2023-03-21T13:49:49Z) - Cross-view Geo-localization via Learning Disentangled Geometric Layout
Correspondence [11.823147814005411]
Cross-view geo-localization aims to estimate the location of a query ground image by matching it to a reference geo-tagged aerial images database.
Recent works achieve outstanding progress on cross-view geo-localization benchmarks.
However, existing methods still suffer from poor performance on the cross-area benchmarks.
arXiv Detail & Related papers (2022-12-08T04:54:01Z) - Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image
Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method is able to achieve the fine-grained location of a query image, up to pixel size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z) - PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image
Segmentation [87.50205728818601]
We propose a PriorGuided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z) - Mix Dimension in Poincar\'{e} Geometry for 3D Skeleton-based Action
Recognition [57.98278794950759]
Graph Convolutional Networks (GCNs) have already demonstrated their powerful ability to model the irregular data.
We present a novel spatial-temporal GCN architecture which is defined via the Poincar'e geometry.
We evaluate our method on two current largest scale 3D datasets.
arXiv Detail & Related papers (2020-07-30T18:23:18Z) - PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling [103.09504572409449]
We propose a novel deep neural network based method, called PUGeo-Net, to generate uniform dense point clouds.
Thanks to its geometry-centric nature, PUGeo-Net works well for both CAD models with sharp features and scanned models with rich geometric details.
arXiv Detail & Related papers (2020-02-24T14:13:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.