Related papers: GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization

GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization

URL: http://arxiv.org/abs/2505.13731v1
Date: Mon, 19 May 2025 21:04:46 GMT
Title: GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization
Authors: Pengyue Jia, Seongheon Park, Song Gao, Xiangyu Zhao, Yixuan Li,
Abstract summary: We propose GeoRanker, a distance-aware ranking framework for image geolocalization.<n>We introduce a multi-order distance loss that ranks both absolute and relative distances, enabling the model to reason over structured spatial relationships.<n>GeoRanker achieves state-of-the-art results on two well-established benchmarks.
Score: 30.983556433953076
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Worldwide image geolocalization-the task of predicting GPS coordinates from images taken anywhere on Earth-poses a fundamental challenge due to the vast diversity in visual content across regions. While recent approaches adopt a two-stage pipeline of retrieving candidates and selecting the best match, they typically rely on simplistic similarity heuristics and point-wise supervision, failing to model spatial relationships among candidates. In this paper, we propose GeoRanker, a distance-aware ranking framework that leverages large vision-language models to jointly encode query-candidate interactions and predict geographic proximity. In addition, we introduce a multi-order distance loss that ranks both absolute and relative distances, enabling the model to reason over structured spatial relationships. To support this, we curate GeoRanking, the first dataset explicitly designed for geographic ranking tasks with multimodal candidate information. GeoRanker achieves state-of-the-art results on two well-established benchmarks (IM2GPS3K and YFCC4K), significantly outperforming current best methods.

Related papers

VLM-Guided Visual Place Recognition for Planet-Scale Geo-Localization [24.433604332415204]
We propose a novel hybrid geo-localization framework that combines the strengths of vision-language models and visual place recognition.<n>We evaluate our approach on multiple geo-localization benchmarks and show that it consistently outperforms prior state-of-the-art methods.
arXiv Detail & Related papers (2025-07-23T12:23:03Z)
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation [50.433911327489554]
We introduce EarthMapper, a novel framework for controllable satellite-map translation.<n>We also contribute CNSatMap, a large-scale dataset comprising 302,132 precisely aligned satellite-map pairs across 38 Chinese cities.<n> experiments on CNSatMap and the New York dataset demonstrate EarthMapper's superior performance.
arXiv Detail & Related papers (2025-04-28T02:41:12Z)
Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework [59.42946541163632]
We introduce a comprehensive geolocation framework with three key components.<n>GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric.<n>We demonstrate that GeoCoT significantly boosts geolocation accuracy by up to 25% while enhancing interpretability.
arXiv Detail & Related papers (2025-02-19T14:21:25Z)
G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models [40.69217368870192]
We propose a novel framework for worldwide geolocalization based on Retrieval-Augmented Generation (RAG) G3 consists of three steps, i.e., Geo-alignment, Geo-diversification, and Geo-verification. Experiments on two well-established datasets verify the superiority of G3 compared to other state-of-the-art methods.
arXiv Detail & Related papers (2024-05-23T15:37:06Z)
GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth. Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task. We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates. We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
GeoNet: Benchmarking Unsupervised Adaptation across Geographies [71.23141626803287]
We study the problem of geographic robustness and make three main contributions. First, we introduce a large-scale dataset GeoNet for geographic adaptation. Second, we hypothesize that the major source of domain shifts arise from significant variations in scene context. Third, we conduct an extensive evaluation of several state-of-the-art unsupervised domain adaptation algorithms and architectures.
arXiv Detail & Related papers (2023-03-27T17:59:34Z)
MGeo: Multi-Modal Geographic Pre-Training Method [49.78466122982627]
We propose a novel query-POI matching method Multi-modal Geographic language model (MGeo) MGeo represents GC as a new modality and is able to fully extract multi-modal correlations for accurate query-POI matching. Our proposed multi-modal pre-training method can significantly improve the query-POI matching capability of generic PTMs.
arXiv Detail & Related papers (2023-01-11T03:05:12Z)
Leveraging EfficientNet and Contrastive Learning for Accurate Global-scale Location Estimation [15.633461635276337]
We propose a mixed classification-retrieval scheme for global-scale image geolocation. Our approach demonstrates very competitive performance on four public datasets.
arXiv Detail & Related papers (2021-05-17T07:18:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.