Related papers: GPSBench: Do Large Language Models Understand GPS Coordinates?

URL: http://arxiv.org/abs/2602.16105v1
Date: Wed, 18 Feb 2026 00:33:26 GMT
Title: GPSBench: Do Large Language Models Understand GPS Coordinates?
Authors: Thinh Hung Truong, Jey Han Lau, Jianzhong Qi
Abstract summary: Large Language Models (LLMs) are increasingly deployed in applications that interact with the physical world, such as navigation, robotics, or mapping. Despite that, LLMs' ability to reason about GPS coordinates and real-world geography remains underexplored. We introduce GPSBench, a dataset of 57,800 samples across 17 tasks for evaluating geospatial reasoning in LLMs.
Score: 31.228269455751363
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Models (LLMs) are increasingly deployed in applications that interact with the physical world, such as navigation, robotics, or mapping, making robust geospatial reasoning a critical capability. Despite that, LLMs' ability to reason about GPS coordinates and real-world geography remains underexplored. We introduce GPSBench, a dataset of 57,800 samples across 17 tasks for evaluating geospatial reasoning in LLMs, spanning geometric coordinate operations (e.g., distance and bearing computation) and reasoning that integrates coordinates with world knowledge. Focusing on intrinsic model capabilities rather than tool use, we evaluate 14 state-of-the-art LLMs and find that GPS reasoning remains challenging, with substantial variation across tasks: models are generally more reliable at real-world geographic reasoning than at geometric computations. Geographic knowledge degrades hierarchically, with strong country-level performance but weak city-level localization, while robustness to coordinate noise suggests genuine coordinate understanding rather than memorization. We further show that GPS-coordinate augmentation can improve performance on downstream geospatial tasks, and that finetuning induces trade-offs between gains in geometric computation and degradation in world knowledge. Our dataset and reproducible code are available at https://github.com/joey234/gpsbench
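
The abstract names distance and bearing computation as examples of geometric coordinate operations. As an illustration only, the sketch below computes both for a pair of latitude/longitude points using the standard haversine and forward-azimuth formulas. It is not taken from the GPSBench code; the function names and the spherical-Earth radius constant are assumptions made for this example.

```python
import math

# Assumption: a spherical Earth with the mean radius in kilometres.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    # atan2 returns a value in (-180, 180]; normalise to [0, 360).
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
```

For example, `haversine_km(0, 0, 0, 90)` gives a quarter of the sphere's circumference, and `initial_bearing_deg(0, 0, 0, 90)` points due east. An ellipsoidal model would give slightly different distances than this spherical approximation.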

Related papers

Performance and Generalizability Impacts of Incorporating Location Encoders into Deep Learning for Dynamic PM2.5 Estimation [0.39762912548964846]
We study the effect of incorporating geolocation into deep learning for a dynamic and spatially heterogeneous application. We estimate daily surface-level PM2.5 across the contiguous United States using satellite and ground-based observations. Results show that raw coordinates can improve performance within regions by supporting spatial generalization, but can reduce generalizability across regions.
arXiv Detail & Related papers (2025-05-24T02:00:34Z)
TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving [106.04001249574786]
TrustGeoGen is a data engine that generates formally verified geometric problems to establish a principled and trustworthy benchmark. Our engine integrates four key innovations: 1) Multimodal Alignment, which synchronizes the generation of diagrams, text, and step-by-step solutions; 2) Formal Verification, ensuring all reasoning paths are rule-compliant; 3) Connection Thinking, bridging formal deduction with human-like logical steps; and 4) our GeoExplore series algorithms, which produce diverse problem variants with multiple solutions and self-reflective backtracking.
arXiv Detail & Related papers (2025-04-22T10:45:23Z)
OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence [51.0456395687016]
Multimodal large language models (MLLMs) have opened new frontiers in artificial intelligence. We propose an MLLM (OmniGeo) tailored to geospatial applications. By combining the strengths of natural language understanding and spatial reasoning, our model enhances instruction-following ability and the accuracy of GeoAI systems.
arXiv Detail & Related papers (2025-03-20T16:45:48Z)
Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework [59.42946541163632]
We introduce a comprehensive geolocation framework with three key components: GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric. We demonstrate that GeoCoT significantly boosts geolocation accuracy by up to 25% while enhancing interpretability.
arXiv Detail & Related papers (2025-02-19T14:21:25Z)
GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models. We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods. Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates. We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE. We collect data from open-released geographic resources and introduce six natural language understanding tasks. We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram [33.62866585222121]
We propose a new neural solver called PGPSNet to fuse multi-modal information efficiently. PGPSNet is endowed with rich knowledge of geometry theorems and geometric representation. We build a new large-scale and fine-annotated GPS dataset named PGPS9K.
arXiv Detail & Related papers (2023-02-22T02:38:25Z)
A GIS Aided Approach for Geolocalizing an Unmanned Aerial System Using Deep Learning [0.4297070083645048]
We propose an alternative approach to geolocalize a UAS when the GPS signal is degraded or denied. Given that the UAS has a downward-looking camera on its platform that can acquire real-time images as the platform flies, we apply modern deep learning techniques to achieve geolocalization. We extract GIS information from OpenStreetMap (OSM) to semantically segment matched features into building and terrain classes.
arXiv Detail & Related papers (2022-08-25T17:51:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.