Related papers: Understanding the Geospatial Reasoning Capabilities of LLMs: A Trajectory Recovery Perspective

Understanding the Geospatial Reasoning Capabilities of LLMs: A Trajectory Recovery Perspective

URL: http://arxiv.org/abs/2510.01639v1
Date: Thu, 02 Oct 2025 03:37:41 GMT
Title: Understanding the Geospatial Reasoning Capabilities of LLMs: A Trajectory Recovery Perspective
Authors: Thinh Hung Truong, Jey Han Lau, Jianzhong Qi,
Abstract summary: This paper explores whether Large Language Models (LLMs) can read road network maps and perform navigation.<n>We frame trajectory recovery as a proxy task, which requires models to reconstruct masked GPS traces.<n>Using road network as context, our prompting framework enables LLMs to generate valid paths without accessing any external navigation tools.
Score: 31.228269455751363
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We explore the geospatial reasoning capabilities of Large Language Models (LLMs), specifically, whether LLMs can read road network maps and perform navigation. We frame trajectory recovery as a proxy task, which requires models to reconstruct masked GPS traces, and introduce GLOBALTRACE, a dataset with over 4,000 real-world trajectories across diverse regions and transportation modes. Using road network as context, our prompting framework enables LLMs to generate valid paths without accessing any external navigation tools. Experiments show that LLMs outperform off-the-shelf baselines and specialized trajectory recovery models, with strong zero-shot generalization. Fine-grained analysis shows that LLMs have strong comprehension of the road network and coordinate systems, but also pose systematic biases with respect to regions and transportation modes. Finally, we demonstrate how LLMs can enhance navigation experiences by reasoning over maps in flexible ways to incorporate user preferences.

Related papers

Concise Geometric Description as a Bridge: Unleashing the Potential of LLM for Plane Geometry Problem Solving [50.05273675575345]
PlaneThought Problem Solving (PGPS) aims to solve a plane geometric problem based on a geometric diagram and problem textual descriptions.<n>Large Language Models (LLMs) possess strong reasoning skills, their direct application to PGPS is hindered by their inability to process visual diagrams.<n>We train a MLLM Interpreter to generate geometric descriptions for the visual diagram, and an off-the-shelf LLM is utilized to perform reasoning.
arXiv Detail & Related papers (2026-01-29T02:03:33Z)
RoadMind: Towards a Geospatial AI Expert for Disaster Response [6.038207189709353]
Large Language Models (LLMs) have shown impressive performance across a range of natural language tasks, but remain limited in their ability to reason about geospatial data.<n>We present RoadMind, a self-supervised framework that enhances the geospatial reasoning capabilities of LLMs using structured data from OpenStreetMap (OSM)<n>Our results show that models trained via RoadMind significantly outperform strong baselines, including state-of-the-art LLMs equipped with advanced prompt engineering.
arXiv Detail & Related papers (2025-09-18T09:46:55Z)
TrajSceneLLM: A Multimodal Perspective on Semantic GPS Trajectory Analysis [0.0]
We propose TrajSceneLLM, a multimodal perspective for enhancing semantic understanding of GPS trajectories.<n>We validate the proposed framework on Travel Mode Identification (TMI), a critical task for analyzing travel choices and understanding mobility behavior.<n>This semantic enhancement promises significant potential for diverse downstream applications and future research in artificial intelligence.
arXiv Detail & Related papers (2025-06-19T15:31:40Z)
Can LLMs Learn to Map the World from Local Descriptions? [50.490593949836146]
This study investigates whether Large Language Models (LLMs) can construct coherent global spatial cognition.<n> Experiments conducted in a simulated urban environment demonstrate that LLMs exhibit latent representations aligned with real-world spatial distributions.
arXiv Detail & Related papers (2025-05-27T08:22:58Z)
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap [51.198001060683296]
Large Language Models (LLMs) offer transformative potential to address transportation challenges.<n>This survey first presents LLM4TR, a novel conceptual framework that systematically categorizes the roles of LLMs in transportation.<n>For each role, our review spans diverse applications, from traffic prediction and autonomous driving to safety analytics and urban mobility optimization.
arXiv Detail & Related papers (2025-03-27T11:56:27Z)
Navigating Motion Agents in Dynamic and Cluttered Environments through LLM Reasoning [69.5875073447454]
This paper advances motion agents empowered by large language models (LLMs) toward autonomous navigation in dynamic and cluttered environments.<n>Our training-free framework supports multi-agent coordination, closed-loop replanning, and dynamic obstacle avoidance without retraining or fine-tuning.
arXiv Detail & Related papers (2025-03-10T13:39:09Z)
Universal Model Routing for Efficient LLM Inference [69.86195589350264]
Model routing is a technique for reducing the inference cost of large language models (LLMs)<n>We propose UniRoute, a new approach to the problem of dynamic routing, where new, previously unobserved LLMs are available at test time.<n>We show that these are estimates of a theoretically optimal routing rule, and quantify their errors via an excess risk bound.
arXiv Detail & Related papers (2025-02-12T20:30:28Z)
Strada-LLM: Graph LLM for traffic prediction [62.2015839597764]
A considerable challenge in traffic prediction lies in handling the diverse data distributions caused by vastly different traffic conditions.<n>We propose a graph-aware LLM for traffic prediction that considers proximal traffic information.<n>We adopt a lightweight approach for efficient domain adaptation when facing new data distributions in few-shot fashion.
arXiv Detail & Related papers (2024-10-28T09:19:29Z)
Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments [1.18749525824656]
Guide-LLM is a text-based agent designed to assist persons with visual impairments (PVI) in navigating large indoor environments.<n>Our approach features a novel text-based topological map that enables the LLM to plan global paths.<n>Simulated experiments demonstrate the system's efficacy in guiding PVI, underscoring its potential as a significant advancement in assistive technology.
arXiv Detail & Related papers (2024-10-28T01:58:21Z)
Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation [64.84996994779443]
We propose a novel Affordances-Oriented Planner for continuous vision-language navigation (VLN) task. Our AO-Planner integrates various foundation models to achieve affordances-oriented low-level motion planning and high-level decision-making. Experiments on the challenging R2R-CE and RxR-CE datasets show that AO-Planner achieves state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2024-07-08T12:52:46Z)
MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains [4.941781282578696]
In the Vision-and-Language Navigation (VLN) task, the agent is required to navigate to a destination following a natural language instruction. While learning-based approaches have been a major solution to the task, they suffer from high training costs and lack of interpretability. Recently, Large Language Models (LLMs) have emerged as a promising tool for VLN due to their strong generalization capabilities.
arXiv Detail & Related papers (2024-05-17T08:33:27Z)
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models [17.495162643127003]
We introduce the NavGPT to reveal the reasoning capability of GPT models in complex embodied scenes. NavGPT takes the textual descriptions of visual observations, navigation history, and future explorable directions as inputs to reason the agent's current status. We show that NavGPT is capable of generating high-quality navigational instructions from observations and actions along a path.
arXiv Detail & Related papers (2023-05-26T14:41:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.