Fugu-MT Paper Translation (Abstract): GPSBench: Do Large Language Models Understand GPS Coordinates?

Paper overview: GPSBench: Do Large Language Models Understand GPS Coordinates?

arxiv url: http://arxiv.org/abs/2602.16105v1
Date: Wed, 18 Feb 2026 00:33:26 GMT
Status: Translation complete
System update date: 2026-02-19 15:58:30.476165
Title: GPSBench: Do Large Language Models Understand GPS Coordinates?
Title (reference translation): GPSBench: Do Large Language Models Understand GPS Coordinates?
Authors: Thinh Hung Truong, Jey Han Lau, Jianzhong Qi
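Abstract summary: Large language models (LLMs) are increasingly deployed in applications that interact with the physical world, such as navigation, robotics, and mapping. Despite that, LLMs' ability to reason about GPS coordinates and real-world geography remains underexplored. We introduce GPSBench, a dataset of 57,800 samples across 17 tasks for evaluating geospatial reasoning in LLMs.
Reference score (independently computed attention score): 31.228269455751363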
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Models (LLMs) are increasingly deployed in applications that interact with the physical world, such as navigation, robotics, or mapping, making robust geospatial reasoning a critical capability. Despite that, LLMs' ability to reason about GPS coordinates and real-world geography remains underexplored. We introduce GPSBench, a dataset of 57,800 samples across 17 tasks for evaluating geospatial reasoning in LLMs, spanning geometric coordinate operations (e.g., distance and bearing computation) and reasoning that integrates coordinates with world knowledge. Focusing on intrinsic model capabilities rather than tool use, we evaluate 14 state-of-the-art LLMs and find that GPS reasoning remains challenging, with substantial variation across tasks: models are generally more reliable at real-world geographic reasoning than at geometric computations. Geographic knowledge degrades hierarchically, with strong country-level performance but weak city-level localization, while robustness to coordinate noise suggests genuine coordinate understanding rather than memorization. We further show that GPS-coordinate augmentation can improve performance in downstream geospatial tasks, and that finetuning induces trade-offs between gains in geometric computation and degradation in world knowledge. Our dataset and reproducible code are available at https://github.com/joey234/gpsbench
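Abstract (reference translation): Large language models (LLMs) are increasingly deployed in applications that interact with the physical world, such as navigation, robotics, and mapping, which makes geospatial reasoning a critical capability. Despite that, LLMs' ability to reason about GPS coordinates and real-world geography remains underexplored. GPSBench is a dataset of 57,800 samples across 17 tasks for evaluating geospatial reasoning in LLMs, spanning geometric coordinate operations (e.g., distance and bearing computation) and reasoning that integrates coordinates with world knowledge. Focusing on intrinsic model capabilities rather than tool use, we evaluate 14 state-of-the-art LLMs and find that GPS reasoning remains challenging and varies substantially across tasks. Geographic knowledge degrades hierarchically, with strong country-level performance but weak city-level localization, while robustness to coordinate noise suggests genuine coordinate understanding rather than memorization. We further show that GPS-coordinate augmentation improves downstream geospatial tasks, and that finetuning induces a trade-off between gains in geometric computation and degradation of world knowledge. Our dataset and reproducible code are available at https://github.com/joey234/gpsbench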
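The abstract names distance and bearing computation as examples of geometric coordinate operations. As a rough illustration of what such operations involve, the sketch below computes great-circle distance with the haversine formula and the initial bearing between two GPS points. It is not taken from the GPSBench code: the function names, the spherical-Earth model, and the radius constant are assumptions made here for illustration, and the benchmark may define its tasks differently.

```python
import math

# Assumption: a spherical Earth with the mean radius in kilometres.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    # atan2 returns a value in (-180, 180]; normalise to [0, 360).
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


if __name__ == "__main__":
    # Two points on the equator, 90 degrees of longitude apart.
    print(haversine_km(0.0, 0.0, 0.0, 90.0))
    print(initial_bearing_deg(0.0, 0.0, 0.0, 90.0))
```

Operations like these are deterministic and easy to check with a tool, which is why a benchmark that focuses on intrinsic model capabilities can score model answers against computed reference values.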

Related papers