Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
- URL: http://arxiv.org/abs/2507.19586v1
- Date: Fri, 25 Jul 2025 18:00:21 GMT
- Title: Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
- Authors: Shengyuan Wang, Jie Feng, Tianhui Liu, Dan Pei, Yong Li,
- Abstract summary: Large language models (LLMs) possess extensive world knowledge, including geospatial knowledge.<n>LLMs often generate inaccurate geospatial knowledge, leading to geospatial hallucinations.<n>We propose a comprehensive evaluation framework for geospatial hallucinations, leveraging structured geospatial knowledge graphs for controlled assessment.
- Score: 9.759523915137299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) possess extensive world knowledge, including geospatial knowledge, which has been successfully applied to various geospatial tasks such as mobility prediction and social indicator prediction. However, LLMs often generate inaccurate geospatial knowledge, leading to geospatial hallucinations (incorrect or inconsistent representations of geospatial information) that compromise their reliability. While the phenomenon of general knowledge hallucination in LLMs has been widely studied, the systematic evaluation and mitigation of geospatial hallucinations remain largely unexplored. To address this gap, we propose a comprehensive evaluation framework for geospatial hallucinations, leveraging structured geospatial knowledge graphs for controlled assessment. Through extensive evaluation across 20 advanced LLMs, we uncover the hallucinations in their geospatial knowledge. Building on these insights, we introduce a dynamic factuality aligning method based on Kahneman-Tversky Optimization (KTO) to mitigate geospatial hallucinations in LLMs, leading to a performance improvement of over 29.6% on the proposed benchmark. Extensive experimental results demonstrate the effectiveness of our benchmark and learning algorithm in enhancing the trustworthiness of LLMs in geospatial knowledge and reasoning tasks.
Related papers
- From Pixels to Places: A Systematic Benchmark for Evaluating Image Geolocalization Ability in Large Language Models [14.178064117544082]
Image geolocalization is important for applications in crisis response, digital forensics, and location-based intelligence.<n>Recent advances in large language models (LLMs) offer new opportunities for visual reasoning.<n>We introduce a benchmark called IMAGEO-Bench that systematically evaluates accuracy, distance error, geospatial bias, and reasoning process.
arXiv Detail & Related papers (2025-08-03T06:04:33Z) - OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence [51.0456395687016]
multimodal large language models (LLMs) have opened new frontiers in artificial intelligence.<n>We propose a MLLM (OmniGeo) tailored to geospatial applications.<n>By combining the strengths of natural language understanding and spatial reasoning, our model enhances the ability of instruction following and the accuracy of GeoAI systems.
arXiv Detail & Related papers (2025-03-20T16:45:48Z) - Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
By inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z) - PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics [51.17512229589]
PoLLMgraph is a model-based white-box detection and forecasting approach for large language models.
We show that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics.
Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors.
arXiv Detail & Related papers (2024-04-06T20:02:20Z) - Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs.
We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge.
We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
arXiv Detail & Related papers (2024-01-19T15:39:49Z) - Exploring the Cognitive Knowledge Structure of Large Language Models: An
Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z) - GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z) - Are Large Language Models Geospatially Knowledgeable? [21.401931052512595]
This paper investigates the extent of geospatial knowledge, awareness, and reasoning abilities encoded within Large Language Models (LLM)
With a focus on autoregressive language models, we devise experimental approaches related to (i) probing LLMs for geo-coordinates to assess geospatial knowledge, (ii) using geospatial and non-geospatial prepositions to gauge their geospatial awareness, and (iii) utilizing a multidimensional scaling (MDS) experiment to assess the models' geospatial reasoning capabilities.
arXiv Detail & Related papers (2023-10-09T17:20:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.