RAG for Geoscience: What We Expect, Gaps and Opportunities
- URL: http://arxiv.org/abs/2508.11246v1
- Date: Fri, 15 Aug 2025 06:33:27 GMT
- Title: RAG for Geoscience: What We Expect, Gaps and Opportunities
- Authors: Runlong Yu, Shiyuan Luo, Rahul Ghosh, Lingyao Li, Yiqun Xie, Xiaowei Jia
- Abstract summary: Retrieval-Augmented Generation (RAG) enhances language models by combining retrieval with generation. We envision Geo-RAG, a next-generation paradigm that reimagines RAG as a modular retrieve $\rightarrow$ reason $\rightarrow$ generate $\rightarrow$ verify loop. Geo-RAG supports four core capabilities: (i) retrieval of multi-modal Earth data; (ii) reasoning under physical and domain constraints; (iii) generation of science-grade artifacts; and (iv) verification of generated hypotheses against numerical models, ground measurements, and expert assessments.
- Score: 15.069356714106808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-Augmented Generation (RAG) enhances language models by combining retrieval with generation. However, its current workflow remains largely text-centric, limiting its applicability in geoscience. Many geoscientific tasks are inherently evidence-hungry. Typical examples involve imputing missing observations using analog scenes, retrieving equations and parameters to calibrate models, geolocating field photos based on visual cues, or surfacing historical case studies to support policy analyses. A simple ``retrieve-then-generate'' pipeline is insufficient for these needs. We envision Geo-RAG, a next-generation paradigm that reimagines RAG as a modular retrieve $\rightarrow$ reason $\rightarrow$ generate $\rightarrow$ verify loop. Geo-RAG supports four core capabilities: (i) retrieval of multi-modal Earth data; (ii) reasoning under physical and domain constraints; (iii) generation of science-grade artifacts; and (iv) verification of generated hypotheses against numerical models, ground measurements, and expert assessments. This shift opens new opportunities for more trustworthy and transparent geoscience workflows.
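The retrieve $\rightarrow$ reason $\rightarrow$ generate $\rightarrow$ verify loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Evidence` type, the four stage functions, the widen-retrieval-on-failure policy, and all numbers are hypothetical stand-ins chosen only to show how the stages might be chained. Here the "task" is imputing a missing scalar observation from analog observations, the "physical constraint" is a simple value range, and "verification" is a tolerance check against a reference measurement.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Evidence:
    source: str      # hypothetical label, e.g. "analog_a"
    distance: float  # toy dissimilarity to the query (smaller = closer analog)
    value: float     # toy scalar observation, e.g. water temperature in deg C


def retrieve(store: List[Evidence], k: int) -> List[Evidence]:
    """Return the k closest analogs (stand-in for multi-modal retrieval)."""
    return sorted(store, key=lambda e: e.distance)[:k]


def reason(evidence: List[Evidence], lo: float, hi: float) -> List[Evidence]:
    """Drop evidence outside a physical range (stand-in for domain constraints)."""
    return [e for e in evidence if lo <= e.value <= hi]


def generate(evidence: List[Evidence]) -> Optional[float]:
    """Produce an estimate; here simply the mean of the surviving evidence."""
    if not evidence:
        return None
    return sum(e.value for e in evidence) / len(evidence)


def verify(estimate: Optional[float], reference: float, tol: float) -> bool:
    """Check the estimate against a reference (ground) measurement."""
    return estimate is not None and abs(estimate - reference) <= tol


def geo_rag_loop(
    store: List[Evidence],
    reference: float,
    lo: float,
    hi: float,
    tol: float,
    max_rounds: int = 5,
) -> Tuple[Optional[float], int]:
    """Chain the four stages, retrieving one more analog after each failed check.

    Returns (estimate, rounds_used); the estimate is None if nothing verified.
    """
    for round_idx in range(1, max_rounds + 1):
        evidence = retrieve(store, k=round_idx)
        estimate = generate(reason(evidence, lo, hi))
        if verify(estimate, reference, tol):
            return estimate, round_idx
    return None, max_rounds


if __name__ == "__main__":
    store = [
        Evidence("sensor_glitch", 0.1, -999.0),  # violates the physical range
        Evidence("analog_a", 0.2, 14.0),
        Evidence("analog_b", 0.3, 16.0),
    ]
    print(geo_rag_loop(store, reference=15.0, lo=0.0, hi=40.0, tol=0.5))
```

In this sketch a failed verification feeds back into retrieval, which is one simple way to make the pipeline a loop rather than a single retrieve-then-generate pass; a real system would likely use richer feedback than widening `k`.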
Related papers
- Enhancing Geometric Perception in VLMs via Translator-Guided Reinforcement Learning [52.075928878249066]
Vision-language models (VLMs) often struggle with geometric reasoning due to their limited perception of fundamental diagram elements. We introduce GeoPerceive, a benchmark comprising diagram instances paired with domain-specific language representations. We propose GeoDPO, a translator-guided reinforcement learning framework.
arXiv Detail & Related papers (2026-02-26T07:28:04Z)
- GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics [91.17301794848025]
This paper presents GeoAgent, a model capable of reasoning closely with humans and deriving fine-grained address conclusions. Previous RL-based methods have achieved breakthroughs in performance and interpretability, but concerns remain because of their reliance on AI-generated chain-of-thought (CoT) data and training strategies.
arXiv Detail & Related papers (2026-02-13T04:48:05Z)
- GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization [53.080882980294795]
Current research on agentic visual reasoning enables deep multimodal understanding but primarily focuses on image manipulation tools. In this work, we revisit the geolocalization task, which requires not only nuanced visual grounding but also web search to confirm or refine hypotheses. Since existing geolocalization benchmarks fail to meet the need for high-resolution imagery and the localization challenge for deep agentic reasoning, we curate GeoBench. We propose GeoVista, an agentic model that seamlessly integrates tool invocation within the reasoning loop, including an image-zoom-in tool to magnify regions of interest and a web-search tool to retrieve related
arXiv Detail & Related papers (2025-11-19T18:59:22Z)
- GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI [52.13138825802668]
GeoFMs are transforming Earth Observation, but evaluation lacks standardized protocols. GEO-Bench-2 addresses this with a comprehensive framework spanning classification, segmentation, regression, object detection, and instance segmentation. Code, data, and leaderboard for GEO-Bench-2 are publicly released under a permissive license.
arXiv Detail & Related papers (2025-11-19T17:45:02Z)
- GeoBS: Information-Theoretic Quantification of Geographic Bias in AI Models [34.611626290720295]
We establish an information-theoretic framework for geo-bias evaluation, called GeoBS (Geo-Bias Scores). We propose three novel geo-bias scores that explicitly take intricate spatial factors into consideration.
arXiv Detail & Related papers (2025-09-27T20:07:21Z)
- GeoAnalystBench: A GeoAI benchmark for assessing large language models for spatial analysis workflow and code generation [32.22754624992446]
We present GeoAnalystBench, a benchmark of 50 Python-based tasks derived from real-world geospatial problems. Using this benchmark, we assess both proprietary and open source models. Results reveal a clear gap: proprietary models such as ChatGPT-4o-mini achieve high validity (95%) and stronger code alignment.
arXiv Detail & Related papers (2025-09-07T00:51:57Z)
- GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains [11.704082783192467]
Geo Reason Enhancement (GRE) Suite is a novel framework that augments Visual Language Models with structured reasoning chains for interpretable location inference. First, we introduce GRE30K, a high-quality geo-localization reasoning dataset designed to facilitate fine-grained visual and contextual analysis. Next, we present the GRE model, which employs a multi-stage reasoning strategy to progressively infer scene attributes, local details, and semantic features, thereby narrowing down potential geographic regions with enhanced precision.
arXiv Detail & Related papers (2025-05-24T13:48:57Z)
- OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence [51.0456395687016]
Multimodal large language models (MLLMs) have opened new frontiers in artificial intelligence. We propose an MLLM (OmniGeo) tailored to geospatial applications. By combining the strengths of natural language understanding and spatial reasoning, our model enhances instruction-following ability and the accuracy of GeoAI systems.
arXiv Detail & Related papers (2025-03-20T16:45:48Z)
- Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework [59.42946541163632]
We introduce a comprehensive geolocation framework with three key components: GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric. We demonstrate that GeoCoT significantly boosts geolocation accuracy by up to 25% while enhancing interpretability.
arXiv Detail & Related papers (2025-02-19T14:21:25Z)
- GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks [84.86699025256705]
We present GEOBench-VLM, a benchmark specifically designed to evaluate Vision-Language Models (VLMs) on geospatial tasks. Our benchmark features over 10,000 manually verified instructions spanning diverse visual conditions, object types, and scales. We evaluate several state-of-the-art VLMs to assess performance on geospatial-specific challenges.
arXiv Detail & Related papers (2024-11-28T18:59:56Z)
- Graph Retrieval-Augmented Generation: A Survey [28.979898837538958]
Retrieval-Augmented Generation (RAG) has achieved remarkable success in addressing the challenges of Large Language Models (LLMs) without necessitating retraining.
This paper provides the first comprehensive overview of GraphRAG methodologies.
We formalize the GraphRAG workflow, encompassing Graph-Based Indexing, Graph-Guided Retrieval, and Graph-Enhanced Generation.
arXiv Detail & Related papers (2024-08-15T12:20:24Z)
- GeoDTR+: Toward generic cross-view geolocalization via geometric disentanglement [20.346145927174373]
Cross-View Geo-Localization (CVGL) estimates the location of a ground image by matching it to a geo-tagged aerial image in a database.
Existing methods still suffer from poor performance in cross-area evaluation, in which the training and testing data are captured from completely distinct areas.
We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details.
In this work, we propose GeoDTR+ with an enhanced GLE module that better models the correlations among visual features.
arXiv Detail & Related papers (2023-08-18T15:32:01Z)
- GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z)
- Learning Structures in Earth Observation Data with Gaussian Processes [67.27044745471207]
This paper reviews the main theoretical GP developments in the field.
New algorithms that respect the signal and noise characteristics, that provide feature rankings automatically, and that allow applicability of associated uncertainty intervals are discussed.
arXiv Detail & Related papers (2020-12-22T10:46:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.