Related papers: Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents

URL: http://arxiv.org/abs/2509.23141v2
Date: Thu, 16 Oct 2025 07:27:45 GMT
Title: Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents
Authors: Peilin Feng, Zhutao Lv, Junyan Ye, Xiaolei Wang, Xinjie Huo, Jinhua Yu, Wanghan Xu, Wenlong Zhang, Lei Bai, Conghui He, Weijia Li
Abstract summary: Earth observation is essential for understanding the states of the Earth system. Recent MLLMs have advanced EO research, but they still lack the capability to tackle complex tasks that require multi-step reasoning. We introduce Earth-Agent, the first agentic framework that unifies RGB and spectral EO data within an MCP-based tool ecosystem.
Score: 49.3216026940601
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Earth observation (EO) is essential for understanding the evolving states of the Earth system. Although recent MLLMs have advanced EO research, they still lack the capability to tackle complex tasks that require multi-step reasoning and the use of domain-specific tools. Agent-based methods offer a promising direction, but current attempts remain in their infancy, confined to RGB perception, shallow reasoning, and lacking systematic evaluation protocols. To overcome these limitations, we introduce Earth-Agent, the first agentic framework that unifies RGB and spectral EO data within an MCP-based tool ecosystem, enabling cross-modal, multi-step, and quantitative spatiotemporal reasoning beyond pretrained MLLMs. Earth-Agent supports complex scientific tasks such as geophysical parameter retrieval and quantitative spatiotemporal analysis by dynamically invoking expert tools and models across modalities. To support comprehensive evaluation, we further propose Earth-Bench, a benchmark of 248 expert-curated tasks with 13,729 images, spanning spectrum, products and RGB modalities, and equipped with a dual-level evaluation protocol that assesses both reasoning trajectories and final outcomes. We conduct comprehensive experiments varying different LLM backbones, comparisons with general agent frameworks, and comparisons with MLLMs on remote sensing benchmarks, demonstrating both the effectiveness and potential of Earth-Agent. Earth-Agent establishes a new paradigm for EO analysis, moving the field toward scientifically grounded, next-generation applications of LLMs in Earth observation.

Related papers

Opportunities in AI/ML for the Rubin LSST Dark Energy Science Collaboration [63.61423859450929]
This white paper surveys the current landscape of AI/ML across DESC's primary cosmological probes and cross-cutting analyses. We identify key methodological research priorities, including Bayesian inference at scale, physics-informed methods, validation frameworks, and active learning for discovery.
arXiv Detail & Related papers (2026-01-20T18:46:42Z)
What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm for Studying Environment Understanding [50.35012849818872]
Large language model (LLM) agents have demonstrated remarkable capabilities in complex decision-making and tool-use tasks. We propose Task-to-Quiz (T2Q), a deterministic and automated evaluation paradigm designed to decouple task execution from world-state understanding. Our experiments reveal that task success is often a poor proxy for environment understanding, and that current memory mechanisms cannot effectively help agents acquire a grounded model of the environment.
arXiv Detail & Related papers (2026-01-14T14:09:11Z)
Multi-Agent Reinforcement Learning for Heterogeneous Satellite Cluster Resources Optimization [19.16014340215772]
Two optical satellites and one SAR satellite operate cooperatively in low Earth orbit to capture ground targets and manage their limited onboard resources efficiently. Traditional optimization methods struggle to handle the real-time, uncertain, and decentralized nature of Earth Observation (EO) operations. This study systematically formulates the optimization problem from single-satellite to multi-satellite scenarios. Using a near-realistic simulation environment built on the Basilisk and BSK-RL frameworks, we evaluate the performance and stability of state-of-the-art MARL algorithms.
arXiv Detail & Related papers (2025-11-16T21:47:04Z)
Can Agents Judge Systematic Reviews Like Humans? Evaluating SLRs with LLM-based Multi-Agent System [1.3052252174353483]
Systematic Literature Reviews (SLRs) are foundational to evidence-based research but remain labor-intensive and prone to inconsistency across disciplines. We present an LLM-based SLR evaluation copilot built on a Multi-Agent System (MAS) architecture to assist researchers in assessing the overall quality of systematic literature reviews. Unlike conventional single-agent methods, our design integrates a specialized agentic approach aligned with PRISMA guidelines to support more structured and interpretable evaluations.
arXiv Detail & Related papers (2025-09-21T21:17:23Z)
Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study [9.798174763420896]
The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions. Traditional optimisation approaches struggle to handle the real-time decision-making demands of dynamic EO missions. We investigate RL-based autonomous EO mission planning by modelling single-satellite operations and extending to multi-satellite constellations.
arXiv Detail & Related papers (2025-06-18T07:42:11Z)
EarthMind: Leveraging Cross-Sensor Data for Advanced Earth Observation Interpretation with a Unified Multimodal LLM [103.7537991413311]
Earth Observation (EO) data analysis is vital for monitoring environmental and human dynamics. Recent Multimodal Large Language Models (MLLMs) show potential in EO understanding but remain restricted to single-sensor inputs. We propose EarthMind, a unified vision-language framework that handles both single- and cross-sensor inputs.
arXiv Detail & Related papers (2025-06-02T13:36:05Z)
ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks [54.52092001110694]
ThinkGeo is a benchmark designed to evaluate tool-augmented agents on remote sensing tasks via structured tool use and multi-step planning. Inspired by tool-interaction paradigms, ThinkGeo includes human-curated queries spanning a wide range of real-world applications. Our analysis reveals notable disparities in tool accuracy and planning consistency across models.
arXiv Detail & Related papers (2025-05-29T17:59:38Z)
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments [11.97783742296183]
Embodied Mobile Manipulation in Open Environments (EMMOE) is a benchmark that requires agents to interpret user instructions and execute long-horizon everyday tasks in continuous space. EMMOE integrates high-level and low-level embodied tasks into a unified framework, along with three new metrics for more diverse assessment. We design a sophisticated agent system consisting of an LLM with Direct Preference Optimization (DPO), lightweight navigation and manipulation models, and multiple error detection mechanisms.
arXiv Detail & Related papers (2025-03-11T16:42:36Z)
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM [51.91311158085973]
Methods for detecting AI-generated media have evolved rapidly. General-purpose detectors based on MLLMs integrate authenticity verification, explainability, and localization capabilities. Ethical and security considerations have emerged as critical global concerns.
arXiv Detail & Related papers (2025-02-07T12:18:20Z)
EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues [46.601134018876955]
We introduce EarthDial, a conversational assistant specifically designed for Earth Observation (EO) data. EarthDial supports multi-spectral, multi-temporal, and multi-resolution imagery, enabling a wide range of remote sensing tasks. Our experimental results on 44 downstream datasets demonstrate that EarthDial outperforms existing generic and domain-specific models.
arXiv Detail & Related papers (2024-12-19T18:57:13Z)