Related papers: A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives

A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives

URL: http://arxiv.org/abs/2602.21351v1
Date: Tue, 24 Feb 2026 20:37:38 GMT
Title: A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives
Authors: Dmitrii Pantiukhin, Ivan Kuznetsov, Boris Shapkin, Antonia Anna Jost, Thomas Jung, Nikolay Koldunov,
Abstract summary: PANGAEA-GPT is a hierarchical multi-agent framework designed for autonomous data discovery and analysis.<n>Unlike standard Large Language Model (LLM) wrappers, our architecture implements a centralized Supervisor-Worker topology.<n>We demonstrate the system's capacity to execute complex, multi-step deterministic runtime with minimal human intervention.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid accumulation of Earth science data has created a significant scalability challenge; while repositories like PANGAEA host vast collections of datasets, citation metrics indicate that a substantial portion remains underutilized, limiting data reusability. Here we present PANGAEA-GPT, a hierarchical multi-agent framework designed for autonomous data discovery and analysis. Unlike standard Large Language Model (LLM) wrappers, our architecture implements a centralized Supervisor-Worker topology with strict data-type-aware routing, sandboxed deterministic code execution, and self-correction via execution feedback, enabling agents to diagnose and resolve runtime errors. Through use-case scenarios spanning physical oceanography and ecology, we demonstrate the system's capacity to execute complex, multi-step workflows with minimal human intervention. This framework provides a methodology for querying and analyzing heterogeneous repository data through coordinated agent workflows.

Related papers

AgentSkiller: Scaling Generalist Agent Intelligence through Semantically Integrated Cross-Domain Data Synthesis [30.512393568258105]
Large Language Model agents demonstrate potential in solving real-world problems via tools, yet generalist intelligence is bottlenecked by scarce high-quality, long-horizon data.<n>We propose AgentSkiller, a fully automated framework synthesizing multi-turn interaction data across realistic, semantically linked domains.
arXiv Detail & Related papers (2026-02-10T03:21:42Z)
OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems.<n>Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm.<n>We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism [61.01709143437043]
We introduce a novel agent design framework centered on a Hierarchical Task Abstraction Mechanism (HTAM)<n>Specifically, HTAM moves beyond emulating social roles, instead structuring multi-agent systems into a logical hierarchy that mirrors the intrinsic task-dependency graph of a given domain.<n>We instantiate this framework as EarthAgent, a multi-agent system tailored for complex geospatial analysis.
arXiv Detail & Related papers (2025-11-21T12:25:47Z)
InferA: A Smart Assistant for Cosmological Ensemble Data [0.5130440339897478]
InferA is a multi-agent system that enables scalable and efficient scientific data analysis.<n>At the core of the architecture is a supervisor agent that orchestrates a team of specialized agents responsible for distinct phases of the data retrieval and analysis.<n>To demonstrate the framework's usability, we evaluate the system using ensemble runs from the HACC cosmology simulation which comprises several terabytes.
arXiv Detail & Related papers (2025-10-14T18:47:22Z)
Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models [99.85131798240808]
We introduce a novel generative framework called textitGuided Topology Diffusion (GTD)<n>Inspired by conditional discrete graph diffusion models, GTD formulates topology synthesis as an iterative construction process.<n>At each step, the generation is steered by a lightweight proxy model that predicts multi-objective rewards.<n>Experiments show that GTD can generate highly task-adaptive, sparse, and efficient communication topologies.
arXiv Detail & Related papers (2025-10-09T05:28:28Z)
CoDA: Agentic Systems for Collaborative Data Visualization [57.270599188947294]
Deep research has revolutionized data analysis, yet data scientists still devote substantial time to manually crafting visualizations.<n>Existing approaches, including simple single- or multi-agent systems, often oversimplify the task.<n>We introduce CoDA, a multi-agent system that employs specialized LLM agents for metadata analysis, task planning, code generation, and self-reflection.
arXiv Detail & Related papers (2025-10-03T17:30:16Z)
Leveraging Knowledge Graphs and LLM Reasoning to Identify Operational Bottlenecks for Warehouse Planning Assistance [1.2749527861829046]
Our framework integrates Knowledge Graphs (KGs) and Large Language Model (LLM)-based agents.<n>It transforms raw DES data into a semantically rich KG, capturing relationships between simulation events and entities.<n>An LLM-based agent uses iterative reasoning, generating interdependent sub-questions. For each sub-question, it creates Cypher queries for KG interaction, extracts information, and self-reflects to correct errors.
arXiv Detail & Related papers (2025-07-23T07:18:55Z)
Deep Research Agents: A Systematic Examination And Roadmap [109.53237992384872]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks.<n>In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z)
HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation [11.53083922927901]
HM-RAG is a novel Hierarchical Multi-agent Multimodal RAG framework.<n>It pioneers collaborative intelligence for dynamic knowledge synthesis across structured, unstructured, and graph-based data.
arXiv Detail & Related papers (2025-04-13T06:55:33Z)
Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases [0.0]
This paper presents a Small Language Model(SLM)-driven system that synergizes advancements in lightweight Retrieval-Augmented Generation (RAG) and semantic-aware data structuring.<n>By integrating MiniRAG's semantic-aware heterogeneous graph indexing and topology-enhanced retrieval with SLM-powered structured data extraction, our system addresses the limitations of traditional methods.<n> Experimental results demonstrate superior performance in accuracy and efficiency, while the introduction of semantic entropy as an unsupervised evaluation metric provides robust insights into model uncertainty.
arXiv Detail & Related papers (2025-04-08T03:28:03Z)
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery. Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering. Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.