Related papers: MapAgent: A Hierarchical Agent for Geospatial Reasoning with Dynamic Map Tool Integration

MapAgent: A Hierarchical Agent for Geospatial Reasoning with Dynamic Map Tool Integration

URL: http://arxiv.org/abs/2509.05933v2
Date: Tue, 14 Oct 2025 04:15:07 GMT
Title: MapAgent: A Hierarchical Agent for Geospatial Reasoning with Dynamic Map Tool Integration
Authors: Md Hasebul Hasan, Mahir Labib Dihan, Tanzima Hashem, Mohammed Eunus Ali, Md Rizwan Parvez,
Abstract summary: MapAgent is a hierarchical multi-agent plug-and-play framework with customized toolsets and agentic scaffolds for map-integrated geospatial reasoning.<n>A high-level planner decomposes complex queries into subgoals, which are routed to specialized modules.<n>For tool-heavy modules-such as map-based services-we then design a dedicated map-tool agent that efficiently orchestrates related APIs.<n>This hierarchical design reduces cognitive load, improves tool selection accuracy, and enables precise coordination across similar APIs.
Score: 10.665735568655828
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agentic AI has significantly extended the capabilities of large language models (LLMs) by enabling complex reasoning and tool use. However, most existing frameworks are tailored to domains such as mathematics, coding, or web automation, and fall short on geospatial tasks that require spatial reasoning, multi-hop planning, and real-time map interaction. To address these challenges, we introduce MapAgent, a hierarchical multi-agent plug-and-play framework with customized toolsets and agentic scaffolds for map-integrated geospatial reasoning. Unlike existing flat agent-based approaches that treat tools uniformly-often overwhelming the LLM when handling similar but subtly different geospatial APIs-MapAgent decouples planning from execution. A high-level planner decomposes complex queries into subgoals, which are routed to specialized modules. For tool-heavy modules-such as map-based services-we then design a dedicated map-tool agent that efficiently orchestrates related APIs adaptively in parallel to effectively fetch geospatial data relevant for the query, while simpler modules (e.g., solution generation or answer extraction) operate without additional agent overhead. This hierarchical design reduces cognitive load, improves tool selection accuracy, and enables precise coordination across similar APIs. We evaluate MapAgent on four diverse geospatial benchmarks-MapEval-Textual, MapEval-API, MapEval-Visual, and MapQA-and demonstrate substantial gains over state-of-the-art tool-augmented and agentic baselines. We open-source our framwork at https://github.com/Hasebul/MapAgent.

Related papers

OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems.<n>Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm.<n>We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism [61.01709143437043]
We introduce a novel agent design framework centered on a Hierarchical Task Abstraction Mechanism (HTAM)<n>Specifically, HTAM moves beyond emulating social roles, instead structuring multi-agent systems into a logical hierarchy that mirrors the intrinsic task-dependency graph of a given domain.<n>We instantiate this framework as EarthAgent, a multi-agent system tailored for complex geospatial analysis.
arXiv Detail & Related papers (2025-11-21T12:25:47Z)
DeepAgent: A General Reasoning Agent with Scalable Toolsets [111.6384541877723]
DeepAgent is an end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution.<n>To address the challenges of long-horizon interactions, we introduce an autonomous memory folding mechanism that compresses past interactions into structured episodic, working, and tool memories.<n>We develop an end-to-end reinforcement learning strategy, namely ToolPO, that leverages LLM-simulated APIs and applies tool-call advantage attribution to assign fine-grained credit to the tool invocation tokens.
arXiv Detail & Related papers (2025-10-24T16:24:01Z)
GeoJSON Agents:A Multi-Agent LLM Architecture for Geospatial Analysis-Function Calling vs Code Generation [7.335354895959486]
This study is the first to introduce an LLM multi-agent framework for GeoJSON data.<n>The architecture consists of three components-task parsing, agent collaboration, and result integration.
arXiv Detail & Related papers (2025-09-10T03:43:46Z)
AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search [58.98450205734779]
Large language model (LLM) agents have demonstrated strong capabilities across diverse domains.<n>Existing agent search methods suffer from three major limitations.<n>We introduce a comprehensive framework to address these challenges.
arXiv Detail & Related papers (2025-06-06T12:07:23Z)
ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks [54.52092001110694]
ThinkGeo is a benchmark designed to evaluate tool-augmented agents on remote sensing tasks via structured tool use and multi-step planning.<n>Inspired by tool-interaction paradigms, ThinkGeo includes human-curated queries spanning a wide range of real-world applications.<n>Our analysis reveals notable disparities in tool accuracy and planning consistency across models.
arXiv Detail & Related papers (2025-05-29T17:59:38Z)
IC-Mapper: Instance-Centric Spatio-Temporal Modeling for Online Vectorized Map Construction [18.975185033472968]
IC-Mapper is an instance-centric online mapping framework, which comprises two primary components.<n>We perform point sampling on the historical global map from a spatial dimension and integrate it with the detection results of instances corresponding to the current frame to achieve real-time expansion and update of the map.
arXiv Detail & Related papers (2025-03-05T20:28:34Z)
MapQaTor: An Extensible Framework for Efficient Annotation of Map-Based QA Datasets [3.3856216159724983]
We introduce MapQaTor, an open-source framework that streamlines the creation of traceable map-based QA datasets.<n>MapQaTor enables seamless integration with any maps API, allowing users to gather and visualize data from diverse sources.
arXiv Detail & Related papers (2024-12-30T15:33:19Z)
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction [75.93907511203317]
We propose MGMapNet (Multi-Granularity Map Network) to model map element with a multi-granularity representation. The proposed MGMapNet achieves state-of-the-art performance, surpassing MapTRv2 by 5.3 mAP on nuScenes and 4.4 mAP on Argoverse2 respectively.
arXiv Detail & Related papers (2024-10-10T09:05:23Z)
Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction [15.324464723174533]
This paper introduces MapQR, an end-to-end method with an emphasis on enhancing query capabilities for constructing online vectorized maps. MapQR utilizes a novel query design, called scatter-and-gather query, which is modelled by separate content and position parts explicitly. The proposed MapQR achieves the best mean average precision (mAP) and maintains good efficiency on both nuScenes and Argoverse 2.
arXiv Detail & Related papers (2024-02-27T11:43:09Z)
MGeo: Multi-Modal Geographic Pre-Training Method [49.78466122982627]
We propose a novel query-POI matching method Multi-modal Geographic language model (MGeo) MGeo represents GC as a new modality and is able to fully extract multi-modal correlations for accurate query-POI matching. Our proposed multi-modal pre-training method can significantly improve the query-POI matching capability of generic PTMs.
arXiv Detail & Related papers (2023-01-11T03:05:12Z)
GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting [121.42898228997538]
We propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization. We leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph. Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction.
arXiv Detail & Related papers (2022-11-04T16:10:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.