Related papers: GeoJSON Agents:A Multi-Agent LLM Architecture for Geospatial Analysis-Function Calling vs Code Generation

URL: http://arxiv.org/abs/2509.08863v2
Date: Fri, 12 Sep 2025 08:26:37 GMT
Title: GeoJSON Agents:A Multi-Agent LLM Architecture for Geospatial Analysis-Function Calling vs Code Generation
Authors: Qianqian Luo, Liuchang Xu, Qingming Lin, Sensen Wu, Ruichen Mao, Chao Wang, Hailin Feng, Bo Huang, Zhenhong Du
Abstract summary: This study is the first to introduce an LLM multi-agent framework for GeoJSON data. The architecture consists of three components: task parsing, agent collaboration, and result integration.
Score: 7.335354895959486
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: LLMs have made substantial progress in task automation and natural language understanding. However, without expertise in GIS, they continue to encounter limitations. To address these issues, we propose GeoJSON Agents, a multi-agent LLM architecture. This framework transforms natural language tasks into structured GeoJSON operation commands and processes spatial data using two widely adopted LLM enhancement techniques: Function Calling and Code Generation. The architecture consists of three components (task parsing, agent collaboration, and result integration) aimed at enhancing both the performance and scalability of GIS automation. The Planner agent interprets natural language tasks into structured GeoJSON commands. Then, specialized Worker agents collaborate according to assigned roles to perform spatial data processing and analysis, either by invoking predefined function APIs or by dynamically generating and executing Python-based spatial analysis code. Finally, the system integrates the outputs from multiple execution rounds into reusable, standards-compliant GeoJSON files. To systematically evaluate the performance of the two approaches, we constructed a benchmark dataset of 70 tasks with varying complexity and conducted experiments using OpenAI's GPT-4o as the core model. Results indicate that the Function Calling-based GeoJSON Agent achieved an accuracy of 85.71%, while the Code Generation-based agent reached 97.14%, both significantly outperforming the best-performing general-purpose model (48.57%). Further analysis reveals that Code Generation provides greater flexibility, whereas the Function Calling approach offers more stable execution. This study is the first to introduce an LLM multi-agent framework for GeoJSON data and to compare the strengths and limitations of two mainstream LLM enhancement methods, offering new perspectives for improving GeoAI system performance.

Related papers

Enhancing Geometric Perception in VLMs via Translator-Guided Reinforcement Learning [52.075928878249066]
Vision-language models (VLMs) often struggle with geometric reasoning due to their limited perception of fundamental diagram elements. We introduce GeoPerceive, a benchmark comprising diagram instances paired with domain-specific language representations. We propose GeoDPO, a translator-guided reinforcement learning framework.
arXiv Detail & Related papers (2026-02-26T07:28:04Z)
OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems. Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm. We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
GeoSQL-Eval: First Evaluation of LLMs on PostGIS-Based NL2GeoSQL Queries [12.523407991161315]
We present GeoSQL-Eval, the first end-to-end automated evaluation framework for PostGIS query generation. We also release a public GeoSQL-Eval leaderboard platform for continuous testing and global comparison.
arXiv Detail & Related papers (2025-09-28T04:50:48Z)
GeoAnalystBench: A GeoAI benchmark for assessing large language models for spatial analysis workflow and code generation [32.22754624992446]
We present GeoAnalystBench, a benchmark of 50 Python-based tasks derived from real-world geospatial problems. Using this benchmark, we assess both proprietary and open source models. Results reveal a clear gap: proprietary models such as ChatGPT-4o-mini achieve high validity (95%) and stronger code alignment.
arXiv Detail & Related papers (2025-09-07T00:51:57Z)
GeoJSEval: An Automated Evaluation Framework for Large Language Models on JavaScript-Based Geospatial Computation and Visualization Code Generation [8.019960494784039]
GeoJSEval is a multimodal, function-level automatic evaluation framework for LLMs in JavaScript-based code generation. It includes 432 function-level tasks and 2,071 structured test cases spanning five widely used JavaScript geospatial libraries and 25 mainstream geospatial data types. We conduct a comprehensive evaluation of 18 state-of-the-art LLMs using GeoJSEval, revealing significant performance disparities and bottlenecks in spatial semantic understanding, code reliability, and function invocation accuracy.
arXiv Detail & Related papers (2025-07-28T06:38:38Z)
AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search [58.98450205734779]
Large language model (LLM) agents have demonstrated strong capabilities across diverse domains. Existing agent search methods suffer from three major limitations. We introduce a comprehensive framework to address these challenges.
arXiv Detail & Related papers (2025-06-06T12:07:23Z)
ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks [54.52092001110694]
ThinkGeo is a benchmark designed to evaluate tool-augmented agents on remote sensing tasks via structured tool use and multi-step planning. Inspired by tool-interaction paradigms, ThinkGeo includes human-curated queries spanning a wide range of real-world applications. Our analysis reveals notable disparities in tool accuracy and planning consistency across models.
arXiv Detail & Related papers (2025-05-29T17:59:38Z)
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration [57.95306827012784]
We propose GeoGen, a pipeline that can automatically generate step-wise reasoning paths for geometry diagrams. By leveraging precise symbolic reasoning, GeoGen produces large-scale, high-quality question-answer pairs. We train GeoLogic, a Large Language Model (LLM), using synthetic data generated by GeoGen.
arXiv Detail & Related papers (2025-04-17T09:13:46Z)
Geo-FuB: A Method for Constructing an Operator-Function Knowledge Base for Geospatial Code Generation Tasks Using Large Language Models [0.5242869847419834]
This study introduces a framework to construct such a knowledge base, leveraging geospatial script semantics. An example knowledge base, Geo-FuB, built from 154,075 Google Earth Engine scripts, is available on GitHub.
arXiv Detail & Related papers (2024-10-28T12:50:27Z)
An LLM Agent for Automatic Geospatial Data Analysis [5.842462214442362]
Large language models (LLMs) are being used in data science code generation tasks. Their application to geospatial data processing is challenging due to difficulties in incorporating complex data structures and spatial constraints. We introduce GeoAgent, a new interactive framework designed to help LLMs handle geospatial data processing more effectively.
arXiv Detail & Related papers (2024-10-24T14:47:25Z)
GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models. We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods. Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation [96.71370747681078]
We introduce MLAgentBench, a suite of 13 tasks ranging from improving model performance on CIFAR-10 to recent research problems like BabyLM. For each task, an agent can perform actions like reading/writing files, executing code, and inspecting outputs. We benchmark agents based on Claude v1.0, Claude v2.1, Claude v3 Opus, GPT-4, GPT-4-turbo, Gemini-Pro, and Mixtral and find that a Claude v3 Opus agent is the best in terms of success rate.
arXiv Detail & Related papers (2023-10-05T04:06:12Z)
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration [55.35849138235116]
We propose automatically selecting a team of agents from candidates to collaborate in a dynamic communication structure toward different tasks and domains. Specifically, we build a framework named Dynamic LLM-Powered Agent Network (DyLAN) for LLM-powered agent collaboration. We demonstrate that DyLAN outperforms strong baselines in code generation, decision-making, general reasoning, and arithmetic reasoning tasks with moderate computational cost.
arXiv Detail & Related papers (2023-10-03T16:05:48Z)
