Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning
- URL: http://arxiv.org/abs/2509.13352v1
- Date: Sun, 14 Sep 2025 08:46:40 GMT
- Title: Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning
- Authors: Anis Koubaa, Khaled Gabr,
- Abstract summary: Existing UAV frameworks lack context-aware reasoning, autonomous decision-making, and ecosystem-level integration.<n>This paper introduces the Agentic UAVs framework, a five-layer architecture (Perception, Reasoning, Action, Integration, Learning)<n>A ROS2 and Gazebo-based prototype integrates YOLOv11 object detection with GPT-4 reasoning and local Gemma-3 deployment.
- Score: 3.4643961367503575
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Unmanned Aerial Vehicles (UAVs) are increasingly deployed in defense, surveillance, and disaster response, yet most systems remain confined to SAE Level 2--3 autonomy. Their reliance on rule-based control and narrow AI restricts adaptability in dynamic, uncertain missions. Existing UAV frameworks lack context-aware reasoning, autonomous decision-making, and ecosystem-level integration; critically, none leverage Large Language Model (LLM) agents with tool-calling for real-time knowledge access. This paper introduces the Agentic UAVs framework, a five-layer architecture (Perception, Reasoning, Action, Integration, Learning) that augments UAVs with LLM-driven reasoning, database querying, and third-party system interaction. A ROS2 and Gazebo-based prototype integrates YOLOv11 object detection with GPT-4 reasoning and local Gemma-3 deployment. In simulated search-and-rescue scenarios, agentic UAVs achieved higher detection confidence (0.79 vs. 0.72), improved person detection rates (91% vs. 75%), and markedly increased action recommendation (92% vs. 4.5%). These results confirm that modest computational overhead enables qualitatively new levels of autonomy and ecosystem integration.
Related papers
- $α^3$-SecBench: A Large-Scale Evaluation Suite of Security, Resilience, and Trust for LLM-based UAV Agents over 6G Networks [3.099103925863002]
We introduce $3$-SecBench, the first large-scale evaluation suite for assessing the security-aware autonomy of LLM-based UAV agents under realistic adversarial interference.<n>We evaluate 23 state-of-the-art LLMs from major industrial providers and leading AI labs using thousands of adversarially augmented UAV episodes sampled from a corpus of 113,475 missions spanning 175 threat types. Normalized overall scores range from 12.9% to 57.1%, highlighting a significant gap between anomaly detection and security-aware autonomous decision-making.
arXiv Detail & Related papers (2026-01-26T18:25:07Z) - Agentic AI Meets Edge Computing in Autonomous UAV Swarms [3.9444299467643025]
Agentic AI, powered by large language models (LLMs), with autonomous reasoning, planning, and execution, opens new operational possibilities.<n>However, infrastructure constraints, dynamic environments, and the computational demands of multi-agent coordination limit real-world deployment.<n>This paper investigates the integration of LLM-based agentic AI and edge computing to realize scalable and resilient autonomy in UAV swarms.
arXiv Detail & Related papers (2026-01-20T19:45:33Z) - AerialMind: Towards Referring Multi-Object Tracking in UAV Scenarios [64.51320327698231]
We introduce AerialMind, the first large-scale RMOT benchmark in UAV scenarios.<n>We develop an innovative semi-automated collaborative agent-based labeling assistant framework.<n>We also propose HawkEyeTrack, a novel method that collaboratively enhances vision-language representation learning.
arXiv Detail & Related papers (2025-11-26T04:44:27Z) - UAVBench: An Open Benchmark Dataset for Autonomous and Agentic AI UAV Systems via LLM-Generated Flight Scenarios [3.099103925863002]
UAVBench is an open benchmarking benchmark for UAV flight scenarios generated through large language models (LLMs)<n>We present UAVBench_MCQ, a reasoning-oriented extension containing 50,000 multiple-choice questions spanning ten cognitive and ethical reasoning styles.<n>We evaluate 32 state-of-the-art LLMs, including GPT-5, ChatGPT-4o, Gemini 2.5 Flash, DeepSeek V3, Q3wenwenB, and ERNIE 4.5 300B, and find strong performance in perception and policy reasoning but persistent challenges in ethics-aware and resource-constrained decision-making.
arXiv Detail & Related papers (2025-11-14T12:51:48Z) - Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail [85.47497935739936]
Alpamayo-R1 (AR1) is a vision-language-action model that integrates Chain of Causation reasoning with trajectory planning.<n>We show AR1 achieves 12% improvement in planning accuracy on challenging cases compared to a trajectory-only baseline.<n>We plan to release AR1 models and a subset of the CoC in a future update.
arXiv Detail & Related papers (2025-10-30T01:25:34Z) - AirFed: Federated Graph-Enhanced Multi-Agent Reinforcement Learning for Multi-UAV Cooperative Mobile Edge Computing [21.4907371859268]
Multiple Unmanned Aerial Vehicles (UAVs) cooperative Mobile Edge Computing (MEC) systems face critical challenges in coordinating trajectory planning, task offloading, and resource allocation.<n>Existing approaches suffer from limited scalability, slow convergence, and inefficient knowledge sharing among UAVs.<n>This paper proposes AirFed, a novel federated graph-enhanced multi-agent reinforcement learning framework.
arXiv Detail & Related papers (2025-10-27T06:31:35Z) - When UAV Swarm Meets IRS: Collaborative Secure Communications in Low-altitude Wireless Networks [68.45202147860537]
Low-altitude wireless networks (LAWNs) provide enhanced coverage, reliability, and throughput for diverse applications.<n>These networks face significant security vulnerabilities from both known and potential unknown eavesdroppers.<n>We propose a novel secure communication framework for LAWNs where the selected UAVs within a swarm function as a virtual antenna array.
arXiv Detail & Related papers (2025-10-25T02:02:14Z) - LIMI: Less is More for Agency [49.63355240818081]
LIMI (Less Is More for Intelligent Agency) demonstrates that agency follows radically different development principles.<n>We show that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior.<n>Our findings establish the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.
arXiv Detail & Related papers (2025-09-22T10:59:32Z) - An Automated Attack Investigation Approach Leveraging Threat-Knowledge-Augmented Large Language Models [17.220143037047627]
Advanced Persistent Threats (APTs) compromise high-value systems to steal data or disrupt operations.<n>Existing methods suffer from poor platform generality, limited generalization to evolving tactics, and an inability to produce analyst-ready reports.<n>We present an LLM-empowered attack investigation framework augmented with a dynamically adaptable Kill-Chain-aligned threat knowledge base.
arXiv Detail & Related papers (2025-09-01T08:57:01Z) - OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks [52.87238755666243]
We present OmniEAR, a framework for evaluating how language models reason about physical interactions, tool usage, and multi-agent coordination in embodied tasks.<n>We model continuous physical properties and complex spatial relationships across 1,500 scenarios spanning household and industrial domains.<n>Our systematic evaluation reveals severe performance degradation when models must reason from constraints.
arXiv Detail & Related papers (2025-08-07T17:54:15Z) - LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks [57.27815890269697]
This work focuses on maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under energy constraints.<n>We introduce a Large Language Model (LLM)-guided multi-agent learning approach.<n>Results show that our method outperforms existing baselines in secrecy and energy efficiency.
arXiv Detail & Related papers (2025-07-23T04:22:57Z) - AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage.<n>We show that scaling agentic reasoning system at test-time substantially enhances robustness without compromising model utility.<n> Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z) - Generative Diffusion-based Contract Design for Efficient AI Twins Migration in Vehicular Embodied AI Networks [55.15079732226397]
Embodied AI is a rapidly advancing field that bridges the gap between cyberspace and physical space.
In VEANET, embodied AI twins act as in-vehicle AI assistants to perform diverse tasks supporting autonomous driving.
arXiv Detail & Related papers (2024-10-02T02:20:42Z) - Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents [44.34340798542]
Large Language Models (LLMs) have shown remarkable capabilities in natural language tasks requiring complex reasoning.
Traditional supervised pre-training on static datasets falls short in enabling autonomous agent capabilities.
We propose a framework that combines guided Monte Carlo Tree Search (MCTS) search with a self-critique mechanism and iterative fine-tuning on agent interactions.
arXiv Detail & Related papers (2024-08-13T20:52:13Z) - Efficient UAV Trajectory-Planning using Economic Reinforcement Learning [65.91405908268662]
We introduce REPlanner, a novel reinforcement learning algorithm inspired by economic transactions to distribute tasks between UAVs.
We formulate the path planning problem as a multi-agent economic game, where agents can cooperate and compete for resources.
As the system computes task distributions via UAV cooperation, it is highly resilient to any change in the swarm size.
arXiv Detail & Related papers (2021-03-03T20:54:19Z) - Distributed Variable-Baseline Stereo SLAM from two UAVs [17.513645771137178]
In this article, we employ two UAVs equipped with one monocular camera and one IMU each, to exploit their view overlap and relative distance measurements.
In order to control the glsuav agents autonomously, we propose a decentralized collaborative estimation scheme.
We demonstrate the effectiveness of the approach at high altitude flights of up to 160m, going significantly beyond the capabilities of state-of-the-art VIO methods.
arXiv Detail & Related papers (2020-09-10T12:16:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.