Cooperative Target Detection with AUVs: A Dual-Timescale Hierarchical MARDL Approach
- URL: http://arxiv.org/abs/2509.13381v1
- Date: Tue, 16 Sep 2025 09:31:32 GMT
- Title: Cooperative Target Detection with AUVs: A Dual-Timescale Hierarchical MARDL Approach
- Authors: Zhang Xueyao, Yang Bo, Yu Zhiwen, Cao Xuelin, George C. Alexandropoulos, Merouane Debbah, Chau Yuen
- Abstract summary: In adversarial environments, achieving efficient collaboration while ensuring covert operations is a key challenge for underwater cooperative missions. We propose a novel dual time-scale Hierarchical Multi-Agent Proximal Policy Optimization framework. We show that the proposed framework achieves rapid convergence, outperforms benchmark algorithms, and maximizes long-term cooperative efficiency while ensuring covert operations.
- Score: 59.81681228738068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous Underwater Vehicles (AUVs) have shown great potential for cooperative detection and reconnaissance. However, collaborative AUV communications introduce risks of exposure. In adversarial environments, achieving efficient collaboration while ensuring covert operations becomes a key challenge for underwater cooperative missions. In this paper, we propose a novel dual time-scale Hierarchical Multi-Agent Proximal Policy Optimization (H-MAPPO) framework. The high-level component determines the individuals participating in the task based on a central AUV, while the low-level component reduces exposure probabilities through power and trajectory control by the participating AUVs. Simulation results show that the proposed framework achieves rapid convergence, outperforms benchmark algorithms in terms of performance, and maximizes long-term cooperative efficiency while ensuring covert operations.
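The dual time-scale structure described in the abstract can be sketched as a toy control loop. Everything below is an illustrative assumption, not the authors' implementation: the function names, the re-selection period `K`, and the random stand-in policies are hypothetical placeholders for the trained high- and low-level MAPPO actors.

```python
import random

# Hypothetical sketch of a dual time-scale hierarchical loop: a high-level
# policy re-selects the participating AUVs every K steps (slow timescale),
# while a low-level policy adjusts each participant's transmit power and
# heading at every step (fast timescale). Random choices stand in for the
# trained policies.

K = 5          # slow-timescale period (steps between team re-selection)
NUM_AUVS = 4   # fleet size
STEPS = 20     # simulated episode length

def high_level_select(num_auvs, rng):
    """Slow timescale: choose which AUVs participate in the task."""
    team = [i for i in range(num_auvs) if rng.random() < 0.5]
    return team or [0]  # keep at least one participant

def low_level_act(auv_id, rng):
    """Fast timescale: per-AUV power level and heading change."""
    power = rng.uniform(0.1, 1.0)           # normalized transmit power
    heading_delta = rng.uniform(-0.3, 0.3)  # heading adjustment (radians)
    return power, heading_delta

def run_episode(seed=0):
    rng = random.Random(seed)
    team, trace = [], []
    for t in range(STEPS):
        if t % K == 0:  # slow timescale fires only here
            team = high_level_select(NUM_AUVS, rng)
        actions = {i: low_level_act(i, rng) for i in team}  # fast timescale
        trace.append((t, tuple(team), actions))
    return trace

trace = run_episode()
```

The key property of the two timescales is visible in the trace: the team composition is constant between slow-timescale ticks, while power and trajectory actions change every step.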
Related papers
- SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent, which translates high-level research objectives into standardized experimental configurations, with an Experiment Manager, which orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z) - Hybrid Differential Reward: Combining Temporal Difference and Action Gradients for Efficient Multi-Agent Reinforcement Learning in Cooperative Driving [15.387374116985605]
In multi-vehicle cooperative driving tasks, traditional state-based reward functions suffer from vanishing reward differences. This paper proposes a novel Hybrid Differential Reward mechanism to solve this problem. It guides agents to learn high-quality cooperative policies that effectively balance traffic efficiency and safety.
arXiv Detail & Related papers (2025-11-21T02:58:04Z) - An LLM-based Framework for Human-Swarm Teaming Cognition in Disaster Search and Rescue [11.300720465575608]
Large-scale disaster Search And Rescue (SAR) operations are persistently challenged by complex terrain and disrupted communications. While Unmanned Aerial Vehicle (UAV) swarms offer a promising solution for tasks like wide-area search and supply delivery, their effective coordination places a significant cognitive burden on human operators. This study proposes a novel LLM-CRF system that leverages Large Language Models (LLMs) to model and augment human-swarm teaming cognition.
arXiv Detail & Related papers (2025-11-06T04:27:20Z) - Joint Optimization of Cooperation Efficiency and Communication Covertness for Target Detection with AUVs [105.81167650318054]
This paper investigates underwater cooperative target detection using autonomous underwater vehicles (AUVs). We first formulate a joint trajectory and power control optimization problem, and then present an innovative hierarchical action management framework to solve it. Under the centralized training and decentralized execution paradigm, our target detection framework enables adaptive covert cooperation while satisfying both energy and mobility constraints.
arXiv Detail & Related papers (2025-10-21T02:14:11Z) - LLM Meets the Sky: Heuristic Multi-Agent Reinforcement Learning for Secure Heterogeneous UAV Networks [57.27815890269697]
This work focuses on maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under energy constraints. We introduce a Large Language Model (LLM)-guided multi-agent learning approach. Results show that our method outperforms existing baselines in secrecy and energy efficiency.
arXiv Detail & Related papers (2025-07-23T04:22:57Z) - Aerial Secure Collaborative Communications under Eavesdropper Collusion in Low-altitude Economy: A Generative Swarm Intelligent Approach [84.20358039333756]
We introduce distributed collaborative beamforming (DCB) into AAV swarms and handle the eavesdropper collusion by controlling the corresponding signal distributions. We minimize the two-way known secrecy capacity and maximum sidelobe level to avoid information leakage from the known and unknown eavesdroppers. We propose a novel generative swarm intelligence (GenSI) framework to solve the problem with less overhead.
arXiv Detail & Related papers (2025-03-02T04:02:58Z) - CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation [98.11670473661587]
CaPo improves cooperation efficiency with two phases: 1) meta-plan generation, and 2) progress-adaptive meta-plan and execution. Experimental results on the ThreeDworld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate that CaPo achieves a much higher task completion rate and efficiency compared with state-of-the-art methods.
arXiv Detail & Related papers (2024-11-07T13:08:04Z) - Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout [16.454305212398328]
We propose a goal-conditioned hierarchical reinforcement learning (HRL) framework named Guided Cooperation via Model-based Rollout (GCMR).
GCMR aims to bridge inter-layer information synchronization and cooperation by exploiting forward dynamics.
Experimental results demonstrate that incorporating the proposed GCMR framework with a disentangled variant of HIGL, namely ACLG, yields more stable and robust policy improvement.
arXiv Detail & Related papers (2023-09-24T00:13:16Z) - Practical Collaborative Perception: A Framework for Asynchronous and Multi-Agent 3D Object Detection [9.967263440745432]
Occlusion is a major challenge for LiDAR-based object detection methods.
State-of-the-art V2X methods resolve the performance-bandwidth tradeoff using a mid-collaboration approach.
We devise a simple yet effective collaboration method that achieves a better bandwidth-performance tradeoff than prior methods.
arXiv Detail & Related papers (2023-07-04T03:49:42Z) - Boosting Value Decomposition via Unit-Wise Attentive State Representation for Cooperative Multi-Agent Reinforcement Learning [11.843811402154408]
We propose a simple yet powerful method that alleviates partial observability and efficiently promotes coordination by the UNit-wise attentive State Representation (UNSR).
In UNSR, each agent learns a compact and disentangled unit-wise state representation outputted from transformer blocks, and produces its local action-value function.
Experimental results demonstrate that our method achieves superior performance and data efficiency compared to solid baselines on the StarCraft II micromanagement challenge.
arXiv Detail & Related papers (2023-05-12T00:33:22Z) - Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning [2.9360071145551068]
We propose a novel cooperative MARL algorithm named interactive actor-critic (IAC).
IAC models the interaction of agents from perspectives of policy and value function.
We extend the value decomposition methods to continuous control tasks and evaluate IAC on benchmark tasks including classic control and multi-agent particle environments.
arXiv Detail & Related papers (2021-02-10T01:58:28Z) - UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
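The "linear decomposition of universal successor features" in the UneVEn entry above has a compact mathematical core that can be sketched in a few lines. The feature dimension, weight vectors, and numbers below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Toy illustration of successor-feature value decomposition: the action
# value for a task described by weight vector w is the linear combination
# Q_w(s, a) = psi(s, a) . w of universal successor features psi. One set
# of learned features can then be re-weighted to evaluate related tasks
# without relearning values from scratch.

rng = np.random.default_rng(0)
n_actions, d = 3, 4

# Successor features for each action in some fixed state (stand-in values).
psi = rng.normal(size=(n_actions, d))

w_task = np.array([1.0, 0.0, -0.5, 0.2])  # weights defining the current task
q_values = psi @ w_task                   # Q_w(s, a) for every action
greedy_action = int(np.argmax(q_values))

# A related task reuses the SAME psi with different weights:
w_related = np.array([0.0, 1.0, 0.0, 0.0])
q_related = psi @ w_related
```

This re-weighting step is what lets methods in this family learn a set of related tasks simultaneously: the expensive part (psi) is shared, and only the cheap linear head changes per task.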
This list is automatically generated from the titles and abstracts of the papers in this site.