Distributed Area Coverage with High Altitude Balloons Using Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2510.03823v1
- Date: Sat, 04 Oct 2025 14:39:45 GMT
- Title: Distributed Area Coverage with High Altitude Balloons Using Multi-Agent Reinforcement Learning
- Authors: Adam Haroon, Tristan Schuler
- Abstract summary: High Altitude Balloons (HABs) can leverage stratospheric wind layers for limited horizontal control, enabling applications in reconnaissance, environmental monitoring, and communications networks. Existing multi-agent HAB coordination approaches use deterministic methods like Voronoi partitioning and extremum seeking control for large global constellations. This work presents the first systematic application of multi-agent reinforcement learning (MARL) to HAB coordination for distributed area coverage.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High Altitude Balloons (HABs) can leverage stratospheric wind layers for limited horizontal control, enabling applications in reconnaissance, environmental monitoring, and communications networks. Existing multi-agent HAB coordination approaches use deterministic methods like Voronoi partitioning and extremum seeking control for large global constellations, which perform poorly for smaller teams and localized missions. While single-agent reinforcement learning control has been demonstrated on HABs, coordinated multi-agent reinforcement learning (MARL) has not yet been investigated. This work presents the first systematic application of MARL to HAB coordination for distributed area coverage. We extend our previously developed reinforcement learning simulation environment (RLHAB) to support cooperative multi-agent learning, enabling multiple agents to operate simultaneously in realistic atmospheric conditions. We adapt QMIX for HAB area coverage coordination, leveraging Centralized Training with Decentralized Execution to address atmospheric vehicle coordination challenges. Our approach employs specialized observation spaces providing individual state, environmental context, and teammate data, with hierarchical rewards prioritizing coverage while encouraging spatial distribution. We demonstrate that QMIX achieves similar performance to the theoretically optimal geometric deterministic method for distributed area coverage, validating the MARL approach and providing a foundation for more complex autonomous multi-HAB missions where deterministic methods become intractable.
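To make the QMIX adaptation concrete, below is a minimal PyTorch sketch of a QMIX-style monotonic mixing network under Centralized Training with Decentralized Execution, plus a toy hierarchical coverage reward. The class names, tensor dimensions, reward weights, and the `coverage_reward` helper are illustrative assumptions for this listing, not the authors' RLHAB implementation.

```python
# Illustrative QMIX-style mixing network (CTDE). All names, dimensions, and
# reward weights below are assumptions, not the paper's actual code.
import torch
import torch.nn as nn


class QMixer(nn.Module):
    """Mixes per-agent Q-values into a joint Q_tot, monotone in each input."""

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        # Hypernetworks produce mixing weights from the global state; taking
        # their absolute value enforces monotonicity (dQ_tot/dQ_i >= 0).
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        batch = agent_qs.size(0)
        w1 = torch.abs(self.hyper_w1(state)).view(batch, self.n_agents, -1)
        b1 = self.hyper_b1(state).view(batch, 1, -1)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(batch, -1, 1)
        b2 = self.hyper_b2(state).view(batch, 1, 1)
        return (torch.bmm(hidden, w2) + b2).squeeze(-1).squeeze(-1)  # (batch,)


def coverage_reward(newly_covered_cells: int, mean_pairwise_sep_km: float,
                    w_cover: float = 1.0, w_spread: float = 0.1) -> float:
    """Hypothetical hierarchical shaping: coverage first, spatial spread second."""
    return w_cover * newly_covered_cells + w_spread * mean_pairwise_sep_km
```

In this sketch, per-agent Q-values computed from decentralized observations are mixed with the global state to form Q_tot for the centralized TD target during training; at execution time each agent acts greedily on its own Q-network, so no centralized information is required.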
Related papers
- Hybrid Belief Reinforcement Learning for Efficient Coordinated Spatial Exploration [3.0222726254970174]
Pure model-based approaches provide structured uncertainty estimates but lack adaptive policy learning. This paper presents a hybrid belief-reinforcement learning framework to address this gap. Results show 10.8% higher cumulative reward and 38% faster convergence over baselines.
arXiv Detail & Related papers (2026-03-04T00:00:34Z) - Multi-Agent Deep Reinforcement Learning Under Constrained Communications [2.7126292487109005]
We present a distributed multi-agent reinforcement learning (MARL) framework that removes the need for centralized critics or global information. We develop a novel Graph Attention Network (D-GAT) that performs global state inference through multi-hop communication. We also develop the distributed graph-attention MAPPO (DG-MAPPO) -- a distributed MARL framework where agents optimize local policies and value functions.
arXiv Detail & Related papers (2026-01-22T21:07:18Z) - PC2P: Multi-Agent Path Finding via Personalized-Enhanced Communication and Crowd Perception [12.114711272142031]
PC2P is a novel distributed MAPF method derived from a Q-learning-based MARL framework. We introduce a personalized-enhanced communication mechanism based on dynamic graph topology. To resolve extreme deadlock issues, we propose a region-based deadlock-breaking strategy.
arXiv Detail & Related papers (2026-01-06T03:11:26Z) - Decentralized Consensus Inference-based Hierarchical Reinforcement Learning for Multi-Constrained UAV Pursuit-Evasion Game [0.0]
The Cooperative Evasion and Formation Coverage task is one of the most challenging problems in multi-constrained pursuit-evasion games (MC-PEG). We propose a novel two-level framework, which delegates localization to a high-level policy while adopting a low-level policy to manage obstacle avoidance, navigation, and formation. The experimental results, including high-fidelity software-in-the-loop (SITL) simulations, validate that CI-HRL provides a superior solution with enhanced collaborative evasion and task completion capabilities for the swarm.
arXiv Detail & Related papers (2025-06-22T18:23:58Z) - Benchmarking LLMs' Swarm intelligence [51.648605206159125]
Large Language Models (LLMs) show potential for complex reasoning, yet their capacity for emergent coordination in Multi-Agent Systems (MAS) remains largely unexplored. We introduce SwarmBench, a novel benchmark designed to systematically evaluate LLMs acting as decentralized agents on coordination tasks. We propose metrics for coordination effectiveness and analyze emergent group dynamics.
arXiv Detail & Related papers (2025-05-07T12:32:01Z) - Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
The low-altitude economy holds significant potential for development in areas such as communication and sensing. We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z) - ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization [11.620274237352026]
Offline reinforcement learning (RL) has garnered significant attention for its ability to learn effective policies from pre-collected datasets.
MARL presents additional challenges due to the large joint state-action space and the complexity of multi-agent behaviors.
We introduce a regularizer in the space of stationary distributions to better handle distributional shift.
arXiv Detail & Related papers (2024-10-02T18:56:10Z) - Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z) - Stateful active facilitator: Coordination and Environmental
Heterogeneity in Cooperative Multi-Agent Reinforcement Learning [71.53769213321202]
We formalize the notions of coordination level and heterogeneity level of an environment.
We present HECOGrid, a suite of multi-agent environments that facilitates empirical evaluation of different MARL approaches.
We propose a Centralized Training Decentralized Execution learning approach that enables agents to work efficiently in high-coordination and high-heterogeneity environments.
arXiv Detail & Related papers (2022-10-04T18:17:01Z) - Toward multi-target self-organizing pursuit in a partially observable
Markov game [34.22625222101752]
This work proposes a framework for decentralized multi-agent systems to improve the implicit coordination capabilities in search and pursuit.
We model a self-organizing system as a partially observable Markov game (POMG) featured by large-scale, decentralization, partial observation, and noncommunication.
The proposed distributed algorithm, fuzzy self-organizing cooperative coevolution (FSC2), is then leveraged to resolve the three challenges in multi-target SOP.
arXiv Detail & Related papers (2022-06-24T14:59:56Z) - Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z) - Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability for large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)