Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle
Coordination by Multi-Critic Policy Gradient Optimization
- URL: http://arxiv.org/abs/2012.15472v1
- Date: Thu, 31 Dec 2020 07:00:44 GMT
- Title: Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle
Coordination by Multi-Critic Policy Gradient Optimization
- Authors: Yoav Alon and Huiyu Zhou
- Abstract summary: In agriculture, disaster management, search and rescue operations, commercial and military applications, the advantage of applying a fleet of drones originates from their ability to cooperate autonomously.
We propose a Multi-Agent Reinforcement Learning approach that achieves a stable policy network update and similarity in reward signal development for an increasing number of agents.
- Score: 16.6182621419268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent technological progress in the development of Unmanned Aerial Vehicles
(UAVs) together with decreasing acquisition costs make the application of drone
fleets attractive for a wide variety of tasks. In agriculture, disaster
management, search and rescue operations, commercial and military applications,
the advantage of applying a fleet of drones originates from their ability to
cooperate autonomously. Multi-Agent Reinforcement Learning approaches that aim
to optimize a neural network based control policy, such as the best-performing
actor-critic policy gradient algorithms, struggle to effectively back-propagate
errors from distinct reward signal sources and tend to favor lucrative signals
while neglecting coordination and exploitation of previously learned
similarities. We propose a Multi-Critic Policy Optimization architecture with
multiple value estimating networks and a novel advantage function that
optimizes a stochastic actor policy network to achieve optimal coordination of
agents. Consequently, we apply the algorithm to several tasks that require the
collaboration of multiple drones in a physics-based reinforcement learning
environment. Our approach achieves a stable policy network update and
similarity in reward signal development for an increasing number of agents. The
resulting policy achieves optimal coordination and compliance with constraints
such as collision avoidance.
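The abstract's core idea, one value-estimating critic per reward source combined through an advantage function that drives a single stochastic actor, can be illustrated with a minimal sketch. The paper's exact advantage function is not reproduced here; this sketch assumes per-source one-step TD advantages that are standardized before averaging, so no single lucrative signal dominates the policy gradient:

```python
import numpy as np

def per_critic_advantage(rewards, values, gamma=0.99):
    """One-step TD advantage A_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    for a single reward source and its dedicated critic."""
    next_values = np.append(values[1:], 0.0)  # bootstrap 0 at episode end
    return rewards + gamma * next_values - values

def multi_critic_advantage(reward_streams, value_streams, gamma=0.99):
    """Combine advantages from several critics, one per reward source.
    Each per-source advantage is standardized before averaging so that
    no single reward signal dominates the shared actor's update."""
    combined = []
    for r, v in zip(reward_streams, value_streams):
        adv = per_critic_advantage(np.asarray(r, float), np.asarray(v, float), gamma)
        adv = (adv - adv.mean()) / (adv.std() + 1e-8)  # per-source normalization
        combined.append(adv)
    return np.mean(combined, axis=0)  # equal weight across reward sources

# Hypothetical example: two reward sources for one drone's trajectory,
# e.g. goal progress vs. collision penalty, with made-up critic values.
rewards = [[1.0, 0.0, 1.0], [0.0, -1.0, 0.0]]
values = [[0.5, 0.4, 0.6], [0.0, -0.3, -0.1]]
adv = multi_critic_advantage(rewards, values)
```

The combined advantage would then weight the actor's log-probability gradients as in a standard policy gradient update; the function names and the equal-weight averaging are illustrative assumptions, not the authors' specification.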
Related papers
- OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles [9.41740133451895]
This work introduces OPTIMA, a novel distributed reinforcement learning framework for cooperative autonomous vehicle tasks.
Our goal is to improve the generality and performance of CAVs in highly complex and crowded scenarios.
arXiv Detail & Related papers (2024-10-09T03:28:45Z)
- Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, while a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) can serve users on both sides of the surface.
However, deploying STAR-RIS indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z)
- Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single-stage approach named Alignment with Integrated Human Feedback (AIHF) to train reward models and the policy.
The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms.
We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
arXiv Detail & Related papers (2024-06-11T01:20:53Z)
- UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning [79.16150966434299]
We formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to maximize the transmission rate of the UAV-enabled virtual antenna array (UVAA) and minimize the energy consumption of all UAVs.
We use the heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework, and then propose an improved HATRPO algorithm, namely HATRPO-UCB.
arXiv Detail & Related papers (2024-04-11T03:19:22Z) - Joint User Association, Interference Cancellation and Power Control for
Multi-IRS Assisted UAV Communications [80.35959154762381]
Intelligent reflecting surface (IRS)-assisted unmanned aerial vehicle (UAV) communications are expected to alleviate the load of ground base stations in a cost-effective way.
Existing studies mainly focus on the deployment and resource allocation of a single IRS instead of multiple IRSs.
We propose a new optimization algorithm for joint IRS-user association, trajectory optimization of UAVs, successive interference cancellation (SIC) decoding order scheduling and power allocation.
arXiv Detail & Related papers (2023-12-08T01:57:10Z) - Muti-Agent Proximal Policy Optimization For Data Freshness in
UAV-assisted Networks [4.042622147977782]
We focus on the case where the collected data is time-sensitive, and it is critical to maintain its timeliness.
Our objective is to optimally design the UAVs' trajectories and the subsets of visited IoT devices such that the global Age-of-Updates (AoU) is minimized.
arXiv Detail & Related papers (2023-03-15T15:03:09Z) - Efficient Domain Coverage for Vehicles with Second-Order Dynamics via
Multi-Agent Reinforcement Learning [9.939081691797858]
We present a reinforcement learning (RL) approach for the multi-agent efficient domain coverage problem involving agents with second-order dynamics.
Our proposed network architecture incorporates LSTM and self-attention layers, which allows the trained policy to adapt to a variable number of agents.
arXiv Detail & Related papers (2022-11-11T01:59:12Z) - Hierarchical Reinforcement Learning with Opponent Modeling for
Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach for multi-agent cooperation through the interaction of the agents and environments.
Traditional DRL solutions suffer from the high dimensionality of multi-agent continuous action spaces during policy search.
We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
arXiv Detail & Related papers (2022-06-25T19:09:29Z) - Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT
Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement
Learning Approach [56.84948632954274]
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z) - Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless
Cellular Networks [82.02891936174221]
Collaborative deep reinforcement learning (CDRL), in which multiple agents coordinate over a wireless network, is a promising approach.
In this paper, a novel semantic-aware CDRL method is proposed to enable a group of untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network.
arXiv Detail & Related papers (2021-11-23T18:24:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.