Multi-Agent Reinforcement Learning with Action Masking for UAV-enabled Mobile Communications
- URL: http://arxiv.org/abs/2303.16737v2
- Date: Tue, 19 Dec 2023 11:55:14 GMT
- Title: Multi-Agent Reinforcement Learning with Action Masking for UAV-enabled Mobile Communications
- Authors: Danish Rizvi, David Boyle
- Abstract summary: Unmanned Aerial Vehicles (UAVs) are increasingly used as aerial base stations to provide ad hoc communications infrastructure.
This paper focuses on the use of multiple UAVs for providing wireless communication to mobile users in the absence of terrestrial communications infrastructure.
We jointly optimize UAV 3D trajectory and NOMA power allocation to maximize system throughput.
- Score: 1.3053649021965603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unmanned Aerial Vehicles (UAVs) are increasingly used as aerial base stations
to provide ad hoc communications infrastructure. Building upon prior research
efforts which consider either static nodes, 2D trajectories or single UAV
systems, this paper focuses on the use of multiple UAVs for providing wireless
communication to mobile users in the absence of terrestrial communications
infrastructure. In particular, we jointly optimize UAV 3D trajectory and NOMA
power allocation to maximize system throughput. Firstly, a weighted
K-means-based clustering algorithm establishes UAV-user associations at regular
intervals. The efficacy of training a novel Shared Deep Q-Network (SDQN) with
action masking is then explored. Unlike training each UAV separately using DQN,
the SDQN reduces training time by using the experiences of multiple UAVs
instead of a single agent. We also show that SDQN can be used to train a
multi-agent system with differing action spaces. Simulation results confirm
that: 1) training a shared DQN outperforms a conventional DQN in terms of
maximum system throughput (+20%) and training time (-10%); 2) it can converge
for agents with different action spaces, yielding a 9% increase in throughput
compared to mutual learning algorithms; and 3) combining NOMA with an SDQN
architecture enables the network to achieve a better sum rate compared with
existing baseline schemes.
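The clustering step can be illustrated with a minimal sketch. The abstract states only that a weighted K-means-based algorithm establishes UAV-user associations at regular intervals; the choice of weights (here, per-user traffic demand) and all names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def weighted_kmeans(points, weights, k, n_iters=50, seed=0):
    """Weighted K-means sketch: centroids are weighted means of assigned points.

    points  : (n, 2) user ground positions
    weights : (n,) per-user weights (e.g. traffic demand -- an assumed choice)
    Returns (centroids, labels); centroid j is a candidate hover point for UAV j.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iters):
        # Assign each user to the nearest UAV centroid (Euclidean distance).
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the weighted mean of its assigned users.
        for j in range(k):
            mask = labels == j
            if mask.any():
                w = weights[mask]
                centroids[j] = (w[:, None] * points[mask]).sum(axis=0) / w.sum()
    return centroids, labels
```

Re-running this at regular intervals, as the paper describes, lets the associations track user mobility; heavily weighted users pull their serving UAV's hover point toward them.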
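The action-masking idea behind the SDQN can also be sketched. A common realization, assumed here for illustration, is to set the Q-values of invalid actions (e.g. a move that would leave the service area) to negative infinity before the epsilon-greedy choice; an agent with a smaller action space simply masks the actions it lacks, which is one way a single shared network can serve agents with differing action spaces. The function name and signature below are hypothetical.

```python
import numpy as np

def masked_greedy_action(q_values, valid_mask, epsilon, rng):
    """Epsilon-greedy action selection restricted to valid actions.

    q_values   : (n_actions,) output of the shared Q-network for one agent
    valid_mask : (n_actions,) boolean, True where the action is allowed
    """
    valid_idx = np.flatnonzero(valid_mask)
    if rng.random() < epsilon:
        # Explore: sample uniformly among valid actions only.
        return int(rng.choice(valid_idx))
    # Exploit: masked actions get -inf, so argmax never selects them.
    masked_q = np.where(valid_mask, q_values, -np.inf)
    return int(masked_q.argmax())
```

Because masked actions can never be chosen, every transition written to the shared replay buffer is feasible for the agent that generated it, which is consistent with the abstract's claim that one shared DQN can be trained across agents with different action sets.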
Related papers
- UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning [79.16150966434299]
We formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to maximize the transmission rate of the UVAA and minimize the energy consumption of all UAVs.
We use the heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework, and then propose an improved HATRPO algorithm, namely HATRPO-UCB.
arXiv Detail & Related papers (2024-04-11T03:19:22Z)
- Multi-Agent Reinforcement Learning for Offloading Cellular Communications with Cooperating UAVs [21.195346908715972]
Unmanned aerial vehicles present an alternative means to offload data traffic from terrestrial base stations (BSs).
This paper presents a novel approach to efficiently serve multiple UAVs for data offloading from terrestrial BSs.
arXiv Detail & Related papers (2024-02-05T12:36:08Z)
- Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement Learning Approach [56.84948632954274]
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z)
- 3D UAV Trajectory and Data Collection Optimisation via Deep Reinforcement Learning [75.78929539923749]
Unmanned aerial vehicles (UAVs) are now being deployed to enhance network performance and coverage in wireless communications.
It is challenging to obtain an optimal resource allocation scheme for UAV-assisted Internet of Things (IoT) networks.
In this paper, we design a new UAV-assisted IoT system that relies on the shortest flight path of the UAVs while maximising the amount of data collected from IoT devices.
arXiv Detail & Related papers (2021-06-06T14:08:41Z)
- Efficient UAV Trajectory-Planning using Economic Reinforcement Learning [65.91405908268662]
We introduce REPlanner, a novel reinforcement learning algorithm inspired by economic transactions to distribute tasks between UAVs.
We formulate the path planning problem as a multi-agent economic game, where agents can cooperate and compete for resources.
As the system computes task distributions via UAV cooperation, it is highly resilient to any change in the swarm size.
arXiv Detail & Related papers (2021-03-03T20:54:19Z)
- Privacy-Preserving Federated Learning for UAV-Enabled Networks: Learning-Based Joint Scheduling and Resource Management [45.15174235000158]
Unmanned aerial vehicles (UAVs) are capable of serving as flying base stations (BSs) for supporting data collection, artificial intelligence (AI) model training, and wireless communications.
It is impractical to send raw device data to UAV servers for model training.
In this paper, we develop an asynchronous federated learning framework for multi-UAV-enabled networks.
arXiv Detail & Related papers (2020-11-28T18:58:34Z)
- Multi-Agent Reinforcement Learning in NOMA-aided UAV Networks for Cellular Offloading [59.32570888309133]
A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs).
The non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network.
A mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs.
arXiv Detail & Related papers (2020-10-18T20:22:05Z)
- NOMA in UAV-aided cellular offloading: A machine learning approach [59.32570888309133]
A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs).
The non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network.
A mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs.
arXiv Detail & Related papers (2020-10-18T17:38:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.