Multi-UAV Path Planning for Wireless Data Harvesting with Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2010.12461v3
- Date: Thu, 3 Jun 2021 11:38:05 GMT
- Title: Multi-UAV Path Planning for Wireless Data Harvesting with Deep
Reinforcement Learning
- Authors: Harald Bayerlein, Mirco Theile, Marco Caccamo, David Gesbert
- Abstract summary: We propose a multi-agent reinforcement learning (MARL) approach that can adapt to profound changes in the scenario parameters defining the data harvesting mission.
We show that our proposed network architecture enables the agents to cooperate effectively by carefully dividing the data collection task among themselves.
- Score: 18.266087952180733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Harvesting data from distributed Internet of Things (IoT) devices with
multiple autonomous unmanned aerial vehicles (UAVs) is a challenging problem
requiring flexible path planning methods. We propose a multi-agent
reinforcement learning (MARL) approach that, in contrast to previous work, can
adapt to profound changes in the scenario parameters defining the data
harvesting mission, such as the number of deployed UAVs, number, position and
data amount of IoT devices, or the maximum flying time, without the need to
perform expensive recomputations or relearn control policies. We formulate the
path planning problem for a cooperative, non-communicating, and homogeneous
team of UAVs tasked with maximizing collected data from distributed IoT sensor
nodes subject to flying time and collision avoidance constraints. The path
planning problem is translated into a decentralized partially observable Markov
decision process (Dec-POMDP), which we solve through a deep reinforcement
learning (DRL) approach, approximating the optimal UAV control policy without
prior knowledge of the challenging wireless channel characteristics in dense
urban environments. By exploiting a combination of centered global and local
map representations of the environment that are fed into convolutional layers
of the agents, we show that our proposed network architecture enables the
agents to cooperate effectively by carefully dividing the data collection task
among themselves, adapt to large complex environments and state spaces, and
make movement decisions that balance data collection goals, flight-time
efficiency, and navigation constraints. Finally, learning a control policy that
generalizes over the scenario parameter space enables us to analyze the
influence of individual parameters on collection performance and provide some
intuition about system-level benefits.
Related papers
- Aerial Reliable Collaborative Communications for Terrestrial Mobile Users via Evolutionary Multi-Objective Deep Reinforcement Learning [59.660724802286865]
Unmanned aerial vehicles (UAVs) have emerged as the potential aerial base stations (BSs) to improve terrestrial communications.
This work employs collaborative beamforming through a UAV-enabled virtual antenna array to improve transmission performance from the UAV to terrestrial mobile users.
arXiv Detail & Related papers (2025-02-09T09:15:47Z) - Task Delay and Energy Consumption Minimization for Low-altitude MEC via Evolutionary Multi-objective Deep Reinforcement Learning [52.64813150003228]
The low-altitude economy (LAE), driven by unmanned aerial vehicles (UAVs) and other aircraft, has revolutionized fields such as transportation, agriculture, and environmental monitoring.
In the upcoming six-generation (6G) era, UAV-assisted mobile edge computing (MEC) is particularly crucial in challenging environments such as mountainous or disaster-stricken areas.
The task offloading problem is one of the key issues in UAV-assisted MEC, primarily addressing the trade-off between minimizing the task delay and the energy consumption of the UAV.
arXiv Detail & Related papers (2025-01-11T02:32:42Z) - Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing.
We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z) - SCoTT: Wireless-Aware Path Planning with Vision Language Models and Strategic Chains-of-Thought [78.53885607559958]
A novel approach using vision language models (VLMs) is proposed for enabling path planning in complex wireless-aware environments.
To this end, insights from a digital twin with real-world wireless ray tracing data are explored.
Results show that SCoTT achieves very close average path gains compared to DP-WA* while at the same time yielding consistently shorter path lengths.
arXiv Detail & Related papers (2024-11-27T10:45:49Z) - Multi-Agent Reinforcement Learning for Offloading Cellular Communications with Cooperating UAVs [21.195346908715972]
Unmanned aerial vehicles present an alternative means to offload data traffic from terrestrial BSs.
This paper presents a novel approach to efficiently serve multiple UAVs for data offloading from terrestrial BSs.
arXiv Detail & Related papers (2024-02-05T12:36:08Z) - Multi-Objective Optimization for UAV Swarm-Assisted IoT with Virtual
Antenna Arrays [55.736718475856726]
Unmanned aerial vehicle (UAV) network is a promising technology for assisting Internet-of-Things (IoT)
Existing UAV-assisted data harvesting and dissemination schemes require UAVs to frequently fly between the IoTs and access points.
We introduce collaborative beamforming into IoTs and UAVs simultaneously to achieve energy and time-efficient data harvesting and dissemination.
arXiv Detail & Related papers (2023-08-03T02:49:50Z) - Muti-Agent Proximal Policy Optimization For Data Freshness in
UAV-assisted Networks [4.042622147977782]
We focus on the case where the collected data is time-sensitive, and it is critical to maintain its timeliness.
Our objective is to optimally design the UAVs' trajectories and the subsets of visited IoT devices such as the global Age-of-Updates (AoU) is minimized.
arXiv Detail & Related papers (2023-03-15T15:03:09Z) - Adaptive Path Planning for UAVs for Multi-Resolution Semantic
Segmentation [28.104584236205405]
A key challenge is planning missions to maximize the value of acquired data in large environments.
This is, for example, relevant for monitoring agricultural fields.
We propose an online planning algorithm which adapts the UAV paths to obtain high-resolution semantic segmentations.
arXiv Detail & Related papers (2022-03-03T11:03:28Z) - UAV Path Planning using Global and Local Map Information with Deep
Reinforcement Learning [16.720630804675213]
This work presents a method for autonomous UAV path planning based on deep reinforcement learning (DRL)
We compare coverage path planning ( CPP), where the UAV's goal is to survey an area of interest to data harvesting (DH), where the UAV collects data from distributed Internet of Things (IoT) sensor devices.
By exploiting structured map information of the environment, we train double deep Q-networks (DDQNs) with identical architectures on both distinctly different mission scenarios.
arXiv Detail & Related papers (2020-10-14T09:59:10Z) - UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement
Learning Approach [18.266087952180733]
We propose a new end-to-end reinforcement learning approach to UAV-enabled data collection from Internet of Things (IoT) devices.
An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance.
We show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters.
arXiv Detail & Related papers (2020-07-01T15:14:16Z) - Data Freshness and Energy-Efficient UAV Navigation Optimization: A Deep
Reinforcement Learning Approach [88.45509934702913]
We design a navigation policy for multiple unmanned aerial vehicles (UAVs) where mobile base stations (BSs) are deployed.
We incorporate different contextual information such as energy and age of information (AoI) constraints to ensure the data freshness at the ground BS.
By applying the proposed trained model, an effective real-time trajectory policy for the UAV-BSs captures the observable network states over time.
arXiv Detail & Related papers (2020-02-21T07:29:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.