DL-DRL: A double-level deep reinforcement learning approach for
large-scale task scheduling of multi-UAV
- URL: http://arxiv.org/abs/2208.02447v3
- Date: Tue, 6 Jun 2023 07:45:31 GMT
- Title: DL-DRL: A double-level deep reinforcement learning approach for
large-scale task scheduling of multi-UAV
- Authors: Xiao Mao, Zhiguang Cao, Mingfeng Fan, Guohua Wu, and Witold Pedrycz
- Abstract summary: We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide-and-conquer framework (DCF).
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention-based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
- Score: 65.07776277630228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploiting unmanned aerial vehicles (UAVs) to execute tasks has
gained growing popularity. To solve the underlying task scheduling problem,
deep reinforcement learning (DRL) based methods demonstrate a notable advantage
over conventional heuristics, as they rely less on hand-engineered rules.
However, their decision space becomes prohibitively huge as the problem scales
up, which deteriorates computational efficiency. To alleviate this issue, we
propose a double-level deep reinforcement learning (DL-DRL) approach based on a
divide-and-conquer framework (DCF), in which the task scheduling of multi-UAV
is decomposed into task allocation and route planning. In particular, we design
an encoder-decoder structured policy network in the upper-level DRL model to
allocate tasks to different UAVs, and exploit another attention-based policy
network in the lower-level DRL model to construct the route for each UAV, with
the objective of maximizing the number of executed tasks given the maximum
flight distance of the UAV. To effectively train the two models, we design an
interactive training strategy (ITS) comprising pre-training, intensive
training, and alternate training. Experimental results show that DL-DRL
performs favorably against learning-based and conventional baselines,
including OR-Tools, in terms of solution quality and computational efficiency.
We also verify the generalization performance of our approach by applying it
to larger instances with up to 1000 tasks. Moreover, an ablation study shows
that ITS helps achieve a balance between performance and training efficiency.
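The divide-and-conquer structure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the learned encoder-decoder allocation policy and the attention-based routing policy are replaced here by simple greedy heuristics, and the function and variable names are hypothetical. What it preserves is the two-level decomposition and the objective of maximizing executed tasks within each UAV's flight budget.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def allocate_tasks(tasks, depots):
    """Upper level: assign each task to the UAV with the nearest depot
    (a greedy stand-in for the encoder-decoder allocation policy)."""
    assignment = {u: [] for u in range(len(depots))}
    for t in tasks:
        uav = min(range(len(depots)), key=lambda u: dist(depots[u], t))
        assignment[uav].append(t)
    return assignment

def plan_route(depot, tasks, max_dist):
    """Lower level: nearest-neighbor route construction that skips a task
    once the flight budget (including the return leg to the depot) would
    be exceeded (a stand-in for the attention-based routing policy)."""
    route, pos, used = [], depot, 0.0
    remaining = list(tasks)
    while remaining:
        nxt = min(remaining, key=lambda t: dist(pos, t))
        step = dist(pos, nxt)
        if used + step + dist(nxt, depot) > max_dist:
            break  # flight budget exhausted
        route.append(nxt)
        used += step
        pos = nxt
        remaining.remove(nxt)
    return route

def schedule(tasks, depots, max_dist):
    """Full two-level pass: allocate, then route each UAV; returns the
    routes and the total number of executed tasks (the objective)."""
    assignment = allocate_tasks(tasks, depots)
    routes = {u: plan_route(depots[u], ts, max_dist)
              for u, ts in assignment.items()}
    executed = sum(len(r) for r in routes.values())
    return routes, executed

tasks = [(1, 0), (2, 0), (9, 0), (10, 0)]
depots = [(0, 0), (11, 0)]
routes, executed = schedule(tasks, depots, max_dist=6.0)
print(executed)  # all four tasks fit within each UAV's flight budget
```

In the paper, both levels are neural policies trained jointly with the interactive training strategy; this sketch only shows how the decomposition shrinks the decision space, since each level solves a much smaller subproblem than the joint scheduling problem.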
Related papers
- UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning [79.16150966434299]
We formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to maximize the transmission rate of the UVAA and minimize the energy consumption of all UAVs.
We use the heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework, and then propose an improved HATRPO algorithm, namely HATRPO-UCB.
arXiv Detail & Related papers (2024-04-11T03:19:22Z) - ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [80.10358123795946]
We develop a framework for building multi-turn RL algorithms for fine-tuning large language models.
Our framework adopts a hierarchical RL approach and runs two RL algorithms in parallel.
Empirically, we find that ArCHer significantly improves efficiency and performance on agent tasks.
arXiv Detail & Related papers (2024-02-29T18:45:56Z) - RL-GPT: Integrating Reinforcement Learning and Code-as-policy [82.1804241891039]
We introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.
The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks.
This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline.
arXiv Detail & Related papers (2024-02-29T16:07:22Z) - Enhancing Secrecy in UAV RSMA Networks: Deep Unfolding Meets Deep Reinforcement Learning [0.8287206589886881]
We consider secrecy in a multiple unmanned aerial vehicle (UAV) rate-splitting multiple access (RSMA) network.
The proposed deep reinforcement learning (DRL) method shows strong performance and outperforms other DRL-based methods in the literature.
arXiv Detail & Related papers (2023-09-30T12:26:24Z) - Multi-Agent Proximal Policy Optimization For Data Freshness in
UAV-assisted Networks [4.042622147977782]
We focus on the case where the collected data is time-sensitive, and it is critical to maintain its timeliness.
Our objective is to optimally design the UAVs' trajectories and the subsets of visited IoT devices such that the global Age-of-Updates (AoU) is minimized.
arXiv Detail & Related papers (2023-03-15T15:03:09Z) - Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z) - Evolutionary Multi-Objective Reinforcement Learning Based Trajectory
Control and Task Offloading in UAV-Assisted Mobile Edge Computing [8.168647937560504]
This paper studies the trajectory control and task offloading (TCTO) problem in an unmanned aerial vehicle (UAV)-assisted mobile edge computing system.
It adapts the evolutionary multi-objective RL (EMORL), a multi-policy multi-objective RL, to the TCTO problem.
arXiv Detail & Related papers (2022-02-24T11:17:30Z) - Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT
Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement
Learning Approach [56.84948632954274]
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z) - Joint Cluster Head Selection and Trajectory Planning in UAV-Aided IoT
Networks by Reinforcement Learning with Sequential Model [4.273341750394231]
We formulate the problem of jointly designing the UAV's trajectory and selecting cluster heads in the Internet-of-Things network.
We propose a novel deep reinforcement learning (DRL) with a sequential model strategy that can effectively learn the policy represented by a sequence-to-sequence neural network.
Through extensive simulations, the results show that the proposed DRL method can find UAV trajectories that require much less energy.
arXiv Detail & Related papers (2021-12-01T07:59:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.