DL-DRL: A double-level deep reinforcement learning approach for
large-scale task scheduling of multi-UAV
- URL: http://arxiv.org/abs/2208.02447v3
- Date: Tue, 6 Jun 2023 07:45:31 GMT
- Title: DL-DRL: A double-level deep reinforcement learning approach for
large-scale task scheduling of multi-UAV
- Authors: Xiao Mao, Zhiguang Cao, Mingfeng Fan, Guohua Wu, and Witold Pedrycz
- Abstract summary: We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide-and-conquer framework (DCF).
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention-based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
- Score: 65.07776277630228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploiting unmanned aerial vehicles (UAVs) to execute tasks has
gained growing popularity. To solve the underlying task scheduling problem,
deep reinforcement learning (DRL) based methods demonstrate a notable advantage
over conventional heuristics, as they rely less on hand-engineered rules.
However, their decision space becomes prohibitively huge as the problem scales
up, which deteriorates computational efficiency. To alleviate this issue, we
propose a double-level deep reinforcement learning (DL-DRL) approach based on a
divide-and-conquer framework (DCF), in which the task scheduling of multi-UAV
is decomposed into task allocation and route planning. In particular, we design
an encoder-decoder structured policy network in the upper-level DRL model to
allocate tasks to different UAVs, and exploit another attention-based policy
network in the lower-level DRL model to construct the route for each UAV, with
the objective of maximizing the number of executed tasks given the maximum
flight distance of the UAV. To effectively train the two models, we design an
interactive training strategy (ITS) comprising pre-training, intensive
training, and alternate training. Experimental results show that DL-DRL
performs favorably against learning-based and conventional baselines,
including OR-Tools, in terms of solution quality and computational efficiency.
We also verify the generalization performance of our approach by applying it
to larger instances with up to 1000 tasks. Moreover, an ablation study shows
that ITS helps achieve a balance between performance and training efficiency.
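The divide-and-conquer structure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the learned encoder-decoder allocation policy and the attention-based routing policy are replaced here by simple greedy heuristics, and the function and variable names are hypothetical. What it preserves is the two-level decomposition and the objective of maximizing executed tasks within each UAV's flight budget.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def allocate_tasks(tasks, depots):
    """Upper level: assign each task to the UAV with the nearest depot
    (a greedy stand-in for the encoder-decoder allocation policy)."""
    assignment = {u: [] for u in range(len(depots))}
    for t in tasks:
        uav = min(range(len(depots)), key=lambda u: dist(depots[u], t))
        assignment[uav].append(t)
    return assignment

def plan_route(depot, tasks, max_dist):
    """Lower level: nearest-neighbor route construction that skips a task
    once the flight budget (including the return leg to the depot) would
    be exceeded (a stand-in for the attention-based routing policy)."""
    route, pos, used = [], depot, 0.0
    remaining = list(tasks)
    while remaining:
        nxt = min(remaining, key=lambda t: dist(pos, t))
        step = dist(pos, nxt)
        if used + step + dist(nxt, depot) > max_dist:
            break  # flight budget exhausted
        route.append(nxt)
        used += step
        pos = nxt
        remaining.remove(nxt)
    return route

def schedule(tasks, depots, max_dist):
    """Full two-level pass: allocate, then route each UAV; returns the
    routes and the total number of executed tasks (the objective)."""
    assignment = allocate_tasks(tasks, depots)
    routes = {u: plan_route(depots[u], ts, max_dist)
              for u, ts in assignment.items()}
    executed = sum(len(r) for r in routes.values())
    return routes, executed

tasks = [(1, 0), (2, 0), (9, 0), (10, 0)]
depots = [(0, 0), (11, 0)]
routes, executed = schedule(tasks, depots, max_dist=6.0)
print(executed)  # all four tasks fit within each UAV's flight budget
```

In the paper, both levels are neural policies trained jointly with the interactive training strategy; this sketch only shows how the decomposition shrinks the decision space, since each level solves a much smaller subproblem than the joint scheduling problem.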
Related papers
- UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning [79.16150966434299]
We formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to maximize the transmission rate of the UVAA and minimize the energy consumption of all UAVs.
We use the heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework, and then propose an improved HATRPO algorithm, namely HATRPO-UCB.
arXiv Detail & Related papers (2024-04-11T03:19:22Z) - ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [80.10358123795946]
We develop a framework for building multi-turn RL algorithms for fine-tuning large language models.
Our framework adopts a hierarchical RL approach and runs two RL algorithms in parallel.
Empirically, we find that ArCHer significantly improves efficiency and performance on agent tasks.
arXiv Detail & Related papers (2024-02-29T18:45:56Z) - RL-GPT: Integrating Reinforcement Learning and Code-as-policy [82.1804241891039]
We introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.
The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks.
This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline.
arXiv Detail & Related papers (2024-02-29T16:07:22Z) - Enhancing Secrecy in UAV RSMA Networks: Deep Unfolding Meets Deep Reinforcement Learning [0.8287206589886881]
We consider secrecy in a multiple unmanned aerial vehicle (UAV) rate-splitting multiple access (RSMA) network.
The proposed deep reinforcement learning (DRL) method shows strong performance and outperforms other DRL-based methods in the literature.
arXiv Detail & Related papers (2023-09-30T12:26:24Z) - Multi-Agent Proximal Policy Optimization For Data Freshness in
UAV-assisted Networks [4.042622147977782]
We focus on the case where the collected data is time-sensitive, and it is critical to maintain its timeliness.
Our objective is to optimally design the UAVs' trajectories and the subsets of visited IoT devices such that the global Age-of-Updates (AoU) is minimized.
arXiv Detail & Related papers (2023-03-15T15:03:09Z) - Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z) - Evolutionary Multi-Objective Reinforcement Learning Based Trajectory
Control and Task Offloading in UAV-Assisted Mobile Edge Computing [8.168647937560504]
This paper studies the trajectory control and task offloading (TCTO) problem in an unmanned aerial vehicle (UAV)-assisted mobile edge computing system.
It adapts the evolutionary multi-objective RL (EMORL), a multi-policy multi-objective RL, to the TCTO problem.
arXiv Detail & Related papers (2022-02-24T11:17:30Z) - Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT
Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement
Learning Approach [56.84948632954274]
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z) - Joint Cluster Head Selection and Trajectory Planning in UAV-Aided IoT
Networks by Reinforcement Learning with Sequential Model [4.273341750394231]
We formulate the problem of jointly designing the UAV's trajectory and selecting cluster heads in the Internet-of-Things network.
We propose a novel deep reinforcement learning (DRL) with a sequential model strategy that can effectively learn the policy represented by a sequence-to-sequence neural network.
Through extensive simulations, the results show that the proposed DRL method can find UAV trajectories that require much less energy.
arXiv Detail & Related papers (2021-12-01T07:59:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.