Dynamic Resource Management for Providing QoS in Drone Delivery Systems
- URL: http://arxiv.org/abs/2103.04015v1
- Date: Sat, 6 Mar 2021 03:11:07 GMT
- Title: Dynamic Resource Management for Providing QoS in Drone Delivery Systems
- Authors: Behzad Khamidehi, Majid Raeis, Elvino S. Sousa
- Abstract summary: We study the dynamic UAV assignment problem for a drone delivery system with the goal of providing measurable Quality of Service (QoS) guarantees.
We take a deep reinforcement learning approach to obtain a dynamic policy for the re-allocation of the UAVs.
We evaluate the performance of our proposed algorithm by considering three broad arrival classes, including Bernoulli, Time-Varying Bernoulli, and Markov-Modulated Bernoulli arrivals.
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Drones have been considered as an alternative means of package delivery to
reduce the delivery cost and time. Due to the battery limitations, the drones
are best suited for last-mile delivery, i.e., the delivery from the package
distribution centers (PDCs) to the customers. Since a typical delivery system
consists of multiple PDCs, each having random and time-varying demands, the
dynamic drone-to-PDC allocation would be of great importance in meeting the
demand in an efficient manner. In this paper, we study the dynamic UAV
assignment problem for a drone delivery system with the goal of providing
measurable Quality of Service (QoS) guarantees. We adopt a queueing theoretic
approach to model the customer-service nature of the problem. Furthermore, we
take a deep reinforcement learning approach to obtain a dynamic policy for the
re-allocation of the UAVs. This policy guarantees a probabilistic upper-bound
on the queue length of the packages waiting in each PDC, which is beneficial
from both the service provider's and the customers' viewpoints. We evaluate the
performance of our proposed algorithm by considering three broad arrival
classes, including Bernoulli, Time-Varying Bernoulli, and Markov-Modulated
Bernoulli arrivals. Our results show that the proposed method outperforms the
baselines, particularly in scenarios with Time-Varying and Markov-Modulated
Bernoulli arrivals, which are more representative of real-world demand
patterns. Moreover, our algorithm satisfies the QoS constraints in all the
studied scenarios while minimizing the average number of UAVs in use.
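The three arrival classes and the probabilistic queue-length bound named in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: every function name and parameter is hypothetical, the per-slot service model (each UAV independently completes one delivery with probability `service_prob`) is assumed for illustration, and the Markov-modulated chain simply cycles through its states.

```python
import random


def bernoulli_arrivals(p, steps, rng):
    """One package arrives in each slot with fixed probability p."""
    return [1 if rng.random() < p else 0 for _ in range(steps)]


def time_varying_arrivals(probs, steps, rng):
    """The arrival probability cycles through `probs`, one entry per slot."""
    return [1 if rng.random() < probs[t % len(probs)] else 0
            for t in range(steps)]


def markov_modulated_arrivals(probs, switch_prob, steps, rng):
    """A hidden state selects the arrival probability.

    Assumption: after each slot the state advances to the next one
    (cyclically) with probability `switch_prob`.
    """
    state = 0
    arrivals = []
    for _ in range(steps):
        arrivals.append(1 if rng.random() < probs[state] else 0)
        if rng.random() < switch_prob:
            state = (state + 1) % len(probs)
    return arrivals


def violation_fraction(arrivals, num_uavs, service_prob, q_max, rng):
    """Fraction of slots in which the queue length exceeds q_max.

    Assumed service model: in every slot, each of the `num_uavs` UAVs
    independently delivers one waiting package with probability
    `service_prob`.
    """
    queue = 0
    violations = 0
    for arrived in arrivals:
        queue += arrived
        served = sum(1 for _ in range(num_uavs)
                     if rng.random() < service_prob)
        queue = max(0, queue - served)
        if queue > q_max:
            violations += 1
    return violations / len(arrivals)


if __name__ == "__main__":
    rng = random.Random(0)
    demand = markov_modulated_arrivals([0.2, 0.9], 0.05, 10_000, rng)
    # Compare how often the queue bound is violated for fixed UAV counts.
    for num_uavs in (1, 2, 3):
        frac = violation_fraction(demand, num_uavs, 0.4, 5, rng)
        print(num_uavs, frac)
```

A QoS target of the form "the queue exceeds `q_max` in at most a fraction `eps` of slots" could then be checked by comparing the returned fraction with `eps`. A static allocation like the one looped over here is only a fixed reference point; the paper instead learns a dynamic re-allocation policy.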
Related papers
- Dynamic Demand Management for Parcel Lockers [0.0]
We develop a solution framework that orchestrates algorithmic techniques rooted in Sequential Decision Analytics and Reinforcement Learning.
Our innovative approach of combining these techniques enables us to address the strong interrelations between the two decision types.
Our computational study shows that our method outperforms a myopic benchmark by 13.7% and an industry-inspired policy by 12.6%.
arXiv Detail & Related papers (2024-09-08T11:38:48Z)
- Testing Quantum and Simulated Annealers on the Drone Delivery Packing Problem [6.246837813122577]
The drone delivery packing problem (DDPP) arises in the context of logistics in response to an increasing demand in the delivery process along with the necessity of lowering human intervention.
We propose two alternative formulations of the DDPP as a quadratic unconstrained binary optimization (QUBO) problem.
We perform extensive experiments showing the advantages as well as the limitations of quantum annealers for this optimization problem.
arXiv Detail & Related papers (2024-06-12T17:16:02Z)
- Differentially Private Deep Q-Learning for Pattern Privacy Preservation in MEC Offloading [76.0572817182483]
Attackers may eavesdrop on the offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns.
We propose an offloading strategy which jointly minimizes the latency, ES's energy consumption, and task dropping rate, while preserving pattern privacy (PP).
We develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions.
arXiv Detail & Related papers (2023-02-09T12:50:18Z)
- Multiscale Adaptive Scheduling and Path-Planning for Power-Constrained UAV-Relays via SMDPs [0.0]
We describe the orchestration of a decentralized swarm of rotary-wing UAV-relays, augmenting the coverage and service capabilities of a terrestrial base station.
Our goal is to minimize the time-average service latencies involved in handling transmission requests from ground users under Poisson arrivals.
We demonstrate that our framework offers superior performance vis-a-vis average service latencies and average per-UAV power consumption.
arXiv Detail & Related papers (2022-09-16T00:21:58Z)
- Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies [64.2210390071609]
We present a novel Heavy-Tailed Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems.
We show consistent performance improvement across all tasks in terms of high average cumulative reward.
arXiv Detail & Related papers (2022-06-12T04:09:39Z)
- Off-line approximate dynamic programming for the vehicle routing problem with stochastic customers and demands via decentralized decision-making [0.0]
This paper studies a variant of the vehicle routing problem (VRP) where both customer locations and demands are uncertain.
The objective is to maximize the served demands while fulfilling vehicle capacities and time restrictions.
We develop a Q-learning algorithm featuring state-of-the-art acceleration techniques such as Replay Memory and Double Q Network.
arXiv Detail & Related papers (2021-09-21T14:28:09Z)
- A Deep Reinforcement Learning Approach for Constrained Online Logistics Route Assignment [4.367543599338385]
Properly assigning a candidate logistics route to each shipping parcel is crucial for the logistics industry.
This online route-assignment problem can be viewed as a constrained online decision-making problem.
We develop a model-free DRL approach named PPO-RA, in which Proximal Policy Optimization (PPO) is improved with dedicated techniques to address the challenges for route assignment (RA).
arXiv Detail & Related papers (2021-09-08T07:27:39Z)
- Efficient UAV Trajectory-Planning using Economic Reinforcement Learning [65.91405908268662]
We introduce REPlanner, a novel reinforcement learning algorithm inspired by economic transactions to distribute tasks between UAVs.
We formulate the path planning problem as a multi-agent economic game, where agents can cooperate and compete for resources.
As the system computes task distributions via UAV cooperation, it is highly resilient to any change in the swarm size.
arXiv Detail & Related papers (2021-03-03T20:54:19Z)
- Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP).
The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z)
- MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning [108.79676336281211]
Continuous deployment of new policies for data collection and online learning is either cost ineffective or impractical.
We propose a new algorithmic learning framework called Model-based Uncertainty regularized and Sample Efficient Batch Optimization.
Our framework discovers novel and high quality samples for each deployment to enable efficient data collection.
arXiv Detail & Related papers (2021-02-23T01:30:55Z)
- Dynamic Bicycle Dispatching of Dockless Public Bicycle-sharing Systems using Multi-objective Reinforcement Learning [79.61517670541863]
How to use AI to provide efficient bicycle dispatching solutions based on dynamic bicycle rental demand is an essential issue for dockless PBS (DL-PBS).
We propose a dynamic bicycle dispatching algorithm based on multi-objective reinforcement learning (MORL-BD) to provide the optimal bicycle dispatching solution for DL-PBS.
arXiv Detail & Related papers (2021-01-19T03:09:51Z)