Knowledge-Assisted Deep Reinforcement Learning in 5G Scheduler Design:
From Theoretical Framework to Implementation
- URL: http://arxiv.org/abs/2009.08346v2
- Date: Wed, 3 Feb 2021 06:13:34 GMT
- Title: Knowledge-Assisted Deep Reinforcement Learning in 5G Scheduler Design:
From Theoretical Framework to Implementation
- Authors: Zhouyou Gu and Changyang She and Wibowo Hardjawana and Simon Lumb and
David McKechnie and Todd Essery and Branka Vucetic
- Abstract summary: We develop a knowledge-assisted deep reinforcement learning algorithm to design schedulers in 5G networks.
We show that a straightforward implementation of DDPG converges slowly, has a poor quality-of-service (QoS) performance, and cannot be implemented in real-world 5G systems.
- Score: 34.5517138843888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop a knowledge-assisted deep reinforcement learning
(DRL) algorithm to design wireless schedulers in the fifth-generation (5G)
cellular networks with time-sensitive traffic. Since the scheduling policy is a
deterministic mapping from channel and queue states to scheduling actions, it
can be optimized by using deep deterministic policy gradient (DDPG). We show
that a straightforward implementation of DDPG converges slowly, has a poor
quality-of-service (QoS) performance, and cannot be implemented in real-world
5G systems, which are non-stationary in general. To address these issues, we
propose a theoretical DRL framework, where theoretical models from wireless
communications are used to formulate a Markov decision process in DRL. To
reduce the convergence time and improve the QoS of each user, we design a
knowledge-assisted DDPG (K-DDPG) that exploits expert knowledge of the
scheduler design problem, such as the knowledge of the QoS, the target
scheduling policy, and the importance of each training sample, determined by
the approximation error of the value function and the number of packet losses.
Furthermore, we develop an architecture for online training and inference,
where K-DDPG initializes the scheduler off-line and then fine-tunes the
scheduler online to handle the mismatch between off-line simulations and
non-stationary real-world systems. Simulation results show that our approach
reduces the convergence time of DDPG significantly and achieves better QoS than
existing schedulers (reducing 30% ~ 50% packet losses). Experimental results
show that with off-line initialization, our approach achieves better initial
QoS than random initialization, and the online fine-tuning converges within a few
minutes.
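The sample-importance idea described in the abstract (weighting each training sample by the value-function approximation error and the number of packet losses) can be sketched as a prioritized replay buffer. This is a minimal illustrative sketch, not the paper's implementation; the class name, the weights `w_td` and `w_loss`, and the linear priority rule are assumptions.

```python
import random

class KnowledgeAssistedReplay:
    """Replay buffer whose sampling probability for each transition combines
    the value-function approximation (TD) error with the number of packet
    losses observed, echoing the K-DDPG idea of weighting training samples
    by expert knowledge of the scheduling problem."""

    def __init__(self, capacity=10000, w_td=1.0, w_loss=0.5):
        self.capacity = capacity
        self.w_td = w_td      # weight on value-approximation error
        self.w_loss = w_loss  # weight on observed packet losses
        self.buffer = []      # list of (transition, priority) pairs

    def add(self, transition, td_error, packet_losses):
        # Priority grows with both TD error magnitude and packet losses.
        priority = self.w_td * abs(td_error) + self.w_loss * packet_losses
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)  # drop the oldest sample when full
        self.buffer.append((transition, priority))

    def sample(self, batch_size):
        # Sample transitions with probability proportional to priority.
        total = sum(p for _, p in self.buffer)
        weights = [p / total for _, p in self.buffer]
        picks = random.choices(self.buffer, weights=weights, k=batch_size)
        return [t for t, _ in picks]
```

High-priority transitions (large approximation error or many packet losses) are replayed more often, which is one way such knowledge could speed up convergence.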
Related papers
- Intent-Aware DRL-Based Uplink Dynamic Scheduler for 5G-NR [30.146175299047325]
We investigate the problem of supporting Industrial Internet of Things user equipment (IIoT UEs) with intent (i.e., requested quality of service (QoS)) and random traffic arrival.
A deep reinforcement learning (DRL) based centralized dynamic scheduler for time-frequency resources is proposed to learn how to schedule the available communication resources.
arXiv Detail & Related papers (2024-03-27T08:57:15Z)
- Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific policies, at lower computational cost.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
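The "maximize both entropy and return" statement above refers to the standard maximum-entropy reinforcement learning objective that Soft Actor-Critic optimizes (the general form, not MARLIN's specific reward design):

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

Here $r$ is the reward (return term), $\mathcal{H}$ is the policy entropy at state $s_t$, and the temperature $\alpha$ trades off exploration against reward.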
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Graph Reinforcement Learning-based CNN Inference Offloading in Dynamic Edge Computing [93.67044879636093]
This paper addresses the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks.
We propose a graph reinforcement learning-based early-exit mechanism (GRLE) which outperforms the state-of-the-art work.
The experimental results show that GRLE achieves average accuracy up to 3.41x that of graph reinforcement learning (GRL) and 1.45x that of DROOE.
arXiv Detail & Related papers (2022-10-24T07:17:20Z)
- GCNScheduler: Scheduling Distributed Computing Applications using Graph Convolutional Networks [12.284934135116515]
We propose a graph convolutional network-based scheduler (GCNScheduler).
By carefully integrating an inter-task data dependency structure with network settings into an input graph, the GCNScheduler can efficiently schedule tasks for a given objective.
We show that it achieves a better makespan than the classic HEFT algorithm, and almost the same throughput as throughput-oriented HEFT.
arXiv Detail & Related papers (2021-10-22T01:54:10Z)
- Deep Reinforcement Learning for Wireless Scheduling in Distributed Networked Control [37.10638636086814]
We consider a joint uplink and downlink scheduling problem of a fully distributed wireless networked control system (WNCS) with a limited number of frequency channels.
We develop a deep reinforcement learning (DRL) based framework for solving it.
To tackle the challenges of a large action space in DRL, we propose novel action space reduction and action embedding methods.
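One common way to realize the action-embedding idea named above is to have the policy output a continuous "proto-action" and then execute the nearest discrete action in a low-dimensional embedding space. The sketch below illustrates only the nearest-neighbor lookup step under that assumption; the function name, the fixed embedding table, and Euclidean distance are illustrative choices, not the paper's exact method.

```python
import math

def nearest_action(proto, embeddings):
    """Map a continuous proto-action (a tuple of floats) produced by the
    policy network to the ID of the closest discrete action in an embedding
    table {action_id: embedding_vector}. This shrinks the effective action
    space the DRL agent must explore."""
    def dist(a, b):
        # Euclidean distance between two equal-length vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(embeddings, key=lambda aid: dist(proto, embeddings[aid]))

# Illustrative 2-D embeddings for three discrete scheduling actions.
embeddings = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0)}
```

In a full system the embedding table would itself be learned, so that actions with similar effects land near each other and the continuous policy generalizes across them.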
arXiv Detail & Related papers (2021-09-26T11:27:12Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Smart Scheduling based on Deep Reinforcement Learning for Cellular Networks [18.04856086228028]
We propose a smart scheduling scheme based on deep reinforcement learning (DRL).
We provide implementation-friendly designs, i.e., a scalable neural network design for the agent and a virtual environment training framework.
We show that the DRL-based smart scheduling outperforms the conventional scheduling method and can be adopted in practical systems.
arXiv Detail & Related papers (2021-03-22T02:09:16Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.