Knowledge-Assisted Deep Reinforcement Learning in 5G Scheduler Design:
From Theoretical Framework to Implementation
- URL: http://arxiv.org/abs/2009.08346v2
- Date: Wed, 3 Feb 2021 06:13:34 GMT
- Title: Knowledge-Assisted Deep Reinforcement Learning in 5G Scheduler Design:
From Theoretical Framework to Implementation
- Authors: Zhouyou Gu and Changyang She and Wibowo Hardjawana and Simon Lumb and
David McKechnie and Todd Essery and Branka Vucetic
- Abstract summary: We develop a knowledge-assisted deep reinforcement learning algorithm to design schedulers in 5G networks.
We show that a straightforward implementation of DDPG converges slowly, has a poor quality-of-service (QoS) performance, and cannot be implemented in real-world 5G systems.
- Score: 34.5517138843888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop a knowledge-assisted deep reinforcement learning
(DRL) algorithm to design wireless schedulers in the fifth-generation (5G)
cellular networks with time-sensitive traffic. Since the scheduling policy is a
deterministic mapping from channel and queue states to scheduling actions, it
can be optimized by using deep deterministic policy gradient (DDPG). We show
that a straightforward implementation of DDPG converges slowly, has a poor
quality-of-service (QoS) performance, and cannot be implemented in real-world
5G systems, which are non-stationary in general. To address these issues, we
propose a theoretical DRL framework, where theoretical models from wireless
communications are used to formulate a Markov decision process in DRL. To
reduce the convergence time and improve the QoS of each user, we design a
knowledge-assisted DDPG (K-DDPG) that exploits expert knowledge of the
scheduler design problem, such as the knowledge of the QoS, the target
scheduling policy, and the importance of each training sample, determined by
the approximation error of the value function and the number of packet losses.
Furthermore, we develop an architecture for online training and inference,
where K-DDPG initializes the scheduler off-line and then fine-tunes the
scheduler online to handle the mismatch between off-line simulations and
non-stationary real-world systems. Simulation results show that our approach
reduces the convergence time of DDPG significantly and achieves better QoS than
existing schedulers (reducing 30% ~ 50% packet losses). Experimental results
show that with off-line initialization, our approach achieves better initial
QoS than random initialization, and the online fine-tuning converges within a few
minutes.
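The sample-importance idea described in the abstract (weighting each training sample by the value-function approximation error and the number of packet losses) can be sketched as a prioritized replay buffer. This is a minimal illustrative sketch, not the paper's implementation; the class name, the weights `w_td` and `w_loss`, and the linear priority rule are assumptions.

```python
import random

class KnowledgeAssistedReplay:
    """Replay buffer whose sampling probability for each transition combines
    the value-function approximation (TD) error with the number of packet
    losses observed, echoing the K-DDPG idea of weighting training samples
    by expert knowledge of the scheduling problem."""

    def __init__(self, capacity=10000, w_td=1.0, w_loss=0.5):
        self.capacity = capacity
        self.w_td = w_td      # weight on value-approximation error
        self.w_loss = w_loss  # weight on observed packet losses
        self.buffer = []      # list of (transition, priority) pairs

    def add(self, transition, td_error, packet_losses):
        # Priority grows with both TD error magnitude and packet losses.
        priority = self.w_td * abs(td_error) + self.w_loss * packet_losses
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)  # drop the oldest sample when full
        self.buffer.append((transition, priority))

    def sample(self, batch_size):
        # Sample transitions with probability proportional to priority.
        total = sum(p for _, p in self.buffer)
        weights = [p / total for _, p in self.buffer]
        picks = random.choices(self.buffer, weights=weights, k=batch_size)
        return [t for t, _ in picks]
```

High-priority transitions (large approximation error or many packet losses) are replayed more often, which is one way such knowledge could speed up convergence.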
Related papers
- Intent-Aware DRL-Based Uplink Dynamic Scheduler for 5G-NR [30.146175299047325]
We investigate the problem of supporting Industrial Internet of Things user equipment (IIoT UEs) with intent (i.e., requested quality of service (QoS)) and random traffic arrival.
A deep reinforcement learning (DRL) based centralized dynamic scheduler for time-frequency resources is proposed to learn how to schedule the available communication resources.
arXiv Detail & Related papers (2024-03-27T08:57:15Z)
- Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific policies, at lower computational cost.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
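The "maximize both entropy and return" statement above refers to the standard maximum-entropy reinforcement learning objective that Soft Actor-Critic optimizes (the general form, not MARLIN's specific reward design):

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

Here $r$ is the reward (return term), $\mathcal{H}$ is the policy entropy at state $s_t$, and the temperature $\alpha$ trades off exploration against reward.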
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Graph Reinforcement Learning-based CNN Inference Offloading in Dynamic Edge Computing [93.67044879636093]
This paper addresses the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks.
We propose a graph reinforcement learning-based early-exit mechanism (GRLE) which outperforms the state-of-the-art work.
The experimental results show that GRLE achieves average accuracy up to 3.41x that of graph reinforcement learning (GRL) and 1.45x that of DROOE.
arXiv Detail & Related papers (2022-10-24T07:17:20Z)
- GCNScheduler: Scheduling Distributed Computing Applications using Graph Convolutional Networks [12.284934135116515]
We propose a graph convolutional network-based scheduler (GCNScheduler).
By carefully integrating an inter-task data dependency structure with network settings into an input graph, the GCNScheduler can efficiently schedule tasks for a given objective.
We show that it achieves a better makespan than the classic HEFT algorithm, and almost the same throughput as throughput-oriented HEFT.
arXiv Detail & Related papers (2021-10-22T01:54:10Z)
- Deep Reinforcement Learning for Wireless Scheduling in Distributed Networked Control [37.10638636086814]
We consider a joint uplink and downlink scheduling problem of a fully distributed wireless networked control system (WNCS) with a limited number of frequency channels.
We develop a deep reinforcement learning (DRL) based framework for solving it.
To tackle the challenges of a large action space in DRL, we propose novel action space reduction and action embedding methods.
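One common way to realize the action-embedding idea named above is to have the policy output a continuous "proto-action" and then execute the nearest discrete action in a low-dimensional embedding space. The sketch below illustrates only the nearest-neighbor lookup step under that assumption; the function name, the fixed embedding table, and Euclidean distance are illustrative choices, not the paper's exact method.

```python
import math

def nearest_action(proto, embeddings):
    """Map a continuous proto-action (a tuple of floats) produced by the
    policy network to the ID of the closest discrete action in an embedding
    table {action_id: embedding_vector}. This shrinks the effective action
    space the DRL agent must explore."""
    def dist(a, b):
        # Euclidean distance between two equal-length vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(embeddings, key=lambda aid: dist(proto, embeddings[aid]))

# Illustrative 2-D embeddings for three discrete scheduling actions.
embeddings = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0)}
```

In a full system the embedding table would itself be learned, so that actions with similar effects land near each other and the continuous policy generalizes across them.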
arXiv Detail & Related papers (2021-09-26T11:27:12Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Smart Scheduling based on Deep Reinforcement Learning for Cellular Networks [18.04856086228028]
We propose a smart scheduling scheme based on deep reinforcement learning (DRL).
We provide implementation-friendly designs, i.e., a scalable neural network design for the agent and a virtual environment training framework.
We show that the DRL-based smart scheduling outperforms the conventional scheduling method and can be adopted in practical systems.
arXiv Detail & Related papers (2021-03-22T02:09:16Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.