Deep Reinforcement Learning for Uplink Scheduling in NOMA-URLLC Networks
- URL: http://arxiv.org/abs/2308.14523v1
- Date: Mon, 28 Aug 2023 12:18:02 GMT
- Title: Deep Reinforcement Learning for Uplink Scheduling in NOMA-URLLC Networks
- Authors: Benoît-Marie Robaglia, Marceau Coupechoux, Dimitrios Tsilimantos
- Abstract summary: This article addresses the problem of Ultra Reliable Low Latency Communications (URLLC) in wireless networks, a framework with particularly stringent constraints imposed by many Internet of Things (IoT) applications from diverse sectors.
We propose a novel Deep Reinforcement Learning (DRL) scheduling algorithm, to solve the Non-Orthogonal Multiple Access (NOMA) uplink URLLC scheduling problem involving strict deadlines.
- Score: 7.182684187774442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article addresses the problem of Ultra Reliable Low Latency
Communications (URLLC) in wireless networks, a framework with particularly
stringent constraints imposed by many Internet of Things (IoT) applications
from diverse sectors. We propose a novel Deep Reinforcement Learning (DRL)
scheduling algorithm, named NOMA-PPO, to solve the Non-Orthogonal Multiple
Access (NOMA) uplink URLLC scheduling problem involving strict deadlines. The
challenge of addressing uplink URLLC requirements in NOMA systems stems from
the combinatorial complexity of the action space, due to the possibility of
scheduling multiple devices simultaneously, and from the partial observability
constraint that we impose on our algorithm in order to meet the IoT
communication constraints and remain scalable. Our approach involves 1)
formulating the NOMA-URLLC problem as a Partially Observable Markov Decision
Process (POMDP) and introducing an agent state, serving as a sufficient
statistic of past observations and actions, that enables a transformation of
the POMDP into a Markov Decision Process (MDP); 2) adapting the Proximal
Policy Optimization (PPO) algorithm to handle the combinatorial action space;
3) incorporating prior knowledge into the learning agent through a Bayesian
policy. Numerical results reveal that our approach not only outperforms
traditional multiple access protocols and DRL benchmarks on 3GPP scenarios,
but also proves robust under various channel and traffic configurations,
efficiently exploiting inherent time correlations.
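To make the combinatorial-action challenge concrete, below is a minimal sketch of one standard way to handle it: factorize the "which devices to schedule" action into per-device Bernoulli decisions, so the log-probability of a device subset is a sum of per-device terms that plugs directly into the PPO clipped objective. All names (FactorizedSchedulingPolicy, agent_state, n_devices) are illustrative assumptions; the paper's exact NOMA-PPO architecture and its Bayesian prior are not reproduced here.

```python
# Illustrative sketch (not the paper's exact NOMA-PPO): a PPO policy head
# that factorizes the combinatorial scheduling action over n_devices into
# independent Bernoulli decisions conditioned on the agent state.
import torch
import torch.nn as nn

class FactorizedSchedulingPolicy(nn.Module):
    def __init__(self, agent_state_dim: int, n_devices: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(agent_state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, n_devices),  # one scheduling logit per device
        )

    def forward(self, agent_state: torch.Tensor) -> torch.distributions.Bernoulli:
        return torch.distributions.Bernoulli(logits=self.net(agent_state))

def ppo_clip_loss(policy, agent_state, action, old_log_prob, advantage, eps=0.2):
    # log pi(a|s) of a device subset = sum of per-device Bernoulli log-probs,
    # which keeps the loss tractable despite 2^n_devices possible actions.
    log_prob = policy(agent_state).log_prob(action).sum(dim=-1)
    ratio = torch.exp(log_prob - old_log_prob)
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    return -torch.min(ratio * advantage, clipped * advantage).mean()
```

Sampling from the Bernoulli head yields a 0/1 mask over devices, i.e. the subset granted uplink access in the next slot.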
Related papers
- Compiler for Distributed Quantum Computing: a Reinforcement Learning Approach [6.347685922582191]
We introduce a novel compiler that prioritizes reducing the expected execution time by jointly managing the generation and routing of EPR pairs.
We present a real-time, adaptive approach to compiler design, accounting for the nature of entanglement generation and the operational demands of quantum circuits.
Our contributions are twofold: (i) we model the optimal compiler for DQC using a Markov Decision Process (MDP) formulation, establishing the existence of an optimal algorithm, and (ii) we introduce a constrained Reinforcement Learning (RL) method to approximate this optimal compiler.
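As a hedged illustration of the constrained-RL ingredient (not the paper's actual compiler), a Lagrangian relaxation is one common way to approximate such problems: penalize the reward with a multiplier on the constraint cost and update the multiplier by dual ascent. Names and the `budget` parameter are illustrative assumptions.

```python
# Illustrative Lagrangian-relaxation step for constrained RL; the paper's
# own method may differ. `budget` is the allowed average constraint cost.
def lagrangian_reward(reward: float, cost: float, lam: float) -> float:
    return reward - lam * cost  # penalized reward fed to a standard RL agent

def dual_ascent(lam: float, avg_cost: float, budget: float, lr: float = 0.01) -> float:
    return max(0.0, lam + lr * (avg_cost - budget))  # keep multiplier >= 0
```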
arXiv Detail & Related papers (2024-04-25T23:03:20Z)
- Multi-Agent Reinforcement Learning for Network Routing in Integrated Access Backhaul Networks [0.0]
We aim to maximize packet arrival ratio while minimizing their latency in IAB networks.
To solve this problem, we develop a multi-agent partially observable Markov decision process (POMDP) formulation.
We show that A2C outperforms other reinforcement learning algorithms, leading to increased network efficiency and reduced selfish agent behavior.
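For reference, a minimal A2C-style loss is sketched below under generic assumptions; the paper's network architecture and multi-agent details are not shown.

```python
# Illustrative A2C loss: policy gradient weighted by the advantage plus a
# value-function regression term; entirely generic, not the paper's code.
import torch

def a2c_loss(log_prob, value, ret, value_coef=0.5):
    advantage = ret - value            # advantage estimate A = R - V(s)
    policy_loss = -(log_prob * advantage.detach()).mean()
    value_loss = (advantage ** 2).mean()
    return policy_loss + value_coef * value_loss
```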
arXiv Detail & Related papers (2023-05-12T13:03:26Z)
- Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning [17.04643707688075]
We consider a continuum of constraints instead of a finite number of constraints as in the case of ordinary CMDPs.
We devise two reinforcement learning algorithms for SICMDPs that we call SI-CRL and SI-CPO.
To the best of our knowledge, we are the first to apply tools from semi-infinite programming (SIP) to solve constrained reinforcement learning problems.
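One generic way to cope with a continuum of constraints, sketched here purely as an assumption-laden illustration (not SI-CRL or SI-CPO themselves), is to sample the constraint index set and penalize the most violated sampled constraint:

```python
# Illustrative handling of a semi-infinite constraint c(pi, y) <= 0 for all
# y in Y: sample Y, find the most violated constraint, and penalize it.
import numpy as np

def most_violated(constraint_fn, y_samples):
    violations = np.array([constraint_fn(y) for y in y_samples])
    worst = int(np.argmax(violations))
    return y_samples[worst], violations[worst]

ys = np.random.uniform(0.0, 1.0, size=128)             # sampled constraint indices
y_star, v = most_violated(lambda y: y ** 2 - 0.5, ys)  # toy constraint function
```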
arXiv Detail & Related papers (2023-04-29T12:52:38Z)
- Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q-learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
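To give a flavor of penalty-based value updates, here is a generic tabular sketch; the exact penalty design of the paper's PQL algorithm is not reproduced, and `penalty` is an illustrative stand-in.

```python
# Illustrative tabular Q-learning update with a cooperation penalty
# subtracted from the reward; Q is a (n_states, n_actions) numpy array.
import numpy as np

def penalized_q_update(Q, s, a, r, s_next, penalty, alpha=0.1, gamma=0.99):
    target = (r - penalty) + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```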
arXiv Detail & Related papers (2022-12-15T17:01:56Z)
- State-Augmented Learnable Algorithms for Resource Management in Wireless Networks [124.89036526192268]
We propose a state-augmented algorithm for solving resource management problems in wireless networks.
We show that the proposed algorithm leads to feasible and near-optimal radio resource management (RRM) decisions.
arXiv Detail & Related papers (2022-07-05T18:02:54Z)
- How to Minimize the Weighted Sum AoI in Two-Source Status Update Systems: OMA or NOMA? [12.041266020039822]
Two independent sources send update packets to a common destination node in a time-slotted manner under the limit of maximum retransmission rounds.
Different multiple access schemes are exploited here over a block-fading multiple access channel (MAC).
Online reinforcement learning approaches are proposed to achieve near-optimal age performance.
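The Age of Information (AoI) metric this entry optimizes follows the standard slot-by-slot recursion, sketched below (a textbook definition, not code from the paper):

```python
def aoi_step(age: int, delivered: bool) -> int:
    # Standard AoI recursion: reset when a fresh update is delivered,
    # otherwise the age grows by one slot.
    return 1 if delivered else age + 1
```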
arXiv Detail & Related papers (2022-05-06T11:18:43Z)
- Fidelity-Guarantee Entanglement Routing in Quantum Networks [64.49733801962198]
Entanglement routing establishes a remote entanglement connection between two arbitrary nodes.
We propose purification-enabled entanglement routing designs to provide fidelity guarantee for multiple Source-Destination (SD) pairs in quantum networks.
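As a rough illustration of why purification can provide a fidelity guarantee, a commonly used simplified model combines two pairs of fidelity f into one pair of higher fidelity; this toy formula is an assumption for illustration and may differ from the paper's protocol.

```python
def purified_fidelity(f: float) -> float:
    # Simplified two-pair purification model: the output fidelity exceeds f
    # whenever f > 0.5 (toy model, not necessarily the paper's protocol).
    return f * f / (f * f + (1.0 - f) * (1.0 - f))
```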
arXiv Detail & Related papers (2021-11-15T14:07:22Z)
- Deep Reinforcement Learning for Wireless Scheduling in Distributed Networked Control [37.10638636086814]
We consider a joint uplink and downlink scheduling problem of a fully distributed wireless networked control system (WNCS) with a limited number of frequency channels.
We develop a deep reinforcement learning (DRL) based framework for solving it.
To tackle the challenges of a large action space in DRL, we propose novel action space reduction and action embedding methods.
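The action-embedding idea can be sketched generically as follows, in the spirit of prior work on large discrete action spaces (Dulac-Arnold et al.); the paper's own reduction and embedding methods may differ, and all names are illustrative.

```python
# Illustrative action embedding: the agent emits a continuous proto-action,
# which is snapped to the nearest discrete action in an embedding table.
import numpy as np

def nearest_action(proto_action, action_embeddings):
    dists = np.linalg.norm(action_embeddings - proto_action, axis=1)
    return int(np.argmin(dists))

table = np.random.randn(1024, 16)  # 1024 discrete actions, 16-d embeddings
a = nearest_action(np.random.randn(16), table)
```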
arXiv Detail & Related papers (2021-09-26T11:27:12Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy-gradient-based reinforcement learning algorithm that produces a scheduler outperforming the available atomic policies.
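A hedged sketch of the "improper" mixing idea: maintain softmax weights over K fixed atomic schedulers and update them with a score-function gradient. The paper's exact estimator may differ; names are illustrative.

```python
# Illustrative improper policy mixture over K atomic scheduling policies.
import numpy as np

def sample_mixture(theta, rng):
    w = np.exp(theta - theta.max()); w /= w.sum()  # softmax mixture weights
    k = rng.choice(len(theta), p=w)                # pick one atomic scheduler
    grad_log = -w; grad_log[k] += 1.0              # gradient of log w_k
    return k, grad_log

theta = np.zeros(4)                                # 4 atomic schedulers
k, g = sample_mixture(theta, np.random.default_rng(0))
theta += 0.01 * 1.0 * g                            # REINFORCE step, return = 1.0
```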
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- RIS Enhanced Massive Non-orthogonal Multiple Access Networks: Deployment and Passive Beamforming Design [116.88396201197533]
A novel framework is proposed for the deployment and passive beamforming design of a reconfigurable intelligent surface (RIS).
The problem of joint deployment, phase shift design, as well as power allocation is formulated for maximizing the energy efficiency.
A novel long short-term memory (LSTM) based echo state network (ESN) algorithm is proposed to predict users' tele-traffic demand by leveraging a real dataset.
A decaying double deep Q-network (D3QN) based position-acquisition and phase-control algorithm is proposed to solve the joint problem of deployment and design of the RIS.
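For context, the double-DQN target at the heart of D3QN-style agents is sketched below; this is generic, and the paper's "decaying" schedule and exact architecture are not shown.

```python
# Illustrative double DQN target: the online network selects the action,
# the target network evaluates it, which reduces overestimation bias.
import torch

def double_dqn_target(reward, next_state, done, online_net, target_net, gamma=0.99):
    with torch.no_grad():
        best = online_net(next_state).argmax(dim=1, keepdim=True)
        q_next = target_net(next_state).gather(1, best).squeeze(1)
        return reward + gamma * (1.0 - done) * q_next
```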
arXiv Detail & Related papers (2020-01-28T14:37:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.