Scheduling and Power Control for Wireless Multicast Systems via Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2011.14799v1
- Date: Sun, 27 Sep 2020 15:59:44 GMT
- Title: Scheduling and Power Control for Wireless Multicast Systems via Deep
Reinforcement Learning
- Authors: Ramkumar Raghu, Mahadesh Panju, Vaneet Aggarwal and Vinod Sharma
- Abstract summary: Multicasting in wireless systems is a way to exploit the redundancy in user requests in a Content Centric Network.
Power control and optimal scheduling can significantly improve the wireless multicast network's performance under fading.
We show that a power control policy can be learnt for reasonably large systems via this approach.
- Score: 33.737301955006345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multicasting in wireless systems is a natural way to exploit the redundancy
in user requests in a Content Centric Network. Power control and optimal
scheduling can significantly improve the wireless multicast network's
performance under fading. However, the model-based approaches for power control
and scheduling studied earlier are not scalable to large state spaces or
changing system dynamics. In this paper, we use deep reinforcement learning
where we use function approximation of the Q-function via a deep neural network
to obtain a power control policy that matches the optimal policy for a small
network. We show that a power control policy can be learnt for reasonably large
systems via this approach. Further, we use multi-timescale stochastic
optimization to maintain the average power constraint. We demonstrate that a
slight modification of the learning algorithm allows tracking of time-varying
system statistics. Finally, we extend the multi-timescale approach to
simultaneously learn the optimal queueing strategy along with power control. We
demonstrate scalability, tracking and cross layer optimization capabilities of
our algorithms via simulations. The proposed multi-timescale approach can be
used in general large state space dynamical systems with multiple objectives
and constraints, and may be of independent interest.
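To make the approach concrete, the sketch below shows a minimal two-timescale loop in the spirit of the abstract: a small deep Q-network chooses a discrete transmit-power level for a toy fading multicast link, and a Lagrange multiplier is updated on a slower timescale to push long-run average power toward a target. The environment model (`step_env`), network sizes, and all hyper-parameters are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: two-timescale DQN-style power control for a toy multicast
# downlink. Everything here (ToyMulticast-style dynamics, constants) is an
# assumption for illustration, not the authors' implementation.
import numpy as np
import torch
import torch.nn as nn

N_USERS = 4                                  # assumed small network size
POWER_LEVELS = np.linspace(0.0, 1.0, 5)      # discrete transmit-power actions
P_AVG_TARGET = 0.4                           # assumed average-power constraint


class QNet(nn.Module):
    """Deep network approximating Q(state, power-level)."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, s):
        return self.net(s)


def step_env(state, power, rng):
    """Toy stand-in for the fading multicast channel: reward grows with served
    demand (queue x fading gain x power) minus a backlog penalty."""
    channel = rng.rayleigh(scale=1.0, size=N_USERS)       # fading gains
    served = np.minimum(state, power * channel)           # crude service model
    next_state = np.clip(state - served + rng.poisson(0.3, N_USERS), 0, 10)
    reward = served.sum() - 0.1 * next_state.sum()
    return next_state.astype(np.float32), float(reward)


def train(episodes=200, horizon=100, gamma=0.95, seed=0):
    rng = np.random.default_rng(seed)
    qnet = QNet(N_USERS, len(POWER_LEVELS))
    opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)    # fast timescale: Q-learning
    lam, lam_lr = 0.0, 1e-2                               # slow timescale: Lagrange multiplier
    for ep in range(episodes):
        state = rng.integers(0, 5, N_USERS).astype(np.float32)
        eps = max(0.05, 1.0 - ep / episodes)              # epsilon-greedy exploration
        avg_power = 0.0
        for _ in range(horizon):
            s = torch.tensor(state)
            if rng.random() < eps:
                a = int(rng.integers(len(POWER_LEVELS)))
            else:
                a = int(qnet(s).argmax())
            power = POWER_LEVELS[a]
            next_state, reward = step_env(state, power, rng)
            # Constrained objective: reward minus lambda-weighted power cost.
            lagrangian_reward = reward - lam * power
            with torch.no_grad():
                target = lagrangian_reward + gamma * qnet(torch.tensor(next_state)).max()
            loss = (qnet(s)[a] - target) ** 2             # one-step TD error
            opt.zero_grad(); loss.backward(); opt.step()
            avg_power += power / horizon
            state = next_state
        # Slower timescale: dual ascent on the average-power constraint.
        lam = max(0.0, lam + lam_lr * (avg_power - P_AVG_TARGET))
    return qnet, lam


if __name__ == "__main__":
    qnet, lam = train()
    print("learned Lagrange multiplier:", round(float(lam), 3))
```

Under these assumptions, the multiplier settles near a value at which the long-run average transmit power meets the target; the same multi-timescale pattern would extend to jointly learning a scheduling/queueing action by enlarging the discrete action set.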
Related papers
- Optimizing Load Scheduling in Power Grids Using Reinforcement Learning and Markov Decision Processes [0.0]
This paper proposes a reinforcement learning (RL) approach to address the challenges of dynamic load scheduling.
Our results show that the RL-based method provides a robust and scalable solution for real-time load scheduling.
arXiv Detail & Related papers (2024-10-23T09:16:22Z) - Differentiable Discrete Event Simulation for Queuing Network Control [7.965453961211742]
Queueing network control poses distinct challenges, including high stochasticity, large state and action spaces, and lack of stability.
We propose a scalable framework for policy optimization based on differentiable discrete event simulation.
Our methods can flexibly handle realistic scenarios, including systems operating in non-stationary environments.
arXiv Detail & Related papers (2024-09-05T17:53:54Z) - Multi-agent Reinforcement Learning with Graph Q-Networks for Antenna
Tuning [60.94661435297309]
The scale of mobile networks makes it challenging to optimize antenna parameters using manual intervention or hand-engineered strategies.
We propose a new multi-agent reinforcement learning algorithm to optimize mobile network configurations globally.
We empirically demonstrate the performance of the algorithm on an antenna tilt tuning problem and a joint tilt and power control problem in a simulated environment.
arXiv Detail & Related papers (2023-01-20T17:06:34Z) - Cell-Free Data Power Control Via Scalable Multi-Objective Bayesian
Optimisation [0.0]
Cell-free multi-user multiple input multiple output networks are a promising alternative to classical cellular architectures.
Previous works have developed radio resource management mechanisms using various optimisation engines.
We consider the problem of overall ergodic spectral efficiency maximisation in the context of uplink-downlink data power control in cell-free networks.
arXiv Detail & Related papers (2022-12-20T14:46:44Z) - Distributed-Training-and-Execution Multi-Agent Reinforcement Learning
for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
arXiv Detail & Related papers (2022-12-15T17:01:56Z) - Hierarchical Multi-Agent DRL-Based Framework for Joint Multi-RAT
Assignment and Dynamic Resource Allocation in Next-Generation HetNets [21.637440368520487]
This paper considers the problem of cost-aware downlink sum-rate maximization via joint optimal radio access technologies (RATs) assignment and power allocation in next-generation heterogeneous wireless networks (HetNets).
We propose a hierarchical multi-agent deep reinforcement learning (DRL) framework, called DeepRAT, to solve it efficiently and learn system dynamics.
In particular, the DeepRAT framework decomposes the problem into two main stages: the RATs-EDs assignment stage, which implements a single-agent Deep Q Network algorithm, and the power allocation stage, which utilizes a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm.
arXiv Detail & Related papers (2022-02-28T09:49:44Z) - Learning to Continuously Optimize Wireless Resource in a Dynamic
Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - Deep Actor-Critic Learning for Distributed Power Control in Wireless
Mobile Networks [5.930707872313038]
Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization.
We present a distributively executed continuous power control algorithm with the help of deep actor-critic learning.
We integrate the proposed power control algorithm to a time-slotted system where devices are mobile and channel conditions change rapidly.
arXiv Detail & Related papers (2020-09-14T18:29:12Z) - Learning High-Level Policies for Model Predictive Control [54.00297896763184]
Model Predictive Control (MPC) provides robust solutions to robot control tasks.
We propose a self-supervised learning algorithm for learning a neural network high-level policy.
We show that our approach can handle situations that are difficult for standard MPC.
arXiv Detail & Related papers (2020-07-20T17:12:34Z) - Online Reinforcement Learning Control by Direct Heuristic Dynamic
Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)