Distributed-Training-and-Execution Multi-Agent Reinforcement Learning
for Power Control in HetNet
- URL: http://arxiv.org/abs/2212.07967v1
- Date: Thu, 15 Dec 2022 17:01:56 GMT
- Title: Distributed-Training-and-Execution Multi-Agent Reinforcement Learning
for Power Control in HetNet
- Authors: Kaidi Xu, Nguyen Van Huynh, Geoffrey Ye Li
- Abstract summary: We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
- Score: 48.96004919910818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In heterogeneous networks (HetNets), the overlap of small cells and the macro
cell causes severe cross-tier interference. Although there exist some
approaches to address this problem, they usually require global channel state
information, which is hard to obtain in practice, and yield only sub-optimal
power allocation policies at high computational complexity. To overcome these
limitations, we propose a multi-agent deep reinforcement learning (MADRL) based
power control scheme for the HetNet, where each access point makes power
control decisions independently based on local information. To promote
cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm
for MADRL systems. By introducing regularization terms in the loss function,
each agent tends to choose an experienced action with high reward when
revisiting a state, and thus the policy updating speed slows down. In this way,
an agent's policy can be learned by other agents more easily, resulting in a
more efficient collaboration process. We then implement the proposed PQL in the
considered HetNet and compare it with other distributed-training-and-execution
(DTE) algorithms. Simulation results show that our proposed PQL can learn the
desired power control policy from a dynamic environment where the locations of
users change episodically and outperform existing DTE MADRL algorithms.
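The penalty idea described in the abstract can be illustrated with a small sketch. The paper's exact regularizer is not reproduced here; the tabular Q-learning update below, with a hypothetical penalty weight `lam` and a per-state best-reward memory `best_reward` (both assumptions of this sketch), shows one way a regularization term can bias an agent toward its best experienced action on revisits and thereby slow policy updates, in the spirit of PQL:

```python
import numpy as np

def pql_update(Q, s, a, r, s_next, best_reward, alpha=0.1, gamma=0.9, lam=0.5):
    """One penalty-augmented Q-learning step (illustrative sketch only).

    Standard TD update plus a penalty that pulls Q(s, a) toward the best
    reward already experienced in state s, so a revisited state keeps
    favoring its high-reward action and the policy changes more slowly.
    """
    # Standard temporal-difference target and error.
    td_target = r + gamma * np.max(Q[s_next])
    td_error = td_target - Q[s, a]
    # Penalty term: discourage drifting away from the best experienced
    # reward in state s (this slows policy updates on revisited states).
    penalty = best_reward[s] - Q[s, a]
    Q[s, a] += alpha * (td_error + lam * penalty)
    # Remember the best reward seen so far in state s.
    best_reward[s] = max(best_reward[s], r)
    return Q, best_reward
```

With `lam = 0` this reduces to ordinary Q-learning; increasing `lam` strengthens the pull toward experienced high-reward actions, making each agent's policy more stable and hence easier for other agents to learn.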
Related papers
- State and Action Factorization in Power Grids [47.65236082304256]
We propose a domain-agnostic algorithm that estimates correlations between state and action components entirely based on data.
The algorithm is validated on a power grid benchmark obtained with the Grid2Op simulator.
arXiv Detail & Related papers (2024-09-03T15:00:58Z)
- Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, while a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) can reshape the propagation environment.
Deploying STAR-RISs indoors, however, presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z)
- Deployable Reinforcement Learning with Variable Control Rate [14.838483990647697]
We propose SEAC, a variant of Reinforcement Learning (RL) with a variable control rate.
In this approach, the policy decides both the action the agent should take and the duration of the time step associated with that action.
We show the efficacy of SEAC through a proof-of-concept simulation driving an agent with Newtonian kinematics.
arXiv Detail & Related papers (2024-01-17T15:40:11Z)
- Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach [1.0080317855851213]
We consider the problem of network parameter optimization.
We show that a policy suitable for real-world deployment can be learned from previously collected data alone, without online exploration.
arXiv Detail & Related papers (2023-10-12T18:36:36Z)
- Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems [12.327745531583277]
We name it REC-MARL standing for REward-Coupled Multi-Agent Reinforcement Learning.
REC-MARL has a range of important applications such as real-time access control and distributed power control in wireless networks.
arXiv Detail & Related papers (2022-12-13T03:44:00Z)
- Computation Offloading and Resource Allocation in F-RANs: A Federated Deep Reinforcement Learning Approach [67.06539298956854]
Fog radio access network (F-RAN) is a promising technology in which user mobile devices (MDs) can offload computation tasks to nearby fog access points (F-APs).
arXiv Detail & Related papers (2022-06-13T02:19:20Z)
- Hierarchical Multi-Agent DRL-Based Framework for Joint Multi-RAT Assignment and Dynamic Resource Allocation in Next-Generation HetNets [21.637440368520487]
This paper considers the problem of cost-aware downlink sum-rate maximization via joint optimal radio access technology (RAT) assignment and power allocation in next-generation heterogeneous wireless networks (HetNets).
We propose a hierarchical multi-agent deep reinforcement learning (DRL) framework, called DeepRAT, to solve the problem efficiently and learn the system dynamics.
In particular, the DeepRAT framework decomposes the problem into two main stages: the RATs-EDs assignment stage, which implements a single-agent Deep Q-Network (DQN) algorithm, and the power allocation stage, which utilizes a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm.
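The two-stage decomposition above can be sketched as plain control flow; `assign_rats` and `allocate_power` below are hypothetical stand-ins for the DQN and multi-agent DDPG stages, included only to show how a discrete assignment decision precedes a continuous power decision:

```python
import random

def assign_rats(eds, rats):
    # Stage 1 stand-in (single-agent DQN in the paper): pick a RAT per
    # end device (ED). Random choice replaces the learned policy here.
    return {ed: random.choice(rats) for ed in eds}

def allocate_power(assignment, p_max=1.0):
    # Stage 2 stand-in (multi-agent DDPG in the paper): a continuous
    # transmit power per ED, bounded by the power budget p_max.
    return {ed: random.uniform(0.0, p_max) for ed in assignment}

def deeprat_step(eds, rats):
    # Hierarchical order: coarse discrete assignment first, then
    # fine-grained continuous power allocation conditioned on it.
    assignment = assign_rats(eds, rats)
    powers = allocate_power(assignment)
    return assignment, powers
```

The point of the hierarchy is that the discrete stage fixes the network topology for the continuous stage, so each learner faces a smaller, better-conditioned decision problem.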
arXiv Detail & Related papers (2022-02-28T09:49:44Z)
- Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular Networks [82.02891936174221]
Collaborative deep reinforcement learning (CDRL), in which multiple agents coordinate over a wireless network, is a promising approach.
In this paper, a novel semantic-aware CDRL method is proposed to enable a group of untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network.
arXiv Detail & Related papers (2021-11-23T18:24:47Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Deep Actor-Critic Learning for Distributed Power Control in Wireless Mobile Networks [5.930707872313038]
Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization.
We present a distributively executed continuous power control algorithm with the help of deep actor-critic learning.
We integrate the proposed power control algorithm to a time-slotted system where devices are mobile and channel conditions change rapidly.
arXiv Detail & Related papers (2020-09-14T18:29:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.