Multi-agent deep reinforcement learning (MADRL) meets multi-user MIMO
systems
- URL: http://arxiv.org/abs/2109.04986v1
- Date: Fri, 10 Sep 2021 16:50:45 GMT
- Title: Multi-agent deep reinforcement learning (MADRL) meets multi-user MIMO
systems
- Authors: Heunchul Lee, Jaeseong Jeong
- Abstract summary: We present a MADRL-based approach that can jointly optimize precoders to achieve the outer boundary, called the Pareto boundary, of the achievable rate region.
We also address a phase ambiguity issue with the conventional complex baseband representation of signals widely used in radio communications.
To the best of our knowledge, this is the first work to demonstrate that the MA-DDPG framework can jointly optimize precoders to achieve the Pareto boundary of the achievable rate region.
- Score: 0.3883460584034765
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multi-agent deep reinforcement learning (MADRL) is a promising approach to
challenging problems in wireless environments involving multiple
decision-makers (or actors) with high-dimensional continuous action space. In
this paper, we present a MADRL-based approach that can jointly optimize
precoders to achieve the outer boundary, called the Pareto boundary, of the
achievable rate region for a multiple-input single-output (MISO) interference
channel (IFC). In order to address two main challenges, namely, multiple actors
(or agents) with partial observability and multi-dimensional continuous action
space in the MISO IFC setup, we adopt a multi-agent deep deterministic policy
gradient (MA-DDPG) framework in which decentralized actors with partial
observability can learn a multi-dimensional continuous policy in a centralized
manner with the aid of a shared critic with global information. Meanwhile, we
also address a phase ambiguity issue with the conventional complex
baseband representation of signals widely used in radio communications. In
order to mitigate the impact of phase ambiguity on training performance, we
propose a training method, called phase ambiguity elimination (PAE), that leads
to faster learning and better performance of MA-DDPG in wireless communication
systems. The simulation results show that MA-DDPG is capable of learning a
near-optimal precoding strategy in a MISO IFC environment. To the best of our
knowledge, this is the first work to demonstrate that the MA-DDPG framework can
jointly optimize precoders to achieve the Pareto boundary of the achievable rate
region in a multi-cell multi-user multi-antenna system.
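The phase ambiguity mentioned in the abstract can be illustrated with a small sketch. It assumes a two-link MISO IFC with single-antenna receivers, where each rate depends on a precoder only through terms of the form |h^H w|^2, so rotating a precoder by a common phase leaves the rate unchanged. The channel values, the function names, and the "first entry real and non-negative" convention are all made up for illustration; the abstract does not specify how PAE works, so this is not the authors' method.

```python
import cmath
import math


def inner(h, w):
    """Return h^H w for two equal-length complex vectors."""
    return sum(hk.conjugate() * wk for hk, wk in zip(h, w))


def link_rate(h_direct, h_cross, w_own, w_other, noise_power=1.0):
    """Rate in bits/s/Hz of one MISO link with a single interferer."""
    signal = abs(inner(h_direct, w_own)) ** 2
    interference = abs(inner(h_cross, w_other)) ** 2
    return math.log2(1.0 + signal / (noise_power + interference))


def rotate(w, theta):
    """Multiply every entry of w by the common phase factor exp(j*theta)."""
    return [wk * cmath.exp(1j * theta) for wk in w]


def canonical_phase(w):
    """Rotate w so its first entry is real and non-negative.

    Hypothetical convention chosen for this sketch, not taken from the paper.
    """
    return rotate(w, -cmath.phase(w[0]))


# Made-up channels and precoders for a two-link setup with two antennas.
h11 = [1 + 1j, 0.5 - 0.2j]   # transmitter 1 -> receiver 1
h21 = [0.3 + 0.1j, -0.4j]    # transmitter 2 -> receiver 1
w1 = [0.6 + 0.2j, 0.1 - 0.7j]
w2 = [0.5j, 0.8 + 0j]

w1_rotated = rotate(w1, 1.234)
rate_original = link_rate(h11, h21, w1, w2)
rate_rotated = link_rate(h11, h21, w1_rotated, w2)
```

Here `rate_original` and `rate_rotated` agree up to floating-point error, and `canonical_phase` maps `w1` and `w1_rotated` to the same vector. Under these assumptions, many distinct actions earn the same reward, which is one plausible reason a fixed phase convention could help training.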
Related papers
- Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding [18.06081009550052]
Multi-Agent Reinforcement Learning (MARL) based Multi-Agent Path Finding (MAPF) has recently gained attention due to its efficiency and scalability.
Several MARL-MAPF methods choose to use communication to enrich the information one agent can perceive.
We propose a new method, Ensembling Prioritized Hybrid Policies (EPH)
arXiv Detail & Related papers (2024-03-12T11:47:12Z)
- HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent
Pathfinding [16.36594480478895]
Heuristics-Informed Multi-Agent Pathfinding (HiMAP)
arXiv Detail & Related papers (2024-02-23T13:01:13Z)
- Effective Multi-Agent Deep Reinforcement Learning Control with Relative
Entropy Regularization [6.441951360534903]
Multi-Agent Continuous Dynamic Policy Gradient (MACDPP) was proposed to tackle the issues of limited capability and sample efficiency in various scenarios controlled by multiple agents.
It alleviates the inconsistency of multiple agents' policy updates by introducing relative entropy regularization to the Centralized Training with Decentralized Execution (CTDE) framework with the Actor-Critic (AC) structure.
arXiv Detail & Related papers (2023-09-26T07:38:19Z)
- Large AI Model Empowered Multimodal Semantic Communications [51.17527319441436]
We propose a Large AI Model-based Multimodal SC (LAM-MSC) framework.
We first present the SC-based Multimodal Alignment (MMA)
Then, a personalized LLM-based Knowledge Base (LKB) is proposed.
Finally, we apply the Conditional Generative adversarial networks-based channel Estimation (CGE) to obtain Channel State Information (CSI)
arXiv Detail & Related papers (2023-09-03T19:24:34Z)
- SACHA: Soft Actor-Critic with Heuristic-Based Attention for Partially
Observable Multi-Agent Path Finding [3.4260993997836753]
We propose a novel multi-agent actor-critic method called Soft Actor-Critic with Heuristic-Based Attention (SACHA)
SACHA learns a neural network for each agent to selectively pay attention to the shortest path guidance from multiple agents within its field of view.
We demonstrate decent improvements over several state-of-the-art learning-based MAPF methods with respect to success rate and solution quality.
arXiv Detail & Related papers (2023-07-05T23:36:33Z)
- Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT
Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement
Learning Approach [56.84948632954274]
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Deep Multi-Task Learning for Cooperative NOMA: System Design and
Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL)
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC)
It is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.