Multi-agent deep reinforcement learning (MADRL) meets multi-user MIMO
systems
- URL: http://arxiv.org/abs/2109.04986v1
- Date: Fri, 10 Sep 2021 16:50:45 GMT
- Title: Multi-agent deep reinforcement learning (MADRL) meets multi-user MIMO
systems
- Authors: Heunchul Lee, Jaeseong Jeong
- Abstract summary: We present a MADRL-based approach that jointly optimizes precoders to achieve the outer boundary, called the Pareto boundary, of the achievable rate region.
We also address a phase ambiguity issue in the conventional complex baseband representation of signals widely used in radio communications.
To the best of our knowledge, this is the first work to demonstrate that the MA-DDPG framework can jointly optimize precoders to achieve the Pareto boundary of the achievable rate region.
- Score: 0.3883460584034765
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multi-agent deep reinforcement learning (MADRL) is a promising
approach to challenging problems in wireless environments involving multiple
decision-makers (or actors) with high-dimensional continuous action spaces. In
this paper, we present a MADRL-based approach that jointly optimizes precoders
to achieve the outer boundary, called the Pareto boundary, of the achievable
rate region for a multiple-input single-output (MISO) interference channel
(IFC). To address the two main challenges of the MISO IFC setup, namely,
multiple actors (or agents) with partial observability and a multi-dimensional
continuous action space, we adopt the multi-agent deep deterministic policy
gradient (MA-DDPG) framework, in which decentralized actors with partial
observability learn a multi-dimensional continuous policy in a centralized
manner with the aid of a shared critic with global information. We also
address a phase ambiguity issue in the conventional complex baseband
representation of signals widely used in radio communications. To mitigate the
impact of phase ambiguity on training performance, we propose a training
method, called phase ambiguity elimination (PAE), that leads to faster
learning and better performance of MA-DDPG in wireless communication systems.
The simulation results show that MA-DDPG is capable of learning a near-optimal
precoding strategy in a MISO IFC environment. To the best of our knowledge,
this is the first work to demonstrate that the MA-DDPG framework can jointly
optimize precoders to achieve the Pareto boundary of the achievable rate
region in a multi-cell multi-user multi-antenna system.
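For concreteness, the two technical ingredients of the abstract can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it assumes a K-user MISO IFC with unit noise power, computes the per-user achievable rates that define the rate region from a given set of precoders, and applies one plausible reading of phase ambiguity elimination (rotating each channel vector so its first entry is real and non-negative, which removes the common-phase ambiguity of the complex baseband representation). All variable names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 2          # transmitter-receiver pairs (agents)
M = 4          # transmit antennas per base station (MISO)
sigma2 = 1.0   # receiver noise power

# Random Rayleigh channels: H[j, i] is the M-dimensional channel
# from transmitter j to receiver i in the MISO IFC.
H = (rng.standard_normal((K, K, M))
     + 1j * rng.standard_normal((K, K, M))) / np.sqrt(2)

def rates(W):
    """Per-user achievable rates (bits/s/Hz) for precoders W, shape (K, M)."""
    r = np.zeros(K)
    for i in range(K):
        signal = np.abs(np.vdot(H[i, i], W[i])) ** 2
        interference = sum(np.abs(np.vdot(H[j, i], W[j])) ** 2
                           for j in range(K) if j != i)
        r[i] = np.log2(1 + signal / (sigma2 + interference))
    return r

def eliminate_phase_ambiguity(h):
    """Rotate a channel vector so its first entry is real and non-negative.

    Removes the common-phase ambiguity: h and h * exp(j*theta) describe
    physically equivalent observations but look different to a learner.
    """
    return h * np.exp(-1j * np.angle(h[0]))

# Matched-filter (maximum-ratio) precoders under a unit power budget,
# one simple point inside the achievable rate region.
W = np.array([H[i, i] / np.linalg.norm(H[i, i]) for i in range(K)])
print(rates(W))

h_canon = eliminate_phase_ambiguity(H[0, 0])
assert np.isclose(h_canon[0].imag, 0.0) and h_canon[0].real >= 0
```

Sweeping over precoder choices and keeping the non-dominated rate tuples would trace out the Pareto boundary that the MA-DDPG agents are trained to approach.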
Related papers
- MFC-EQ: Mean-Field Control with Envelope Q-Learning for Moving Decentralized Agents in Formation [1.770056709115081]
Moving Agents in Formation (MAiF) is a variant of Multi-Agent Path Finding.
MFC-EQ is a scalable and adaptable learning framework for this bi-objective multi-agent problem.
arXiv Detail & Related papers (2024-10-15T20:59:47Z) - Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) can extend coverage; however,
deploying STAR-RISs indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z) - Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding [18.06081009550052]
Multi-Agent Reinforcement Learning (MARL) based Multi-Agent Path Finding (MAPF) has recently gained attention due to its efficiency and scalability.
Several MARL-MAPF methods choose to use communication to enrich the information one agent can perceive.
We propose a new method, Ensembling Prioritized Hybrid Policies (EPH).
arXiv Detail & Related papers (2024-03-12T11:47:12Z) - HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent
Pathfinding [16.36594480478895]
Heuristics-Informed Multi-Agent Pathfinding (HiMAP)
arXiv Detail & Related papers (2024-02-23T13:01:13Z) - Effective Multi-Agent Deep Reinforcement Learning Control with Relative
Entropy Regularization [6.441951360534903]
Multi-Agent Continuous Dynamic Policy Gradient (MACDPP) was proposed to tackle the issues of limited capability and sample efficiency in various scenarios controlled by multiple agents.
It alleviates the inconsistency of multiple agents' policy updates by introducing relative entropy regularization into the Centralized Training with Decentralized Execution (CTDE) framework with the Actor-Critic (AC) structure.
arXiv Detail & Related papers (2023-09-26T07:38:19Z) - Large AI Model Empowered Multimodal Semantic Communications [48.73159237649128]
We propose a Large AI Model-based Multimodal SC (LAMMSC) framework.
We first present the Conditional-based Multimodal Alignment (MMA) that enables the transformation between multimodal and unimodal data.
Then, a personalized LLM-based Knowledge Base (LKB) is proposed, which allows users to perform personalized semantic extraction or recovery.
Finally, we apply the Generative adversarial network-based channel Estimation (CGE) for estimating the wireless channel state information.
arXiv Detail & Related papers (2023-09-03T19:24:34Z) - Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT
Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement
Learning Approach [56.84948632954274]
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z) - FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.