Multi-Agent Deep Reinforcement Learning Under Constrained Communications
- URL: http://arxiv.org/abs/2601.17069v1
- Date: Thu, 22 Jan 2026 21:07:18 GMT
- Title: Multi-Agent Deep Reinforcement Learning Under Constrained Communications
- Authors: Shahil Shaik, Jonathon M. Smereka, Yue Wang
- Abstract summary: We present a distributed multi-agent reinforcement learning (MARL) framework that removes the need for centralized critics or global information. We develop a novel Distributed Graph Attention Network (D-GAT) that performs global state inference through multi-hop communication. We also develop the distributed graph-attention MAPPO (DG-MAPPO) -- a distributed MARL framework where agents optimize local policies and value functions.
- Score: 2.7126292487109005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Centralized training with decentralized execution (CTDE) has been the dominant paradigm in multi-agent reinforcement learning (MARL), but its reliance on global state information during training introduces scalability, robustness, and generalization bottlenecks. Moreover, in practical scenarios such as adding or dropping teammates, or facing environment dynamics that differ from training, CTDE methods can be brittle and costly to retrain, whereas distributed approaches allow agents to adapt using only local information and peer-to-peer communication. We present a distributed MARL framework that removes the need for centralized critics or global information. First, we develop a novel Distributed Graph Attention Network (D-GAT) that performs global state inference through multi-hop communication, where agents integrate neighbor features via input-dependent attention weights in a fully distributed manner. Leveraging D-GAT, we develop the distributed graph-attention MAPPO (DG-MAPPO) -- a distributed MARL framework where agents optimize local policies and value functions using local observations, multi-hop communication, and shared/averaged rewards. Empirical evaluation on the StarCraft II Multi-Agent Challenge, Google Research Football, and Multi-Agent MuJoCo demonstrates that our method consistently outperforms strong CTDE baselines, achieving superior coordination across a wide range of cooperative tasks with both homogeneous and heterogeneous teams. Our distributed MARL framework provides a principled and scalable solution for robust collaboration, eliminating the need for centralized training or global observability. To the best of our knowledge, DG-MAPPO is the first to fully eliminate reliance on privileged centralized information, enabling agents to learn and act solely through peer-to-peer communication.
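As an illustrative sketch of the core mechanism (not the paper's actual D-GAT architecture), global state inference via multi-hop attention can be pictured as agents repeatedly mixing their own feature with neighbors' features using input-dependent softmax weights. The graph, feature size, and projection matrix `W` below are all hypothetical:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_round(features, neighbors, W):
    """One communication round: each agent attends over its neighbors'
    (and its own) current features with input-dependent weights."""
    new = {}
    for i, feat in features.items():
        ids = neighbors[i] + [i]                   # include self-loop
        nbr = np.stack([features[j] for j in ids])  # (k, d)
        scores = nbr @ (W @ feat)                   # input-dependent scores
        alpha = softmax(scores)                     # attention weights
        new[i] = alpha @ nbr                        # weighted aggregation
    return new

# toy line graph 0-1-2: after two rounds, agent 0's estimate
# incorporates information from agent 2, two hops away
rng = np.random.default_rng(0)
feats = {i: rng.normal(size=4) for i in range(3)}
nbrs = {0: [1], 1: [0, 2], 2: [1]}
W = np.eye(4)
h = attention_round(attention_round(feats, nbrs, W), nbrs, W)
```

Stacking such rounds is what lets purely local, peer-to-peer exchanges approximate a global state estimate.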
Related papers
- Learning to Interact in World Latent for Team Coordination [53.51290193631586]
This work presents a novel representation learning framework, interactive world latent (IWoL), to facilitate team coordination in multi-agent reinforcement learning (MARL). Our key insight is to construct a learnable representation space that jointly captures inter-agent relations and task-specific world information by directly modeling communication protocols. Our representation can be used not only as an implicit latent for each agent, but also as an explicit message for communication.
arXiv Detail & Related papers (2025-09-29T22:13:39Z)
- Bayesian Ego-graph inference for Networked Multi-Agent Reinforcement Learning [16.190458233440864]
We propose a graph-based policy for Networked-MARL, where each agent conditions its decision on a sampled subgraph over its local physical neighborhood. We introduce BayesG, a decentralized actor-critic framework that learns sparse, context-aware interaction structures via Bayesian variational inference. BayesG outperforms strong MARL baselines on large-scale traffic control tasks with up to 167 agents.
arXiv Detail & Related papers (2025-09-20T10:09:37Z)
- Tacit Learning with Adaptive Information Selection for Cooperative Multi-Agent Reinforcement Learning [13.918498667158119]
We introduce a novel cooperative MARL framework based on information selection and tacit learning. We integrate gating and selection mechanisms, allowing agents to adaptively filter information based on environmental changes. Experiments on popular MARL benchmarks show that our framework can be seamlessly integrated with state-of-the-art algorithms.
arXiv Detail & Related papers (2024-12-20T07:55:59Z)
- Collaborative Information Dissemination with Graph-based Multi-Agent Reinforcement Learning [2.9904113489777826]
This paper introduces a Multi-Agent Reinforcement Learning (MARL) approach for efficient information dissemination.
We propose a Partially Observable Stochastic Game (POSG) for information dissemination, empowering each agent to decide on message forwarding independently.
Our experimental results show that our trained policies outperform existing methods.
arXiv Detail & Related papers (2023-08-25T21:30:16Z)
- Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? [34.00244359590573]
Centralized Training with Decentralized Execution is a popular framework for cooperative Multi-Agent Reinforcement Learning. We introduce a novel Centralized Advising and Decentralized Pruning (CADP) framework for multi-agent reinforcement learning.
arXiv Detail & Related papers (2023-05-27T03:15:24Z)
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
- Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
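A minimal toy sketch of the general idea behind value decomposition with local rewards (not LOMAQ itself): each agent learns a local Q-function from its own local reward, and the team value is taken as the sum of local values. All names and numbers here are illustrative:

```python
def td_update(q, s, a, r_local, s_next, actions=(0, 1), alpha=0.1, gamma=0.9):
    """One Q-learning step driven only by this agent's local reward."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r_local + gamma * best_next - old)

def team_value(q_tables, joint_state, joint_action):
    """Additive decomposition: team value = sum of per-agent local values."""
    return sum(q.get((s, a), 0.0)
               for q, s, a in zip(q_tables, joint_state, joint_action))

qs = [{}, {}]
td_update(qs[0], s=0, a=1, r_local=1.0, s_next=0)   # agent 0's local reward
td_update(qs[1], s=0, a=0, r_local=0.5, s_next=0)   # agent 1's local reward
v = team_value(qs, joint_state=(0, 0), joint_action=(1, 0))
```

Because each local Q-table depends only on local quantities, this kind of decomposition sidesteps the joint state-action space, which grows exponentially with the number of agents.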
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
- Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach [6.802025156985356]
This paper proposes a framework called localized training and decentralized execution to study MARL with a network of states.
The key idea is to exploit the homogeneity of agents and regroup them according to their states, leading to the formulation of a networked Markov decision process.
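The regrouping step can be sketched in a few lines; the agent and state names below are purely illustrative, not from the paper:

```python
from collections import defaultdict

def regroup_by_state(agent_states):
    """Group homogeneous agents that currently share the same state,
    so each group can be treated as one node of a networked MDP."""
    groups = defaultdict(list)
    for agent, state in agent_states.items():
        groups[state].append(agent)
    return dict(groups)

groups = regroup_by_state({"a1": "s0", "a2": "s1", "a3": "s0"})
# → {"s0": ["a1", "a3"], "s1": ["a2"]}
```

Since agents are interchangeable within a group, the learning problem then scales with the number of distinct states rather than the number of agents.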
arXiv Detail & Related papers (2021-08-05T16:52:36Z)
- Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNN).
Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
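A hedged sketch of how an average Age of Information reward might be computed (the function and field names are assumptions, not the paper's code): each robot tracks when it last received each teammate's state, and the reward penalizes the mean staleness.

```python
def average_aoi(last_update, now):
    """Average Age of Information: mean staleness of the freshest
    copy of each teammate's state held by this robot."""
    ages = [now - t for t in last_update.values()]
    return sum(ages) / len(ages)

# hypothetical per-teammate timestamps of the last received update;
# the RL reward is the negative average age
timestamps = {"robot_b": 3.0, "robot_c": 1.0}
reward = -average_aoi(timestamps, now=5.0)  # → -3.0
```

A task-agnostic signal like this rewards keeping everyone's information fresh, which is plausibly why the paper reports more stable training than task-specific rewards.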
arXiv Detail & Related papers (2021-03-08T21:48:55Z)
- Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Diffusion-based MAML, or Dif-MAML for short.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the papers (including all information) and is not responsible for any consequences of their use.