Related papers: Reducing Variance Caused by Communication in Decentralized Multi-agent Deep Reinforcement Learning

Reducing Variance Caused by Communication in Decentralized Multi-agent Deep Reinforcement Learning

URL: http://arxiv.org/abs/2502.06261v1
Date: Mon, 10 Feb 2025 08:53:13 GMT
Title: Reducing Variance Caused by Communication in Decentralized Multi-agent Deep Reinforcement Learning
Authors: Changxi Zhu, Mehdi Dastani, Shihan Wang,
Abstract summary: We study the variance that is caused by communication in policy gradients.<n>We propose modular techniques to reduce the variance in policy gradients during training.<n>The results show that decentralized MADRL communication methods extended with our proposed techniques.
Score: 2.1461517065527445
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In decentralized multi-agent deep reinforcement learning (MADRL), communication can help agents to gain a better understanding of the environment to better coordinate their behaviors. Nevertheless, communication may involve uncertainty, which potentially introduces variance to the learning of decentralized agents. In this paper, we focus on a specific decentralized MADRL setting with communication and conduct a theoretical analysis to study the variance that is caused by communication in policy gradients. We propose modular techniques to reduce the variance in policy gradients during training. We adopt our modular techniques into two existing algorithms for decentralized MADRL with communication and evaluate them on multiple tasks in the StarCraft Multi-Agent Challenge and Traffic Junction domains. The results show that decentralized MADRL communication methods extended with our proposed techniques not only achieve high-performing agents but also reduce variance in policy gradients during training.

Related papers

Distributed and Decentralised Training: Technical Governance Challenges in a Shifting AI Landscape [1.6590638305972631]
Low-communication training algorithms are enabling a shift from centralised model training to compute setups that are either distributed across multiple clusters or decentralised via community-driven contributions.<n>This paper distinguishes these two scenarios - distributed and decentralised training - which are little understood and often conflated in policy discourse.<n>We discuss how they could impact technical AI governance through an increased risk of compute structuring, capability proliferation, and the erosion of detectability and shutdownability.
arXiv Detail & Related papers (2025-07-10T13:43:15Z)
Fully-Decentralized MADDPG with Networked Agents [0.5266869303483376]
We adapt the MADDPG algorithm by applying a networked communication approach between agents. We introduce surrogate policies in order to decentralize the training while allowing for local communication during training. The decentralized algorithms achieve comparable results to the original MADDPG in empirical tests, while reducing computational cost.
arXiv Detail & Related papers (2025-03-09T20:05:32Z)
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents. Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance [15.498559530889839]
In this paper, we put forward a decentralized learning based Alternative Multi-Agent Proximal Policy Optimization (IA-MAPPO) algorithm to execute the pursuit avoidance task in well-formed swarm. We utilize imitation learning to decentralize the formation controller, so as to reduce the communication overheads and enhance the scalability. The simulation results validate the effectiveness of IA-MAPPO and extensive ablation experiments further show the performance comparable to a centralized solution with significant decrease in communication overheads.
arXiv Detail & Related papers (2023-11-06T06:58:16Z)
Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning. In our approach, agents communicate only with the centralized planner to make decentralized decisions online. We conduct multi-step value convergence in multi-agent reinforcement learning to enhance the training efficiency.
arXiv Detail & Related papers (2023-10-25T14:21:22Z)
Collaborative Information Dissemination with Graph-based Multi-Agent Reinforcement Learning [2.9904113489777826]
This paper introduces a Multi-Agent Reinforcement Learning (MARL) approach for efficient information dissemination. We propose a Partially Observable Game (POSG) for information dissemination empowering each agent to decide on message forwarding independently. Our experimental results show that our trained policies outperform existing methods.
arXiv Detail & Related papers (2023-08-25T21:30:16Z)
Networked Communication for Decentralised Agents in Mean-Field Games [59.01527054553122]
We introduce networked communication to the mean-field game framework. We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases.
arXiv Detail & Related papers (2023-06-05T10:45:39Z)
MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
MADiff is a diffusion-based multi-agent learning framework.<n>It works as both a decentralized policy and a centralized controller.<n>Our experiments demonstrate that MADiff outperforms baseline algorithms across various multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
Decentralized Learning over Wireless Networks: The Effect of Broadcast with Random Access [56.91063444859008]
We investigate the impact of broadcast transmission and probabilistic random access policy on the convergence performance of D-SGD. Our results demonstrate that optimizing the access probability to maximize the expected number of successful links is a highly effective strategy for accelerating the system convergence.
arXiv Detail & Related papers (2023-05-12T10:32:26Z)
Network Slicing via Transfer Learning aided Distributed Deep Reinforcement Learning [7.126310378721161]
We propose a novel transfer learning (TL) aided multi-agent deep reinforcement learning (MADRL) approach with inter-agent similarity analysis for inter-cell inter-slice resource partitioning. We show that our approach outperforms the state-of-the-art solutions in terms of performance, convergence speed and sample efficiency.
arXiv Detail & Related papers (2023-01-09T10:55:13Z)
Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation [9.854975702211165]
We propose a new method based on local communication learning to tackle the multi-agent RL (MARL) challenge. First, we design a new communication protocol that exploits the ability of depthwise convolution to efficiently extract local relations. Second, we introduce the mean-field approximation into our method to reduce the scale of agent interactions.
arXiv Detail & Related papers (2022-03-06T07:42:43Z)
Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices. We make a very general assumption on the computational network that covers the settings of fully decentralized calculations. We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as MAML or Dif-MAML. We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML. Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications. We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting. Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.