An Introduction to Decentralized Training and Execution in Cooperative Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2405.06161v3
- Date: Mon, 19 Aug 2024 19:02:30 GMT
- Title: An Introduction to Decentralized Training and Execution in Cooperative Multi-Agent Reinforcement Learning
- Authors: Christopher Amato
- Abstract summary: Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years.
Decentralized training and execution methods make the fewest assumptions and are often simple to implement.
This text is an introduction to the field of decentralized, cooperative MARL.
- Score: 14.873907857806358
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. Many approaches have been developed, but they can be divided into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). Decentralized training and execution methods make the fewest assumptions and are often simple to implement. In fact, as I'll discuss, any single-agent RL method can be used for DTE by just letting each agent learn separately. Of course, there are pros and cons to such approaches. It is worth noting that DTE is required if no offline coordination is available. That is, if all agents must learn during online interactions without prior coordination, learning and execution must both be decentralized. DTE methods can be applied in cooperative, competitive, or mixed cases but this text will focus on the cooperative MARL case. This text is an introduction to the field of decentralized, cooperative MARL. As such, I will first give a brief description of the cooperative MARL problem in the form of the Dec-POMDP. Then, I will discuss value-based DTE methods starting with independent Q-learning and its extensions and then discuss the extension to the deep case with DQN, the additional complications this causes, and methods that have been developed to (attempt to) address these issues. Next, I will discuss policy gradient DTE methods starting with independent REINFORCE (i.e., vanilla policy gradient), and then extending to the actor-critic case and deep variants (such as independent PPO). Finally, I will discuss some general topics related to DTE and future directions.
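To make the abstract's central point concrete, that any single-agent RL method yields a DTE method by letting each agent learn separately, here is a minimal tabular independent Q-learning sketch. The environment interface, agent count, and hyperparameters are illustrative assumptions, not from the paper:

```python
import random
from collections import defaultdict

class IQLAgent:
    """One independent Q-learner: other agents are treated as part of the environment."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(lambda: [0.0] * n_actions)  # per-observation Q-values

    def act(self, obs):
        # Epsilon-greedy over this agent's own Q-table only.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        qs = self.q[obs]
        return qs.index(max(qs))

    def update(self, obs, action, reward, next_obs):
        # Standard single-agent Q-learning target; no access to other agents' policies.
        target = reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])

def run_episode(env, agents):
    """Fully decentralized training and execution: each agent observes, acts,
    and updates from its own experience. `env` is assumed to return per-agent
    observations and a shared team reward (the cooperative Dec-POMDP setting)."""
    obs = env.reset()
    done = False
    while not done:
        actions = [agent.act(o) for agent, o in zip(agents, obs)]
        next_obs, reward, done = env.step(actions)
        for agent, o, a, no in zip(agents, obs, actions, next_obs):
            agent.update(o, a, reward, no)  # all agents share the team reward
        obs = next_obs
```

Because each agent's teammates keep changing their policies during learning, the dynamics each learner faces are non-stationary; this is the core complication several of the papers below address.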
Related papers
- An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning [14.873907857806358]
This text is an introduction to CTDE in cooperative MARL.
It is meant to explain the setting, basic concepts, and common methods.
arXiv Detail & Related papers (2024-09-04T19:54:40Z)
- Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications.
Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging.
We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
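As a rough illustration of the communication saving, the sketch below compresses a node's model update to the sign bits of a random projection (a one-bit compressive-sensing measurement) and recovers a direction estimate by back-projection. The paper solves the recovery with iADM; back-projection is only a crude stand-in, and all dimensions are assumed:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 1000, 100                                  # model dim, one-bit measurements (assumed)
proj = rng.standard_normal((m, d)) / np.sqrt(m)   # sensing matrix shared by all nodes

def one_bit_compress(update):
    """Transmit only m sign bits instead of d floats."""
    return np.sign(proj @ update)

def one_bit_decompress(bits):
    """Crude back-projection estimate of the update's direction; the paper's
    iADM-based recovery replaces this step."""
    est = proj.T @ bits
    norm = np.linalg.norm(est)
    return est / norm if norm > 0 else est

update = rng.standard_normal(d)                   # a node's local update
received = one_bit_decompress(one_bit_compress(update))
cosine = update @ received / np.linalg.norm(update)  # direction-agreement check
```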
arXiv Detail & Related papers (2023-08-31T12:22:40Z)
- Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? [27.037348104661497]
Centralized Training with Decentralized Execution (CTDE) is a popular framework for cooperative multi-agent reinforcement learning.
We introduce a novel Centralized Advising and Decentralized Pruning (CADP) framework for multi-agent reinforcement learning.
arXiv Detail & Related papers (2023-05-27T03:15:24Z)
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion models (DMs) have recently achieved huge success in various scenarios, including offline reinforcement learning.
We propose MADiff, a novel generative multi-agent learning framework to tackle this problem.
Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
- Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning [15.935860288840466]
Decentralized cooperative multi-agent deep reinforcement learning (MARL) can be a versatile learning framework.
One of the critical challenges in decentralized deep MARL is the non-stationarity of the learning environment when multiple agents are learning concurrently.
We propose a decentralized cooperative MARL algorithm based on multi-timescale learning.
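One simple way to realize multi-timescale learning, sketched below under assumed hyperparameters, is to rotate a fast learning rate among the agents so that each learner faces nearly stationary teammates; the paper's exact schedule may differ:

```python
def multi_timescale_lrs(n_agents, step, period=1000, fast=0.1, slow=0.001):
    """Give one agent at a time a fast learning rate, rotating every `period`
    steps; the rest learn slowly, so the environment each agent faces is
    approximately stationary. Period and rates are illustrative."""
    fast_agent = (step // period) % n_agents
    return [fast if i == fast_agent else slow for i in range(n_agents)]

# Usage inside a decentralized training loop (e.g., with the IQL agents above):
# for step in range(total_steps):
#     for agent, lr in zip(agents, multi_timescale_lrs(len(agents), step)):
#         agent.alpha = lr
```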
arXiv Detail & Related papers (2023-02-06T14:10:53Z)
- ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency [65.28061634546577]
Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem.
In this paper, we propose bidirectional action-dependent Q-learning (ACE).
ACE outperforms the state-of-the-art algorithms on Google Research Football and StarCraft Multi-Agent Challenge.
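The action-dependency can be pictured as sequential greedy selection: each agent's Q-values condition on the actions its predecessors have already committed to, which removes the simultaneous-move non-stationarity from the greedy step. A minimal sketch with an assumed `q_funcs[i](obs, prev_actions)` interface:

```python
def sequential_action_selection(q_funcs, obs_list):
    """Choose actions one agent at a time, each conditioning on the actions
    already chosen by earlier agents. `q_funcs[i](obs, prev_actions)` is
    assumed to return a list of Q-values over agent i's actions."""
    chosen = []
    for q, obs in zip(q_funcs, obs_list):
        qs = q(obs, tuple(chosen))   # depends on predecessors' decisions
        chosen.append(qs.index(max(qs)))
    return chosen
```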
arXiv Detail & Related papers (2022-11-29T10:22:55Z)
- CTDS: Centralized Teacher with Decentralized Student for Multi-Agent Reinforcement Learning [114.69155066932046]
This work proposes a novel Centralized Teacher with Decentralized Student (CTDS) framework, which consists of a teacher model and a student model.
Specifically, the teacher model allocates the team reward by learning individual Q-values conditioned on global observation.
The student model utilizes the partial observations to approximate the Q-values estimated by the teacher model.
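The teacher-student split can be sketched as a distillation step: a teacher Q-network conditioned on the global observation supervises a student Q-network that only sees the agent's partial observation. Network sizes and dimensions below are assumptions:

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small MLP Q-network; hidden size is illustrative."""
    def __init__(self, in_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_actions))
    def forward(self, x):
        return self.net(x)

teacher = QNet(in_dim=32, n_actions=5)   # conditioned on global observation (dim assumed)
student = QNet(in_dim=8, n_actions=5)    # conditioned on partial observation (dim assumed)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def distill_step(global_obs, partial_obs):
    """Regress the student's Q-values onto the teacher's, so decentralized
    execution can use the student alone."""
    with torch.no_grad():
        target_q = teacher(global_obs)
    loss = nn.functional.mse_loss(student(partial_obs), target_q)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```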
arXiv Detail & Related papers (2022-03-16T06:03:14Z)
- Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach [6.802025156985356]
This paper proposes a framework called localized training and decentralized execution to study MARL with a network of states.
The key idea is to exploit the homogeneity of agents and regroup them according to their states, yielding the formulation of a networked Markov decision process.
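The regrouping idea can be sketched as indexing learning by state rather than by agent identity, with a localized signal taken over neighboring states; the adjacency structure and signal below are assumptions, not the paper's exact formulation:

```python
from collections import defaultdict

def regroup_by_state(agent_states):
    """Exploit agent homogeneity: agents in the same state are interchangeable,
    so learning can be indexed by state rather than by agent identity."""
    groups = defaultdict(list)
    for agent_id, state in agent_states.items():
        groups[state].append(agent_id)
    return dict(groups)

def localized_mean_field(state, actions_by_state, neighbors):
    """Empirical action distribution over states adjacent to `state` in the
    network, standing in for the paper's localized training signal.
    `neighbors` maps a state to its adjacent states (assumed interface)."""
    pool = [a for s in neighbors[state] for a in actions_by_state.get(s, [])]
    counts = defaultdict(int)
    for a in pool:
        counts[a] += 1
    total = len(pool)
    return {a: c / total for a, c in counts.items()} if total else {}
```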
arXiv Detail & Related papers (2021-08-05T16:52:36Z)
- Distributed Heuristic Multi-Agent Path Finding with Communication [7.854890646114447]
Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized policies in partially observable environments.
This paper combines communication with deep Q-learning to provide a novel learning-based method for MAPF.
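A hedged sketch of the communication side: each agent's Q-network input is its own observation concatenated with messages heard from teammates within range. Message contents, the radius, and the grid metric are illustrative, not the paper's protocol:

```python
def message_augmented_input(own_obs, my_pos, broadcasts, comm_radius=5):
    """Combine an agent's observation with teammates' broadcast messages for
    a Q-network input. `broadcasts` is an assumed list of (position, message)
    pairs; a real implementation would pad or pool to a fixed input size."""
    def manhattan(p, q):
        return abs(p[0] - q[0]) + abs(p[1] - q[1])
    heard = [msg for pos, msg in broadcasts
             if pos != my_pos and manhattan(my_pos, pos) <= comm_radius]
    flat = [x for msg in heard for x in msg]   # flatten received messages
    return list(own_obs) + flat
```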
arXiv Detail & Related papers (2021-06-21T18:50:58Z)
- Periodic Stochastic Gradient Descent with Momentum for Decentralized Training [114.36410688552579]
We propose a novel periodic decentralized momentum SGD method, which employs the momentum schema and periodic communication for decentralized training.
We conduct extensive experiments to verify the performance of our two proposed methods, both of which show superior performance over existing methods.
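The mechanism is easy to sketch: nodes run local momentum SGD and only synchronize parameters every few steps, trading synchronization frequency for communication. The averaging topology (full averaging here) and the period are assumptions:

```python
import numpy as np

def local_momentum_step(w, v, grad, lr=0.01, beta=0.9):
    """One momentum-SGD step on a node's local parameters."""
    v = beta * v + grad
    return w - lr * v, v

def periodic_average(params, step, period=10):
    """Every `period` local steps the nodes average their parameters; the
    paper communicates over a decentralized topology rather than averaging
    with everyone, so full averaging here is a simplification."""
    if step % period != 0:
        return params
    mean = np.mean(params, axis=0)
    return [mean.copy() for _ in params]
```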
arXiv Detail & Related papers (2020-08-24T13:38:22Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
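For flavor, a minimal independent (fully decentralized) one-step actor-critic update, where each agent learns from its own observations only; F2A2's approximation and information-reduction machinery is beyond this sketch, and all sizes are assumed:

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """One agent's actor and critic over its own observation only."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.actor = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, n_actions))
        self.critic = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

def ac_update(agent, opt, obs, action, reward, next_obs, gamma=0.99):
    """Fully local one-step actor-critic update: nothing is exchanged
    between agents during learning."""
    v = agent.critic(obs)
    v_next = agent.critic(next_obs).detach()
    td_error = reward + gamma * v_next - v
    logp = torch.log_softmax(agent.actor(obs), dim=-1)[action]
    loss = (-logp * td_error.detach() + td_error.pow(2)).sum()
    opt.zero_grad(); loss.backward(); opt.step()

agent = ActorCritic(obs_dim=8, n_actions=4)          # dimensions assumed
opt = torch.optim.Adam(agent.parameters(), lr=3e-4)
```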
arXiv Detail & Related papers (2020-04-17T14:56:29Z)