Related papers: A First Introduction to Cooperative Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2405.06161v4
Date: Thu, 19 Dec 2024 19:51:49 GMT
Title: A First Introduction to Cooperative Multi-Agent Reinforcement Learning
Authors: Christopher Amato
Abstract summary: Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. MARL approaches can be broadly categorized into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). This text is an introduction to cooperative MARL -- MARL in which all agents share a single, joint reward.
Score: 14.873907857806358
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). CTE methods assume centralization during training and execution (e.g., with fast, free, and perfect communication) and have the most information during execution. CTDE methods are the most common, as they leverage centralized information during training while enabling decentralized execution -- using only information available to that agent during execution. Decentralized training and execution methods make the fewest assumptions and are often simple to implement. This text is an introduction to cooperative MARL -- MARL in which all agents share a single, joint reward. It is meant to explain the setting, basic concepts, and common methods for the CTE, CTDE, and DTE settings. It does not cover all work in cooperative MARL as the area is quite extensive. I have included work that I believe is important for understanding the main concepts in the area and apologize to those that I have omitted. Topics include simple applications of single-agent methods to CTE as well as some more scalable methods that exploit the multi-agent structure, independent Q-learning and policy gradient methods and their extensions, as well as value function factorization methods including the well-known VDN, QMIX, and QPLEX approaches, and centralized critic methods including MADDPG, COMA, and MAPPO. I also discuss common misconceptions, the relationship between different approaches, and some open questions.

Related papers

Multi-Agent Guided Policy Optimization [36.853129816484845]
Centralized Training with Decentralized Execution (CTDE) has become the dominant paradigm in cooperative Multi-Agent Reinforcement Learning (MARL). We propose Multi-Agent Guided Policy Optimization (MAGPO), a novel framework that better leverages centralized training by integrating centralized guidance with decentralized execution.
arXiv Detail & Related papers (2025-07-24T03:22:21Z)
An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning [14.873907857806358]
This text is an introduction to CTDE in cooperative MARL. It is meant to explain the setting, basic concepts, and common methods.
arXiv Detail & Related papers (2024-09-04T19:54:40Z)
Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications. Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging. We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
arXiv Detail & Related papers (2023-08-31T12:22:40Z)
Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? [27.037348104661497]
Centralized Training with Decentralized Execution is a popular framework for cooperative Multi-Agent Reinforcement Learning. We introduce a novel Centralized Advising and Decentralized Pruning (CADP) framework for multi-agent reinforcement learning.
arXiv Detail & Related papers (2023-05-27T03:15:24Z)
MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion model (DM) recently achieved huge success in various scenarios including offline reinforcement learning. We propose MADiff, a novel generative multi-agent learning framework to tackle this problem. Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning [15.935860288840466]
Decentralized cooperative multi-agent deep reinforcement learning (MARL) can be a versatile learning framework. One of the critical challenges in decentralized deep MARL is the non-stationarity of the learning environment when multiple agents are learning concurrently. We propose a decentralized cooperative MARL algorithm based on multi-timescale learning.
arXiv Detail & Related papers (2023-02-06T14:10:53Z)
ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency [65.28061634546577]
Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem. In this paper, we propose bidirectional action-dependent Q-learning (ACE). ACE outperforms the state-of-the-art algorithms on Google Research Football and StarCraft Multi-Agent Challenge.
arXiv Detail & Related papers (2022-11-29T10:22:55Z)
RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios. RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents. Our method outperforms baseline methods on the StarCraftII micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
CTDS: Centralized Teacher with Decentralized Student for Multi-Agent Reinforcement Learning [114.69155066932046]
This work proposes a novel Centralized Teacher with Decentralized Student (CTDS) framework, which consists of a teacher model and a student model. Specifically, the teacher model allocates the team reward by learning individual Q-values conditioned on global observation. The student model utilizes the partial observations to approximate the Q-values estimated by the teacher model.
arXiv Detail & Related papers (2022-03-16T06:03:14Z)
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach [6.802025156985356]
This paper proposes a framework called localized training and decentralized execution to study MARL with a network of states. The key idea is to utilize the homogeneity of agents and regroup them according to their states, which leads to the formulation of a networked Markov decision process.
arXiv Detail & Related papers (2021-08-05T16:52:36Z)
Distributed Heuristic Multi-Agent Path Finding with Communication [7.854890646114447]
Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems. Recent methods have applied reinforcement learning (RL) to learn decentralized policies in partially observable environments. This paper combines communication with deep Q-learning to provide a novel learning-based method for MAPF.
arXiv Detail & Related papers (2021-06-21T18:50:58Z)
Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Diffusion-based MAML or Dif-MAML. We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML. Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
Periodic Stochastic Gradient Descent with Momentum for Decentralized Training [114.36410688552579]
We propose a novel periodic decentralized momentum SGD method, which employs the momentum schema and periodic communication for decentralized training. We conduct extensive experiments to verify the performance of our proposed two methods, and both of them have shown superior performance over existing methods.
arXiv Detail & Related papers (2020-08-24T13:38:22Z)
F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications. We propose a flexible, fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings. Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.