Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in
Cooperative Tasks
- URL: http://arxiv.org/abs/2006.07869v4
- Date: Tue, 9 Nov 2021 10:42:04 GMT
- Title: Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in
Cooperative Tasks
- Authors: Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V.
Albrecht
- Abstract summary: Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria.
We provide a systematic evaluation and comparison of three different classes of MARL algorithms.
Our experiments serve as a reference for the expected performance of algorithms across different learning tasks.
- Score: 11.480994804659908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent deep reinforcement learning (MARL) suffers from a lack of
commonly-used evaluation tasks and criteria, making comparisons between
approaches difficult. In this work, we provide a systematic evaluation and
comparison of three different classes of MARL algorithms (independent learning,
centralised multi-agent policy gradient, value decomposition) in a diverse
range of cooperative multi-agent learning tasks. Our experiments serve as a
reference for the expected performance of algorithms across different learning
tasks, and we provide insights regarding the effectiveness of different
learning approaches. We open-source EPyMARL, which extends the PyMARL codebase
to include additional algorithms and allow for flexible configuration of
algorithm implementation details such as parameter sharing. Finally, we
open-source two environments for multi-agent research which focus on
coordination under sparse rewards.
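As a rough illustration of the value decomposition class named in the abstract, the sketch below shows a VDN-style additive factorisation, one well-known instance of that class: the joint action-value is taken to be the sum of per-agent utilities, so each agent's own greedy action also maximises the joint value. The function names and the toy utilities are hypothetical, chosen only for illustration, and are not taken from EPyMARL.

```python
from itertools import product


def joint_q(per_agent_q, joint_action):
    """Additive (VDN-style) joint value: sum of each agent's utility for its own action."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def decentralised_greedy(per_agent_q):
    """Each agent independently picks the argmax of its own utilities."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def centralised_greedy(per_agent_q):
    """Brute-force argmax over all joint actions, for comparison only."""
    action_ranges = [range(len(q)) for q in per_agent_q]
    return max(product(*action_ranges), key=lambda ja: joint_q(per_agent_q, ja))


# Toy utilities for two agents with three actions each (made-up numbers).
toy_q = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.9]]
greedy = decentralised_greedy(toy_q)

# Under an additive decomposition the decentralised and centralised choices agree.
assert greedy == centralised_greedy(toy_q)
```

The brute-force search grows exponentially with the number of agents, whereas the decentralised argmax is linear in it; this is the practical appeal of decompositions of this kind.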
Related papers
- POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation [76.67608003501479]
We introduce and specify an evaluation protocol defining a range of domain-related metrics computed on the basis of the primary evaluation indicators.
The results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods, are presented.
arXiv Detail & Related papers (2024-07-20T16:37:21Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- A Variational Approach to Mutual Information-Based Coordination for
Multi-Agent Reinforcement Learning [17.893310647034188]
We propose a new mutual information framework for multi-agent reinforcement learning.
Applying policy iteration to maximize the derived lower bound, we propose a practical algorithm named variational maximum mutual information multi-agent actor-critic.
arXiv Detail & Related papers (2023-03-01T12:21:30Z)
- Revisiting Some Common Practices in Cooperative Multi-Agent
Reinforcement Learning [11.91425153754564]
We show that in environments with a highly multi-modal reward landscape, value decomposition and parameter sharing can be problematic and lead to undesired outcomes.
In contrast, policy gradient (PG) methods with individual policies provably converge to an optimal solution in these cases.
We present practical suggestions on implementing multi-agent PG algorithms for either high rewards or diverse emergent behaviors.
arXiv Detail & Related papers (2022-06-15T13:03:05Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent
RL [107.58821842920393]
We quantify the agent's behavior difference and build its relationship with the policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity.
The decomposed factors can significantly impact policy optimization on three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- The Multi-Agent Pickup and Delivery Problem: MAPF, MARL and Its
Warehouse Applications [2.969705152497174]
We study two state-of-the-art solutions to the multi-agent pickup and delivery problem based on different principles.
Specifically, a recent MAPF algorithm called conflict-based search (CBS) and a current MARL algorithm called shared experience actor-critic (SEAC) are studied.
arXiv Detail & Related papers (2022-03-14T13:23:35Z)
- Meta Navigator: Search for a Good Adaptation Policy for Few-shot
Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data.
Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios.
We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
- Survey of Recent Multi-Agent Reinforcement Learning Algorithms Utilizing
Centralized Training [0.7588690078299698]
We discuss variations of centralized training and describe a recent survey of algorithmic approaches.
The goal is to explore how different implementations of information-sharing mechanisms in centralized learning may give rise to distinct coordinated group behaviors.
arXiv Detail & Related papers (2021-07-29T20:29:12Z)
- Softmax with Regularization: Better Value Estimation in Multi-Agent
Reinforcement Learning [72.28520951105207]
Overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning.
We propose a novel regularization-based update scheme that penalizes large joint action-values deviating from a baseline.
We show that our method provides a consistent performance improvement on a set of challenging StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-03-22T14:18:39Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement
Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.