Related papers: Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks

Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks

URL: http://arxiv.org/abs/2006.07869v4
Date: Tue, 9 Nov 2021 10:42:04 GMT
Title: Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks
Authors: Georgios Papoudakis, Filippos Christianos, Lukas Sch\"afer, Stefano V. Albrecht
Abstract summary: Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria. We provide a systematic evaluation and comparison of three different classes of MARL algorithms. Our experiments serve as a reference for the expected performance of algorithms across different learning tasks.
Score: 11.480994804659908
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we provide a systematic evaluation and comparison of three different classes of MARL algorithms (independent learning, centralised multi-agent policy gradient, value decomposition) in a diverse range of cooperative multi-agent learning tasks. Our experiments serve as a reference for the expected performance of algorithms across different learning tasks, and we provide insights regarding the effectiveness of different learning approaches. We open-source EPyMARL, which extends the PyMARL codebase to include additional algorithms and allow for flexible configuration of algorithm implementation details such as parameter sharing. Finally, we open-source two environments for multi-agent research which focus on coordination under sparse rewards.

Related papers

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align Large Language models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks [0.0]
Multi-Agent Reinforcement Learning (MARL) has recently emerged as a significant area of research. MARL evaluation often lacks systematic diversity, hindering a comprehensive understanding of algorithms' capabilities.
arXiv Detail & Related papers (2025-02-07T09:17:02Z)
O-MAPL: Offline Multi-agent Preference Learning [5.4482836906033585]
Inferring reward functions from demonstrations is a key challenge in reinforcement learning (RL) We introduce a novel end-to-end preference-based learning framework for cooperative MARL. Our algorithm outperforms existing methods across various tasks.
arXiv Detail & Related papers (2025-01-31T08:08:20Z)
Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling [2.3034630097498883]
The study introduces the Reinforcement Learning environment and conducts empirical analyses. The experiments employ various deep neural network policies for single- and Multi-Agent approaches. While Single-Agent algorithms perform adequately in reduced scenarios, Multi-Agent approaches reveal challenges in cooperative learning but a scalable capacity.
arXiv Detail & Related papers (2024-11-12T08:27:27Z)
POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation [76.67608003501479]
We introduce and specify an evaluation protocol defining a range of domain-related metrics computed on the basics of the primary evaluation indicators. The results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods, are presented.
arXiv Detail & Related papers (2024-07-20T16:37:21Z)
Variational Offline Multi-agent Skill Discovery [47.924414207796005]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills. Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining. Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing hierarchical multi-agent reinforcement learning (MARL) methods.
arXiv Detail & Related papers (2024-05-26T00:24:46Z)
Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks. The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning [17.893310647034188]
We propose a new mutual information framework for multi-agent reinforcement learning. Applying policy to maximize the derived lower bound, we propose a practical algorithm named variational maximum mutual information multi-agent actor-critic.
arXiv Detail & Related papers (2023-03-01T12:21:30Z)
Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL [107.58821842920393]
We quantify the agent's behavior difference and build its relationship with the policy performance via bf Role Diversity We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity. The decomposed factors can significantly impact policy optimization on three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
The Multi-Agent Pickup and Delivery Problem: MAPF, MARL and Its Warehouse Applications [2.969705152497174]
We study two state-of-the-art solutions to the multi-agent pickup and delivery problem based on different principles. Specifically, a recent MAPF algorithm called conflict-based search (CBS) and a current MARL algorithm called shared experience actor-critic (SEAC) are studied.
arXiv Detail & Related papers (2022-03-14T13:23:35Z)
Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data. Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios. We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
Survey of Recent Multi-Agent Reinforcement Learning Algorithms Utilizing Centralized Training [0.7588690078299698]
We discuss variations of centralized training and describe a recent survey of algorithmic approaches. The goal is to explore how different implementations of information sharing mechanism in centralized learning may give rise to distinct group coordinated behaviors.
arXiv Detail & Related papers (2021-07-29T20:29:12Z)
Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning [72.28520951105207]
Overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning. We propose a novel regularization-based update scheme that penalizes large joint action-values deviating from a baseline. We show that our method provides a consistent performance improvement on a set of challenging StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-03-22T14:18:39Z)
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn) UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features. Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.