How much can change in a year? Revisiting Evaluation in Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2312.08463v2
- Date: Fri, 26 Jan 2024 12:46:42 GMT
- Title: How much can change in a year? Revisiting Evaluation in Multi-Agent
Reinforcement Learning
- Authors: Siddarth Singh, Omayma Mahjoub, Ruan de Kock, Wiem Khlifi, Abidine
Vall, Kale-ab Tessera and Arnu Pretorius
- Abstract summary: We extend a previously published database of evaluation methodology containing meta-data on MARL publications from top-rated conferences.
We compare the findings extracted from this updated database to the trends identified in the original study.
We do observe a trend towards more difficult scenarios in SMAC-v1 which, if continued into SMAC-v2, will encourage novel algorithmic development.
- Score: 4.653136482223517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Establishing sound experimental standards and rigour is important in any
growing field of research. Deep Multi-Agent Reinforcement Learning (MARL) is
one such nascent field. Although exciting progress has been made, MARL has
recently come under scrutiny for replicability issues and a lack of
standardised evaluation methodology, specifically in the cooperative setting.
Although protocols have been proposed to help alleviate the issue, it remains
important to actively monitor the health of the field. In this work, we extend
a previously published database of evaluation methodology containing meta-data
on MARL publications from top-rated conferences and compare the findings
extracted from this updated database to the trends identified in that earlier
work. Our analysis shows that many of the worrying trends in performance
reporting remain. These include the omission of uncertainty quantification,
incomplete reporting of relevant evaluation details, and a narrowing of
algorithmic development classes. Promisingly, we do observe a trend towards
more difficult scenarios in SMAC-v1 which, if continued into SMAC-v2, will
encourage novel
algorithmic development. Our data indicate that replicability needs to be
approached more proactively by the MARL community to ensure trust in the field
as we move towards exciting new frontiers.
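Of the trends above, omitted uncertainty quantification is the cheapest to fix. The snippet below is a minimal, hypothetical sketch rather than the paper's protocol: it reports the mean final return across independent training seeds together with a 95% bootstrap confidence interval, and names such as returns_per_seed are illustrative.

```python
import numpy as np

def bootstrap_ci(per_seed_scores, n_boot=10_000, alpha=0.05, seed=0):
    """Mean score across seeds with a (1 - alpha) bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(per_seed_scores, dtype=float)
    # Resample the seed-level scores with replacement and recompute the mean.
    boot_means = np.array([
        rng.choice(scores, size=scores.size, replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return scores.mean(), (lo, hi)

# Illustrative final evaluation returns from 10 independent training seeds.
returns_per_seed = [18.2, 20.1, 17.5, 19.8, 21.0, 16.9, 19.3, 18.7, 20.5, 17.8]
mean, (lo, hi) = bootstrap_ci(returns_per_seed)
print(f"mean return {mean:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Reporting the interval and the number of seeds alongside the point estimate addresses two of the omissions the analysis highlights.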
Related papers
- Machine Learning for Missing Value Imputation [0.0]
The main objective of this article is to conduct a comprehensive and rigorous review, as well as analysis, of the state-of-the-art machine learning applications in Missing Value Imputation.
More than 100 articles published between 2014 and 2023 are critically reviewed, considering the methods and findings.
The latest literature is examined to scrutinize the trends in MVI methods and their evaluation.
arXiv Detail & Related papers (2024-10-10T18:56:49Z) - Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation [3.5490824406092405]
offline multi-agent reinforcement learning (MARL) is an emerging field with great promise for real-world applications.
The current state of research in offline MARL is plagued by inconsistencies in baselines and evaluation protocols.
arXiv Detail & Related papers (2024-06-13T12:54:29Z) - Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL [57.202733701029594]
Decision Mamba is a novel multi-grained state space model with a self-evolving policy learning strategy.
To mitigate overfitting on noisy trajectories, a self-evolving policy is learned using progressive regularization.
The policy evolves by using its own past knowledge to refine the suboptimal actions, thus enhancing its robustness on noisy demonstrations.
arXiv Detail & Related papers (2024-06-08T10:12:00Z) - RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation [73.2390735383842]
We introduce the first sample-efficient algorithm for LMDPs without any additional structural assumptions.
We show how off-policy evaluation guarantees can be used to derive near-optimal guarantees for an optimistic exploration algorithm.
These results can be valuable for a wide range of interactive learning problems beyond LMDPs, and especially for partially observed environments.
arXiv Detail & Related papers (2024-06-03T14:51:27Z) - Robust Multi-Agent Reinforcement Learning via Adversarial
Regularization: Theoretical Foundation and Stable Algorithms [79.61176746380718]
Multi-Agent Reinforcement Learning (MARL) has shown promising results across several domains.
MARL policies often lack robustness and are sensitive to small changes in their environment.
We show that we can gain robustness by controlling a policy's Lipschitz constant.
We propose a new robust MARL framework, ERNIE, that promotes the Lipschitz continuity of the policies.
arXiv Detail & Related papers (2023-10-16T20:14:06Z) - MA2CL:Masked Attentive Contrastive Learning for Multi-Agent
Reinforcement Learning [128.19212716007794]
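The adversarial-regularisation idea lends itself to a short sketch. The snippet below is a hypothetical illustration, not the ERNIE implementation: it penalises how much a policy network's output can change under a small, adversarially chosen perturbation of its observation input, which encourages Lipschitz continuity; the network shape, the perturbation budget eps, and the inner-loop settings are all assumptions.

```python
import torch
import torch.nn as nn

# Toy policy network; shapes are placeholders for this illustration.
policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 4))

def adversarial_regulariser(obs, eps=0.05, steps=3, step_size=0.01):
    """Inner maximisation: find a small perturbation that most changes the policy."""
    delta = torch.empty_like(obs).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        gap = (policy(obs + delta) - policy(obs)).pow(2).mean()
        grad, = torch.autograd.grad(gap, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()  # ascend on the output gap
            delta.clamp_(-eps, eps)           # keep the perturbation small
    return (policy(obs + delta) - policy(obs)).pow(2).mean()

obs = torch.randn(32, 8)            # a batch of dummy observations
reg = adversarial_regulariser(obs)  # added to the usual policy loss in training
print(float(reg))
```

- MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]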
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL).
MA2CL encourages the learned representation to be both temporally and agent-wise predictive by reconstructing the masked agent observation in latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
arXiv Detail & Related papers (2023-06-03T05:32:19Z) - Towards a Standardised Performance Evaluation Protocol for Cooperative
MARL [2.2977300225306583]
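As a rough, hypothetical sketch of the masked-reconstruction idea (not the MA2CL implementation), the snippet below masks one agent's latent embedding, predicts it from the remaining agents via attention, and scores the prediction with a contrastive loss; the architecture, the fixed mask index, and the temperature are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_agents, obs_dim, d = 4, 10, 32
encoder = nn.Linear(obs_dim, d)  # per-agent observation encoder
attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)

def masked_reconstruction_loss(obs, mask_idx, temperature=0.1):
    """Predict a masked agent's latent from the other agents via attention."""
    z = encoder(obs)                                 # (batch, n_agents, d)
    target = z[:, mask_idx].detach()                 # latent to reconstruct
    context = torch.cat([z[:, :mask_idx], z[:, mask_idx + 1:]], dim=1)
    query = torch.zeros_like(z[:, mask_idx:mask_idx + 1])  # learnable in practice
    pred, _ = attn(query, context, context)          # (batch, 1, d)
    pred = pred.squeeze(1)
    # InfoNCE-style loss: each prediction should match its own target in the batch.
    logits = F.normalize(pred, dim=-1) @ F.normalize(target, dim=-1).T
    labels = torch.arange(obs.size(0))
    return F.cross_entropy(logits / temperature, labels)

obs = torch.randn(16, n_agents, obs_dim)
print(float(masked_reconstruction_loss(obs, mask_idx=2)))
```

- Towards a Standardised Performance Evaluation Protocol for Cooperative MARL [2.2977300225306583]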
Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving decentralised decision-making problems at scale.
We take a closer look at this rapid development with a focus on evaluation methodologies employed across a large body of research in cooperative MARL.
We propose a standardised performance evaluation protocol for cooperative MARL.
arXiv Detail & Related papers (2022-09-21T16:40:03Z) - PAC: Assisted Value Factorisation with Counterfactual Predictions in
Multi-Agent Reinforcement Learning [43.862956745961654]
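One lightweight way to act on such a protocol is to log every evaluation detail next to the results. The snippet below is an illustrative sketch, not the protocol's actual schema: the field names and values are assumptions about the kind of details (seeds, evaluation cadence, metric) a paper should report.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class EvalConfig:
    """Illustrative record of the evaluation details a MARL paper should report."""
    env_name: str             # benchmark and scenario, e.g. an SMAC map
    num_seeds: int            # independent training runs
    num_eval_episodes: int    # episodes per evaluation point
    eval_interval_steps: int  # environment steps between evaluations
    total_env_steps: int      # training budget
    metric: str               # e.g. "episode_return" or "win_rate"

config = EvalConfig(
    env_name="SMAC:3s5z",
    num_seeds=10,
    num_eval_episodes=32,
    eval_interval_steps=10_000,
    total_env_steps=2_000_000,
    metric="win_rate",
)
print(asdict(config))  # log alongside results so runs can be reproduced
```

- PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning [43.862956745961654]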
Multi-agent reinforcement learning (MARL) has witnessed significant progress with the development of value function factorization methods.
In this paper, we show that in partially observable MARL problems, an agent's ordering over its own actions could impose concurrent constraints on the representable function class.
We propose PAC, a new framework leveraging information generated from Counterfactual Predictions of optimal joint action selection.
arXiv Detail & Related papers (2022-06-22T23:34:30Z) - Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting
Pot [71.28884625011987]
Melting Pot is a MARL evaluation suite that uses reinforcement learning to reduce the human labor required to create novel test scenarios.
We have created over 80 unique test scenarios covering a broad range of research topics.
We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.
arXiv Detail & Related papers (2021-07-14T17:22:14Z) - MS MARCO: Benchmarking Ranking Models in the Large-Data Regime [57.37239054770001]
This paper uses the MS MARCO and TREC Deep Learning Track as a case study.
We show how the design of the evaluation effort can encourage or discourage certain outcomes.
We provide some analysis of certain pitfalls, and a statement of best practices for avoiding such pitfalls.
arXiv Detail & Related papers (2021-05-09T20:57:36Z) - Information State Embedding in Partially Observable Cooperative
Multi-Agent Reinforcement Learning [19.617644643147948]
We introduce the concept of an information state embedding that serves to compress agents' histories.
We quantify how the compression error influences the resulting value functions for decentralized control.
The proposed embed-then-learn pipeline opens the black-box of existing (partially observable) MARL algorithms.
arXiv Detail & Related papers (2020-04-02T16:03:42Z)