Towards a Standardised Performance Evaluation Protocol for Cooperative MARL
- URL: http://arxiv.org/abs/2209.10485v1
- Date: Wed, 21 Sep 2022 16:40:03 GMT
- Title: Towards a Standardised Performance Evaluation Protocol for Cooperative MARL
- Authors: Rihab Gorsane, Omayma Mahjoub, Ruan de Kock, Roland Dubb, Siddarth Singh, Arnu Pretorius
- Abstract summary: Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving decentralised decision-making problems at scale.
We take a closer look at this rapid development with a focus on evaluation methodologies employed across a large body of research in cooperative MARL.
We propose a standardised performance evaluation protocol for cooperative MARL.
- Score: 2.2977300225306583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent reinforcement learning (MARL) has emerged as a useful approach to
solving decentralised decision-making problems at scale. Research in the field
has been growing steadily with many breakthrough algorithms proposed in recent
years. In this work, we take a closer look at this rapid development with a
focus on evaluation methodologies employed across a large body of research in
cooperative MARL. By conducting a detailed meta-analysis of prior work,
spanning 75 papers accepted for publication from 2016 to 2022, we bring to
light worrying trends that put into question the true rate of progress. We
further consider these trends in a wider context and take inspiration from
single-agent RL literature on similar issues with recommendations that remain
applicable to MARL. Combining these recommendations, with novel insights from
our analysis, we propose a standardised performance evaluation protocol for
cooperative MARL. We argue that such a standard protocol, if widely adopted,
would greatly improve the validity and credibility of future research, make
replication and reproducibility easier, as well as improve the ability of the
field to accurately gauge the rate of progress over time by being able to make
sound comparisons across different works. Finally, we release our meta-analysis
data publicly on our project website for future research on evaluation:
https://sites.google.com/view/marl-standard-protocol
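The paper's full protocol is specified on the project website; as an illustrative sketch only, one recommendation commonly carried over from the single-agent RL evaluation literature it cites is to report aggregate performance over several independent training seeds together with a bootstrap confidence interval, rather than a single best run. The scores below are hypothetical, not taken from the paper:

```python
import random
import statistics

def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Mean of per-seed evaluation scores with a 95% percentile-bootstrap
    confidence interval (resampling the seeds with replacement)."""
    rng = random.Random(seed)
    resample_means = sorted(
        statistics.fmean(rng.choices(scores, k=len(scores)))
        for _ in range(n_resamples)
    )
    lo = resample_means[int((alpha / 2) * n_resamples)]
    hi = resample_means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(scores), (lo, hi)

# Hypothetical final episode returns from 10 independent training seeds
# of one MARL algorithm on one cooperative task.
scores = [17.2, 18.9, 16.4, 19.1, 17.8, 15.9, 18.2, 17.5, 16.8, 18.6]
mean, (lo, hi) = bootstrap_ci(scores)
print(f"mean return {mean:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Reporting the interval alongside the mean is what makes comparisons across papers sound: two algorithms whose intervals overlap heavily cannot honestly be ranked on these runs alone.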
Related papers
- Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation [3.5490824406092405]
Offline multi-agent reinforcement learning (MARL) is an emerging field with great promise for real-world applications.
The current state of research in offline MARL is plagued by inconsistencies in baselines and evaluation protocols.
arXiv Detail & Related papers (2024-06-13T12:54:29Z)
- Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions [62.0123588983514]
Large Language Models (LLMs) have demonstrated wide-ranging applications across various fields.
We reformulate the peer-review process as a multi-turn, long-context dialogue, incorporating distinct roles for authors, reviewers, and decision makers.
We construct a comprehensive dataset containing over 26,841 papers with 92,017 reviews collected from multiple sources.
arXiv Detail & Related papers (2024-06-09T08:24:17Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- Expanding Horizons in HCI Research Through LLM-Driven Qualitative Analysis [3.5253513747455303]
We introduce a new approach to qualitative analysis in HCI using Large Language Models (LLMs).
Our findings indicate that LLMs not only match the efficacy of traditional analysis methods but also offer unique insights.
arXiv Detail & Related papers (2024-01-07T12:39:31Z)
- How much can change in a year? Revisiting Evaluation in Multi-Agent Reinforcement Learning [4.653136482223517]
We extend the previously published database of evaluation methodology, which contains meta-data on MARL publications from top-rated conferences.
We compare the findings extracted from this updated database to the trends identified in their work.
We do observe a trend towards more difficult scenarios in SMAC-v1 which, if continued into SMAC-v2, will encourage novel algorithmic development.
arXiv Detail & Related papers (2023-12-13T19:06:34Z)
- Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similar performance improvements on code generation.
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses across a broad range of real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects [23.347535672670688]
Multi-Agent Reinforcement Learning (MARL) tackles sequential decision-making problems involving multiple participants.
MARL requires a tremendous number of samples for effective training.
Model-based methods have been shown to achieve provable advantages in sample efficiency.
arXiv Detail & Related papers (2022-03-20T17:24:47Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- MS MARCO: Benchmarking Ranking Models in the Large-Data Regime [57.37239054770001]
This paper uses the MS MARCO and TREC Deep Learning Track as our case study.
We show how the design of the evaluation effort can encourage or discourage certain outcomes.
We provide some analysis of certain pitfalls, and a statement of best practices for avoiding such pitfalls.
arXiv Detail & Related papers (2021-05-09T20:57:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.