Towards robust and domain agnostic reinforcement learning competitions
- URL: http://arxiv.org/abs/2106.03748v1
- Date: Mon, 7 Jun 2021 16:15:46 GMT
- Title: Towards robust and domain agnostic reinforcement learning competitions
- Authors: William Hebgen Guss, Stephanie Milani, Nicholay Topin, Brandon
Houghton, Sharada Mohanty, Andrew Melnik, Augustin Harter, Benoit Buschmaas,
Bjarne Jaster, Christoph Berganski, Dennis Heitkamp, Marko Henning, Helge
Ritter, Chengjie Wu, Xiaotian Hao, Yiming Lu, Hangyu Mao, Yihuan Mao, Chao
Wang, Michal Opanowicz, Anssi Kanervisto, Yanick Schraner, Christian
Scheller, Xiren Zhou, Lu Liu, Daichi Nishio, Toi Tsuneda, Karolis
Ramanauskas, Gabija Juceviciute
- Abstract summary: Reinforcement learning competitions have formed the basis for standard research benchmarks.
Despite this, a majority of challenges suffer from the same fundamental problems.
We present a new framework of competition design that promotes the development of algorithms that overcome these barriers.
- Score: 12.731614722371376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning competitions have formed the basis for standard
research benchmarks, galvanized advances in the state-of-the-art, and shaped
the direction of the field. Despite this, a majority of challenges suffer from
the same fundamental problems: participant solutions to the posed challenge are
usually domain-specific, biased to maximally exploit compute resources, and not
guaranteed to be reproducible. In this paper, we present a new framework of
competition design that promotes the development of algorithms that overcome
these barriers. We propose four central mechanisms for achieving this end:
submission retraining, domain randomization, desemantization through domain
obfuscation, and the limitation of competition compute and environment-sample
budget. To demonstrate the efficacy of this design, we proposed, organized, and
ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In
this work, we describe the organizational outcomes of the competition and show
that the resulting participant submissions are reproducible, non-specific to
the competition environment, and sample/resource efficient, despite the
difficult competition task.
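The "desemantization through domain obfuscation" mechanism above can be illustrated with a toy sketch. The names (`ObfuscationWrapper`, `ToyEnv`) are hypothetical, and the real MineRL 2020 obfuscation embeds actions and observations into a learned latent vector space rather than merely permuting indices; this is only a minimal illustration of hiding semantic labels so submissions cannot hard-code domain knowledge:

```python
import random

class ObfuscationWrapper:
    """Hypothetical sketch of 'desemantization': expose only anonymous
    action indices, behind a fixed random permutation, so an agent cannot
    exploit the environment's semantic action labels. (The actual MineRL
    2020 competition instead embeds actions/observations into a latent
    vector space.)"""

    def __init__(self, env, seed=0):
        self.env = env
        rng = random.Random(seed)
        # Obfuscated index -> true action index, fixed per competition seed.
        self._perm = list(range(env.num_actions))
        rng.shuffle(self._perm)

    def step(self, obfuscated_action):
        # The agent never sees which true action it is taking.
        return self.env.step(self._perm[obfuscated_action])

class ToyEnv:
    """Minimal stand-in environment with semantically labeled actions."""
    num_actions = 4
    actions = ["attack", "craft", "move", "place"]

    def step(self, action_index):
        # Reward only the 'craft' action, purely for illustration.
        return 1.0 if self.actions[action_index] == "craft" else 0.0

env = ObfuscationWrapper(ToyEnv(), seed=42)
rewards = [env.step(a) for a in range(ToyEnv.num_actions)]
```

An agent trained against this wrapper must discover the rewarding action by interaction alone; a hard-coded "always craft" policy no longer works, which is the point of the mechanism.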
Related papers
- Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition [70.60872754129832]
The first NeurIPS competition on unlearning sought to stimulate the development of novel unlearning algorithms.
Nearly 1,200 teams from across the world participated.
We analyze top solutions and delve into discussions on benchmarking unlearning.
arXiv Detail & Related papers (2024-06-13T12:58:00Z)
- Distribution-Free Fair Federated Learning with Small Samples [54.63321245634712]
FedFaiREE is a post-processing algorithm developed specifically for distribution-free fair learning in decentralized settings with small samples.
We provide rigorous theoretical guarantees for both fairness and accuracy, and our experimental results further provide robust empirical validation for our proposed method.
arXiv Detail & Related papers (2024-02-25T17:37:53Z)
- CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition [52.2034494666179]
Sparse mixture of experts (SMoE) offers an appealing way to scale up model capacity beyond simply increasing the network's depth or width.
We propose a competition mechanism to address this fundamental challenge of representation collapse.
By routing inputs only to experts with the highest neural response, we show that, under mild assumptions, competition enjoys the same convergence rate as the optimal estimator.
arXiv Detail & Related papers (2024-02-04T15:17:09Z)
- Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO [50.58083807719749]
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
This competition targets robustness and generalization in multi-agent systems.
We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.
arXiv Detail & Related papers (2023-08-30T07:16:11Z)
- A portfolio-based analysis method for competition results [0.8680676599607126]
I will describe a portfolio-based analysis method which can give complementary insights into the performance of participating solvers in a competition.
The method is demonstrated on the results of the MiniZinc Challenges and new insights gained from the portfolio viewpoint are presented.
arXiv Detail & Related papers (2022-05-30T20:20:45Z)
- Multi-Stage Decentralized Matching Markets: Uncertain Preferences and Strategic Behaviors [91.3755431537592]
This article develops a framework for learning optimal strategies in real-world matching markets.
We show that there exists a welfare-versus-fairness trade-off that is characterized by the uncertainty level of acceptance.
We prove that participants can be better off with multi-stage matching compared to single-stage matching.
arXiv Detail & Related papers (2021-02-13T19:25:52Z)
- The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors [62.9301667732188]
We propose a second iteration of the MineRL Competition.
The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations.
The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment.
At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform.
arXiv Detail & Related papers (2021-01-26T20:32:30Z)
- Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning [27.440055101691115]
We held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS).
The primary goal of this competition was to promote the development of algorithms that use human demonstrations alongside reinforcement learning to reduce the number of samples needed to solve complex, hierarchical, and sparse environments.
arXiv Detail & Related papers (2020-03-10T21:39:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences of its use.