Retrospective on the 2021 BASALT Competition on Learning from Human
Feedback
- URL: http://arxiv.org/abs/2204.07123v1
- Date: Thu, 14 Apr 2022 17:24:54 GMT
- Title: Retrospective on the 2021 BASALT Competition on Learning from Human
Feedback
- Authors: Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi
Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls,
Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra
Souly, Chan Jun Shern, Daniel del Castillo, Tom Lieberum
- Abstract summary: The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks.
Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft.
Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types.
- Score: 92.37243979045817
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike
Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural
Information Processing Systems (NeurIPS 2021). The goal of the competition was
to promote research towards agents that use learning from human feedback (LfHF)
techniques to solve open-world tasks. Rather than mandating the use of LfHF
techniques, we described four tasks in natural language to be accomplished in
the video game Minecraft, and allowed participants to use any approach they
wanted to build agents that could accomplish the tasks. Teams developed a
diverse range of LfHF algorithms across a variety of possible human feedback
types. The three winning teams implemented significantly different approaches
while achieving similar performance. Interestingly, their approaches performed
well on different tasks, validating our choice of tasks to include in the
competition. While the outcomes validated the design of our competition, we did
not get as many participants and submissions as our sister competition, MineRL
Diamond. We speculate about the causes of this problem and suggest improvements
for future iterations of the competition.
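
For concreteness, below is a minimal sketch of the simplest LfHF technique relevant here: behavioral cloning, i.e. supervised learning on human demonstrations (the original BASALT competition paper cited under "Related papers" shipped an imitation learning baseline of this flavor). This is an illustrative sketch only, not the competition's reference code; the dimensions and the demo_obs / demo_actions tensors are hypothetical placeholders, and real BASALT agents consume Minecraft pixel observations rather than flat vectors.

```python
# Minimal behavioral-cloning sketch: fit a policy by supervised learning on
# human demonstrations, the simplest form of learning from human feedback.
# OBS_DIM, NUM_ACTIONS, and the demo_* tensors are hypothetical placeholders.
import torch
import torch.nn as nn

OBS_DIM, NUM_ACTIONS = 64, 10  # placeholder sizes for a flattened observation

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 128),
    nn.ReLU(),
    nn.Linear(128, NUM_ACTIONS),  # logits over a discretized action set
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical demonstration data: observations paired with human actions.
demo_obs = torch.randn(1024, OBS_DIM)
demo_actions = torch.randint(0, NUM_ACTIONS, (1024,))

for epoch in range(20):
    logits = policy(demo_obs)
    loss = loss_fn(logits, demo_actions)  # push the policy toward human choices
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Other feedback types mentioned in the abstract plug into the same framework: instead of (or in addition to) imitating demonstrated actions, an agent can learn from human comparisons, corrections, or preferences gathered during training.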
Related papers
- Benchmarking Robustness and Generalization in Multi-Agent Systems: A
Case Study on Neural MMO [50.58083807719749]
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
This competition targets robustness and generalization in multi-agent systems.
We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.
arXiv Detail & Related papers (2023-08-30T07:16:11Z)
- Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition [20.922425732605756]
The BASALT challenge asks teams to compete to develop algorithms to solve tasks with hard-to-specify reward functions in Minecraft.
We describe the competition and provide an overview of the top solutions.
arXiv Detail & Related papers (2023-03-23T17:59:17Z)
- MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned [60.11039031794829]
Reinforcement learning competitions advance the field by providing appropriate scope and support to develop solutions toward a specific problem.
We hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which any solution was permitted, to encourage participation by newcomers.
Participants in this easier track were able to obtain a diamond, and participants in the harder track advanced generalizable solutions to the same task.
arXiv Detail & Related papers (2022-02-17T13:37:35Z)
- The MineRL BASALT Competition on Learning from Human Feedback [58.17897225617566]
The MineRL BASALT competition aims to spur forward research on techniques for learning from human feedback.
We design a suite of four tasks in Minecraft for which we expect it will be hard to write down hardcoded reward functions.
We provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline.
arXiv Detail & Related papers (2021-07-05T12:18:17Z)
- The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors [62.9301667732188]
We propose a second iteration of the MineRL Competition.
The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations.
The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment.
At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform.
arXiv Detail & Related papers (2021-01-26T20:32:30Z)
- Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning [27.440055101691115]
We held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019).
The primary goal of this competition was to promote the development of algorithms that use human demonstrations alongside reinforcement learning to reduce the number of samples needed to solve complex, hierarchical, and sparse environments.
arXiv Detail & Related papers (2020-03-10T21:39:52Z)
- Analysing Affective Behavior in the First ABAW 2020 Competition [49.90617840789334]
The Affective Behavior Analysis in-the-wild (ABAW) 2020 Competition is the first competition aiming at automatic analysis of the three main affective behavior tasks.
We describe this competition, to be held in conjunction with the IEEE Conference on Face and Gesture Recognition, May 2020, in Buenos Aires, Argentina.
We outline the evaluation metrics, present the baseline system and the top-3 performing teams' methodologies per challenge, and report their obtained results.
arXiv Detail & Related papers (2020-01-30T15:41:14Z)