SEIHAI: A Sample-efficient Hierarchical AI for the MineRL Competition
- URL: http://arxiv.org/abs/2111.08857v1
- Date: Wed, 17 Nov 2021 01:36:40 GMT
- Title: SEIHAI: A Sample-efficient Hierarchical AI for the MineRL Competition
- Authors: Hangyu Mao, Chao Wang, Xiaotian Hao, Yihuan Mao, Yiming Lu, Chengjie
Wu, Jianye Hao, Dong Li and Pingzhong Tang
- Abstract summary: We present SEIHAI, a Sample-efficient Hierarchical AI that takes advantage of the human demonstrations and the task structure.
Specifically, we split the task into several sequentially dependent subtasks, and train a suitable agent for each subtask using reinforcement learning and imitation learning.
SEIHAI took first place in both the preliminary and final rounds of the NeurIPS-2020 MineRL competition.
- Score: 32.635756704572266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The MineRL competition is designed for the development of reinforcement
learning and imitation learning algorithms that can efficiently leverage human
demonstrations to drastically reduce the number of environment interactions
needed to solve the complex ObtainDiamond task with sparse rewards. To
address the challenge, in this paper, we present SEIHAI, a
Sample-efficient Hierarchical AI
that fully takes advantage of the human demonstrations and the task structure.
Specifically, we split the task into several sequentially dependent subtasks,
and train a suitable agent for each subtask using reinforcement learning and
imitation learning. We further design a scheduler to select different agents
for different subtasks automatically. SEIHAI took first place in both the
preliminary and final rounds of the NeurIPS-2020 MineRL competition.
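The abstract describes splitting the task into sequentially dependent subtasks and using a scheduler to pick the agent for the current subtask. The sketch below is only a minimal illustration of that idea, not the authors' scheduler: the subtask names, goal items, and item counts are hypothetical, and the abstract does not say how the real scheduler decides when to switch.

```python
from typing import Dict, List, Optional, Tuple

Inventory = Dict[str, int]

# Hypothetical subtask chain (assumption, not from the paper):
# (subtask name, item that marks completion, count needed).
SUBTASKS: List[Tuple[str, str, int]] = [
    ("chop_tree", "log", 4),
    ("craft_wooden_pickaxe", "wooden_pickaxe", 1),
    ("mine_stone", "cobblestone", 3),
    ("craft_stone_pickaxe", "stone_pickaxe", 1),
    ("mine_iron", "iron_ore", 3),
    ("craft_iron_pickaxe", "iron_pickaxe", 1),
    ("mine_diamond", "diamond", 1),
]


class Scheduler:
    """Rule-based scheduler over a fixed, ordered list of subtasks."""

    def __init__(self, subtasks: List[Tuple[str, str, int]]) -> None:
        self.subtasks = subtasks
        self.stage = 0  # index of the current subtask; it never decreases

    def select(self, inventory: Inventory) -> Optional[str]:
        """Return the subtask whose agent should act, or None when all are done."""
        # Skip every subtask whose goal is already met. Because the stage only
        # moves forward, items consumed by later crafting do not cause a regress.
        while self.stage < len(self.subtasks):
            name, item, needed = self.subtasks[self.stage]
            if inventory.get(item, 0) < needed:
                return name
            self.stage += 1
        return None


scheduler = Scheduler(SUBTASKS)
first = scheduler.select({})                                  # nothing collected yet
second = scheduler.select({"log": 4})                         # enough logs gathered
third = scheduler.select({"log": 0, "wooden_pickaxe": 1})     # logs were consumed
```

In a full system each returned name would index a separately trained policy (reinforcement learning or imitation learning, as the abstract states), and the scheduler would be queried at every environment step.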
Related papers
- Heterogeneous Graph Reinforcement Learning for Dependency-aware Multi-task Allocation in Spatial Crowdsourcing [33.915222518617085]
This paper formally investigates the problem of Dependency-aware Multi-task Allocation (DMA).
It presents a well-designed framework to solve it, known as Heterogeneous Graph Reinforcement Learning-based Task Allocation (HGRL-TA).
Experiment results demonstrate the effectiveness and generality of the proposed HGRL-TA in solving the DMA problem, leading to average profits that are 21.78% higher than those achieved using the metaheuristic methods.
arXiv Detail & Related papers (2024-10-20T17:00:45Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Enhancing Robotic Manipulation: Harnessing the Power of Multi-Task Reinforcement Learning and Single Life Reinforcement Learning in Meta-World [0.0]
This research project aims to enable a robotic arm to execute seven distinct tasks within the Meta World environment.
A trained model will serve as a source of prior data for the single-life RL algorithm.
An ablation study demonstrates that MT-QWALE successfully completes tasks with a slightly larger number of steps even after hiding the final goal position.
arXiv Detail & Related papers (2023-10-23T06:35:44Z)
- Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA).
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
- Robust Subtask Learning for Compositional Generalization [20.54144051436337]
We focus on the problem of training subtask policies in a way that they can be used to perform any task.
We aim to maximize the worst-case performance over all tasks as opposed to the average-case performance.
arXiv Detail & Related papers (2023-02-06T18:19:25Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provides the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors [62.9301667732188]
We propose a second iteration of the MineRL Competition.
The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations.
The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment.
At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform.
arXiv Detail & Related papers (2021-01-26T20:32:30Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.