Exploring with Sticky Mittens: Reinforcement Learning with Expert
Interventions via Option Templates
- URL: http://arxiv.org/abs/2202.12967v1
- Date: Fri, 25 Feb 2022 20:55:34 GMT
- Authors: Souradeep Dutta, Kaustubh Sridhar, Osbert Bastani, Edgar Dobriban,
James Weimer, Insup Lee, Julia Parish-Morris
- Abstract summary: We propose a framework for leveraging expert intervention to solve long-horizon reinforcement learning tasks.
We consider option templates, which are specifications encoding a potential option that can be trained using reinforcement learning.
We evaluate our approach on three challenging reinforcement learning problems, showing that it outperforms state-of-the-art approaches by an order of magnitude.
- Score: 31.836234758355243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Environments with sparse rewards and long horizons pose a significant
challenge for current reinforcement learning algorithms. A key feature enabling
humans to learn challenging control tasks is that they often receive expert
intervention that enables them to understand the high-level structure of the
task before mastering low-level control actions. We propose a framework for
leveraging expert intervention to solve long-horizon reinforcement learning
tasks. We consider option templates, which are specifications encoding a
potential option that can be trained using reinforcement learning. We formulate
expert intervention as allowing the agent to execute option templates before
learning an implementation. This enables the agent to use an option before
committing costly resources to learning it. We evaluate our approach on three
challenging reinforcement learning problems, showing that it outperforms
state-of-the-art approaches by an order of magnitude. Project website at
https://sites.google.com/view/stickymittens
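The idea of executing an option template via expert intervention before learning its implementation can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the `OptionTemplate` class, its `execute` method, and the toy reach-goal task are hypothetical names chosen here; an option template is represented simply as an initiation condition plus a termination (subgoal) condition, with the expert standing in for the not-yet-learned policy.

```python
from dataclasses import dataclass
from typing import Callable, Optional

State = dict  # toy state representation, for illustration only


@dataclass
class OptionTemplate:
    """A specification encoding a potential option: where it may start
    (initiation) and what subgoal it achieves (termination), with no
    trained low-level policy yet."""
    name: str
    initiation: Callable[[State], bool]
    termination: Callable[[State], bool]
    policy: Optional[Callable[[State], State]] = None  # learned implementation

    def execute(self, state: State, expert: Callable[[State], State]) -> State:
        """Run the option. Before an implementation is learned, the expert
        intervenes and drives the state to the termination condition
        (the "sticky mittens" step); afterwards the learned policy takes over."""
        assert self.initiation(state), "option not applicable in this state"
        if self.policy is None:
            state = expert(state)           # expert achieves the subgoal directly
        else:
            while not self.termination(state):
                state = self.policy(state)  # learned low-level control
        assert self.termination(state)
        return state


# Toy 1-D task: the subgoal is reaching x >= 5.
reach = OptionTemplate(
    name="reach_goal",
    initiation=lambda s: True,
    termination=lambda s: s["x"] >= 5,
)

# The expert intervention satisfies the subgoal, so the high-level agent can
# already plan with this option before committing resources to training it.
final = reach.execute({"x": 0}, expert=lambda s: {"x": 5})
print(final["x"])  # prints 5
```

The point of the sketch is the high-level structure: the agent treats the template as a usable option immediately, and only later replaces the expert call with a policy trained by reinforcement learning.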
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z) - Bootstrap Your Own Skills: Learning to Solve New Tasks with Large
Language Model Guidance [66.615355754712]
BOSS learns to accomplish new tasks by performing "skill bootstrapping".
We demonstrate through experiments in realistic household environments that agents trained with our LLM-guided bootstrapping procedure outperform those trained with naive bootstrapping.
arXiv Detail & Related papers (2023-10-16T02:43:47Z) - Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z) - The Paradox of Choice: Using Attention in Hierarchical Reinforcement
Learning [59.777127897688594]
We present an online, model-free algorithm to learn affordances that can be used to further learn subgoal options.
We investigate the role of hard versus soft attention in training data collection, abstract value learning in long-horizon tasks, and handling a growing number of choices.
arXiv Detail & Related papers (2022-01-24T13:18:02Z) - Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z) - Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.