LEAF: Latent Exploration Along the Frontier
- URL: http://arxiv.org/abs/2005.10934v3
- Date: Mon, 26 Apr 2021 18:05:56 GMT
- Title: LEAF: Latent Exploration Along the Frontier
- Authors: Homanga Bharadhwaj, Animesh Garg, Florian Shkurti
- Abstract summary: Self-supervised goal proposal and reaching is a key component for exploration and efficient policy learning algorithms.
We propose an exploration framework, which learns a dynamics-aware manifold of reachable states.
We demonstrate that the proposed self-supervised exploration algorithm achieves superior performance compared to existing baselines on a set of challenging robotic environments.
- Score: 47.304858727365094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised goal proposal and reaching is a key component for exploration
and efficient policy learning algorithms. Such a self-supervised approach
without access to any oracle goal sampling distribution requires deep
exploration and commitment so that long horizon plans can be efficiently
discovered. In this paper, we propose an exploration framework, which learns a
dynamics-aware manifold of reachable states. For a goal, our proposed method
deterministically visits a state at the current frontier of reachable states
(commitment/reaching) and then stochastically explores to reach the goal
(exploration). This allocates exploration budget near the frontier of the
reachable region instead of its interior. We target the challenging problem of
policy learning from initial and goal states specified as images, and do not
assume any access to the underlying ground-truth states of the robot and the
environment. To keep track of reachable latent states, we propose a
distance-conditioned reachability network that is trained to infer whether one
state is reachable from another within the specified latent space distance.
Given an initial state, we obtain a frontier of reachable states from that
state. By incorporating a curriculum for sampling easier goals (closer to the
start state) before more difficult goals, we demonstrate that the proposed
self-supervised exploration algorithm achieves superior performance compared to
existing baselines on a set of challenging robotic environments.
Project page: https://sites.google.com/view/leaf-exploration
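To make the distance-conditioned reachability network concrete, here is a minimal sketch in PyTorch. It is an illustration based only on the abstract, not the authors' implementation: the architecture, latent dimension, and the labeling scheme (pairs of latent states labeled 1 if one was actually reached from the other within the distance threshold) are all assumptions.

```python
import torch
import torch.nn as nn

class ReachabilityNetwork(nn.Module):
    """Distance-conditioned binary classifier: predicts whether z_to is
    reachable from z_from within a given latent-space distance d
    (architecture is an assumption, not the paper's)."""

    def __init__(self, latent_dim: int, hidden_dim: int = 128):
        super().__init__()
        # Input: a pair of latent states plus the scalar distance condition.
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim + 1, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, z_from, z_to, d):
        # z_from, z_to: (batch, latent_dim); d: (batch, 1)
        x = torch.cat([z_from, z_to, d], dim=-1)
        return torch.sigmoid(self.net(x))

model = ReachabilityNetwork(latent_dim=32)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.BCELoss()

def train_step(z_from, z_to, d, label):
    """One gradient step on pairs labeled 1 if z_to was reached from
    z_from within distance d (assumed labeling scheme)."""
    pred = model(z_from, z_to, d)
    loss = loss_fn(pred, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative call with random data:
z_a, z_b = torch.randn(64, 32), torch.randn(64, 32)
d = torch.rand(64, 1) * 10.0
labels = torch.randint(0, 2, (64, 1)).float()
train_step(z_a, z_b, d, labels)
```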
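Given such a network, one hedged reading of the abstract is that the frontier can be approximated as the set of candidate states whose predicted reachability sits near the decision boundary, and the curriculum can be realized by gradually growing the distance threshold. The sketch below follows that reading; the boundary band `eps` and the linear schedule are assumptions, not the paper's choices.

```python
import torch

def frontier_states(z_start, candidates, reachability_net, d, eps=0.1):
    """Keep candidate latent states whose predicted reachability from
    z_start lies near the 0.5 decision boundary, i.e. at the frontier
    of the reachable set rather than its interior (assumed criterion)."""
    # z_start: (latent_dim,); candidates: (num_candidates, latent_dim)
    d_col = torch.full((candidates.shape[0], 1), float(d))
    with torch.no_grad():
        probs = reachability_net(z_start.expand_as(candidates),
                                 candidates, d_col)
    mask = (probs - 0.5).abs().squeeze(-1) < eps
    return candidates[mask]

def curriculum_distance(step, d_min=1.0, d_max=10.0, warmup_steps=50_000):
    """Linearly grow the latent-distance threshold so that easier goals
    (closer to the start state) are proposed before harder ones."""
    frac = min(step / warmup_steps, 1.0)
    return d_min + frac * (d_max - d_min)
```

An episode would then deterministically drive the policy to a sampled frontier state (commitment) and switch to stochastic action selection from there (exploration), concentrating the exploration budget at the boundary of the reachable region rather than its interior.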
Related papers
- Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning [6.266160051617362]
"Cluster Edge Exploration" ($CE2$) is a new goal-directed exploration algorithm that gives priority to goal states that remain accessible to the agent.
In challenging robotics environments, $CE2$ demonstrates superior efficiency in exploration compared to baseline methods and ablations.
arXiv Detail & Related papers (2024-11-03T01:21:43Z)
- Planning Goals for Exploration [22.047797646698527]
"Planning Exploratory Goals" (PEG) is a method that sets goals for each training episode to directly optimize an intrinsic exploration reward.
PEG learns world models and adapts sampling-based planning algorithms to "plan goal commands".
arXiv Detail & Related papers (2023-03-23T02:51:50Z)
- Scaling Goal-based Exploration via Pruning Proto-goals [10.976262029859424]
One of the gnarliest challenges in reinforcement learning is exploration that scales to vast domains.
Goal-directed, purposeful behaviours are able to overcome this, but rely on a good goal space.
Our approach explicitly seeks the middle ground, enabling the human designer to specify a vast but meaningful proto-goal space.
arXiv Detail & Related papers (2023-02-09T15:22:09Z)
- Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning [54.378444600773875]
We introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments.
SFL drives exploration by estimating state-novelty and enables high-level planning by abstracting the state-space as a non-parametric landmark-based graph.
We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces.
arXiv Detail & Related papers (2021-11-18T18:36:05Z)
- Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning [64.97599673479678]
We present HIerarchical reinforcement learning Guided by Landmarks (HIGL), a novel framework for training a high-level policy with a reduced action space guided by landmarks.
Our experiments demonstrate that our framework outperforms prior art across a variety of control tasks.
arXiv Detail & Related papers (2021-10-26T12:16:19Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state-space guided by a modest number of human provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- InfoBot: Transfer and Exploration via the Information Bottleneck [105.28380750802019]
A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed.
We propose to learn about decision states from prior experience.
We find that this simple mechanism effectively identifies decision states, even in partially observed settings.
arXiv Detail & Related papers (2019-01-30T15:33:58Z)