Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
- URL: http://arxiv.org/abs/2411.01396v1
- Date: Sun, 03 Nov 2024 01:21:43 GMT
- Title: Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
- Authors: Yuanlin Duan, Guofeng Cui, He Zhu
- Abstract summary: "Cluster Edge Exploration" ($CE^2$) is a new goal-directed exploration algorithm that gives priority to goal states that remain accessible to the agent.
In challenging robotics environments, $CE^2$ demonstrates superior efficiency in exploration compared to baseline methods and ablations.
- Score: 6.266160051617362
- Abstract: Exploring unknown environments efficiently is a fundamental challenge in unsupervised goal-conditioned reinforcement learning. While selecting exploratory goals at the frontier of previously explored states is an effective strategy, the policy during training may still have limited capability of reaching rare goals on the frontier, resulting in reduced exploratory behavior. We propose "Cluster Edge Exploration" ($CE^2$), a new goal-directed exploration algorithm that, when choosing goals in sparsely explored areas of the state space, gives priority to goal states that remain accessible to the agent. The key idea is to cluster states that are easily reachable from one another by the current training policy in a latent space, and to traverse to states with significant exploration potential on the boundary of these clusters before engaging in exploratory behavior. In challenging robotics environments, including navigating a maze with a multi-legged ant robot, manipulating objects with a robot arm on a cluttered tabletop, and rotating objects in the palm of an anthropomorphic robotic hand, $CE^2$ demonstrates superior efficiency in exploration compared to baseline methods and ablations.
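To make the cluster-edge idea concrete, here is a minimal, self-contained numpy sketch. It is not the paper's implementation: $CE^2$ learns the latent space and clustering jointly with a world model, whereas this toy uses plain k-means on fixed 2-D "latents" and scores each state by how close it sits to a boundary between two clusters. The functions `kmeans` and `cluster_edge_goals` and all parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(z, k, iters=50):
    """Plain k-means; a stand-in for CE^2's learned latent clustering."""
    centroids = z[rng.choice(len(z), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(z[:, None] - centroids[None], axis=-1)  # (N, k)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = z[labels == j].mean(axis=0)
    return centroids, labels

def cluster_edge_goals(z, k=8, n_goals=5):
    """Rank latent states by how close they sit to a cluster boundary.

    Edge score = (distance to nearest other centroid) minus (distance to
    own centroid); scores near zero mean the state lies between clusters,
    which is where the abstract says exploration potential concentrates.
    """
    centroids, _ = kmeans(z, k)
    d = np.linalg.norm(z[:, None] - centroids[None], axis=-1)
    sorted_d = np.sort(d, axis=1)
    margin = sorted_d[:, 1] - sorted_d[:, 0]  # small margin => near an edge
    return z[np.argsort(margin)[:n_goals]]

# Toy replay buffer of 2-D "latent" states.
latents = rng.normal(size=(500, 2))
print("candidate edge goals:\n", cluster_edge_goals(latents))
```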
Related papers
- Planning Goals for Exploration [22.047797646698527]
"Planning Exploratory Goals" (PEG) is a method that sets goals for each training episode to directly optimize an intrinsic exploration reward.
PEG learns world models and adapts sampling-based planning algorithms to "plan goal commands"
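As a rough illustration of goal planning for exploration, the sketch below runs a cross-entropy-method style search over candidate goal commands, scoring each with a stand-in intrinsic reward (distance from the nearest visited state). PEG itself scores goals by rolling out a learned world model; `exploration_value`, `plan_goal`, and the hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def exploration_value(goal, visited):
    """Stand-in intrinsic reward: distance to the nearest visited state.
    PEG instead rolls out a learned world model to score each goal."""
    return np.linalg.norm(visited - goal, axis=1).min()

def plan_goal(visited, n_iters=5, pop=64, elite=8):
    """Cross-entropy-method style search for a goal command."""
    mu, sigma = visited.mean(axis=0), visited.std(axis=0) + 1.0
    for _ in range(n_iters):
        cands = rng.normal(mu, sigma, size=(pop, visited.shape[1]))
        scores = np.array([exploration_value(g, visited) for g in cands])
        elites = cands[np.argsort(scores)[-elite:]]
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mu

visited = rng.normal(size=(200, 2))
print("planned goal command:", plan_goal(visited))
```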
arXiv Detail & Related papers (2023-03-23T02:51:50Z)
- Scaling Goal-based Exploration via Pruning Proto-goals [10.976262029859424]
One of the gnarliest challenges in reinforcement learning is exploration that scales to vast domains.
Goal-directed, purposeful behaviours are able to overcome this, but rely on a good goal space.
Our approach explicitly seeks the middle ground, enabling the human designer to specify a vast but meaningful proto-goal space.
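A hedged sketch of what pruning a vast proto-goal space might look like: keep proto-goals the agent has achieved at least once (so they are attainable) but does not achieve on nearly every attempt (so they still carry learning signal). The count-based criterion and thresholds below are assumptions for illustration, not the paper's actual pruning rules.

```python
import numpy as np

def prune_proto_goals(achieved_counts, attempted_counts,
                      min_success=1, max_rate=0.9):
    """Keep proto-goals that are attainable but not yet mastered."""
    rate = achieved_counts / np.maximum(attempted_counts, 1)
    keep = (achieved_counts >= min_success) & (rate <= max_rate)
    return np.flatnonzero(keep)

achieved = np.array([0, 3, 50, 7])    # times each proto-goal was achieved
attempted = np.array([10, 12, 50, 40])
print("surviving proto-goal ids:", prune_proto_goals(achieved, attempted))
```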
arXiv Detail & Related papers (2023-02-09T15:22:09Z)
- Deep Hierarchical Planning from Pixels [86.14687388689204]
Director is a method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model.
Despite operating in latent space, Director's decisions are interpretable because the world model can decode goals into images for visualization.
Director also learns successful behaviors across a wide range of environments, including visual control, Atari games, and DMLab levels.
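The sketch below shows the two-level control pattern such a hierarchy implies: a manager proposes a latent goal every K steps and a worker steers toward it. Both policies here are trivial numpy stand-ins (`manager_policy`, `worker_policy` are assumptions); Director actually samples goals from a learned prior inside its world model and trains both levels on imagined rollouts.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 8  # the manager acts every K worker steps

def manager_policy(latent):
    """Stand-in: propose a nearby latent goal."""
    return latent + rng.normal(scale=0.5, size=latent.shape)

def worker_policy(latent, goal):
    """Stand-in: greedy step toward the current goal in latent space."""
    return 0.2 * (goal - latent)

latent = np.zeros(4)
for t in range(32):
    if t % K == 0:
        goal = manager_policy(latent)
    latent = latent + worker_policy(latent, goal)
print("final latent state:", latent)
```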
arXiv Detail & Related papers (2022-06-08T18:20:15Z)
- Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning [54.378444600773875]
We introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments.
SFL drives exploration by estimating state-novelty and enables high-level planning by abstracting the state-space as a non-parametric landmark-based graph.
We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces.
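A minimal sketch of the non-parametric landmark-graph idea: a state is promoted to a landmark when it is sufficiently far from all existing landmarks (a novelty test), and nearby landmarks are connected as traversable edges. SFL measures these relations with successor features rather than the Euclidean distances assumed here; `radius` and the 2*radius edge rule are illustrative.

```python
import numpy as np

def build_landmark_graph(states, radius=1.0):
    """Grow a landmark set greedily, then connect nearby landmarks."""
    landmarks = []
    for s in states:
        if not landmarks or min(np.linalg.norm(s - l) for l in landmarks) > radius:
            landmarks.append(s)
    landmarks = np.array(landmarks)
    dists = np.linalg.norm(landmarks[:, None] - landmarks[None], axis=-1)
    n = len(landmarks)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if dists[i, j] < 2 * radius]
    return landmarks, edges

rng = np.random.default_rng(0)
states = rng.uniform(0, 5, size=(300, 2))
landmarks, edges = build_landmark_graph(states)
print(f"{len(landmarks)} landmarks, {len(edges)} edges")
```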
arXiv Detail & Related papers (2021-11-18T18:36:05Z)
- Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning [64.97599673479678]
We present HIerarchical reinforcement learning Guided by Landmarks (HIGL).
HIGL is a novel framework for training a high-level policy with a reduced action space guided by landmarks.
Our experiments demonstrate that our framework outperforms prior art across a variety of control tasks.
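As a toy reading of landmark guidance, the sketch below pulls the high-level policy's raw subgoal toward the most promising nearby landmark, shrinking the effective subgoal space. The scoring rule and the `alpha` interpolation are assumptions; HIGL formalizes this with an adjacency constraint and learned novelty estimates.

```python
import numpy as np

def landmark_guided_subgoal(raw_subgoal, landmarks, novelty, alpha=0.5):
    """Pick the landmark that best trades novelty against distance,
    then interpolate the raw subgoal toward it."""
    scores = novelty - np.linalg.norm(landmarks - raw_subgoal, axis=1)
    target = landmarks[scores.argmax()]
    return (1 - alpha) * raw_subgoal + alpha * target

rng = np.random.default_rng(0)
landmarks = rng.uniform(0, 10, size=(20, 2))
novelty = rng.uniform(size=20)  # stand-in novelty score per landmark
print(landmark_guided_subgoal(np.array([5.0, 5.0]), landmarks, novelty))
```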
arXiv Detail & Related papers (2021-10-26T12:16:19Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
The goal is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
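The normalized-entropy selection described above can be sketched directly: compute the normalized entropy of visitation counts in each projected state space, pick the space where visitation is most skewed (lowest normalized entropy), and target its least-visited state. The count arrays below are toy data, and the multi-agent training machinery is omitted.

```python
import numpy as np

def normalized_entropy(counts):
    """Shannon entropy of a visitation distribution, normalized to [0, 1]."""
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum() / np.log(len(counts))

def select_shared_goal(projected_counts):
    """Choose the projected space with the lowest normalized entropy,
    then its least-visited state, as the shared exploration goal."""
    ents = [normalized_entropy(c) for c in projected_counts]
    space = int(np.argmin(ents))
    state = int(np.argmin(projected_counts[space]))
    return space, state

counts_a = np.array([40, 38, 41, 1])   # one barely-visited state
counts_b = np.array([30, 31, 29, 30])  # near-uniform visitation
print(select_shared_goal([counts_a, counts_b]))  # -> (0, 3)
```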
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state space guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
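A minimal stand-in for example-guided exploration: reward states by similarity to the nearest human-provided example of an important state. BEE learns a classifier over image observations; the RBF similarity over low-dimensional states in `relevance_bonus` is an assumption for illustration.

```python
import numpy as np

def relevance_bonus(state, example_states, scale=1.0):
    """Exploration bonus that peaks near the human-provided examples."""
    d2 = ((example_states - state) ** 2).sum(axis=1).min()
    return np.exp(-d2 / scale)

examples = np.array([[2.0, 2.0], [8.0, 1.0]])  # "important" example states
print(relevance_bonus(np.array([2.1, 1.9]), examples))  # high bonus
print(relevance_bonus(np.array([5.0, 5.0]), examples))  # low bonus
```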
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals [8.98526174345299]
This paper introduces the notion of imagined object goals.
For a given manipulation task, the object of interest is first trained to reach a desired target position on its own.
The object policy is then leveraged to build a predictive model of plausible object trajectories.
The proposed algorithm, Follow the Object, has been evaluated on 7 MuJoCo environments.
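A hedged sketch of the curriculum this implies: take waypoints spaced along a plausible object trajectory and use them as manipulation goals of increasing difficulty. Here the trajectory is a precomputed straight line; the paper instead draws trajectories from the learned object predictive model.

```python
import numpy as np

def imagined_goal_curriculum(object_trajectory, n_stages=4):
    """Waypoints along an object trajectory, ordered by difficulty."""
    idx = np.linspace(0, len(object_trajectory) - 1, n_stages + 1)[1:]
    return object_trajectory[idx.astype(int)]

# A toy object trajectory from start to target position.
traj = np.linspace([0.0, 0.0], [1.0, 0.5], num=50)
for stage, goal in enumerate(imagined_goal_curriculum(traj), 1):
    print(f"stage {stage} goal: {goal}")
```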
arXiv Detail & Related papers (2020-08-05T12:19:14Z)
- LEAF: Latent Exploration Along the Frontier [47.304858727365094]
Self-supervised goal proposal and reaching is a key component for exploration and efficient policy learning algorithms.
We propose an exploration framework, which learns a dynamics-aware manifold of reachable states.
We demonstrate that the proposed self-supervised exploration algorithm achieves superior performance compared to existing baselines on a set of challenging robotic environments.
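A toy rendering of committing to the frontier of reachable states: score each reachable state by the mean distance to its k nearest neighbors (sparser neighborhoods sit closer to the frontier) and pick the sparsest one as the next goal. LEAF learns a dynamics-aware manifold and a deterministic commitment policy; the density proxy and `k` here are assumptions.

```python
import numpy as np

def frontier_goal(reachable, k=5):
    """Pick the reachable state in the sparsest neighborhood."""
    d = np.linalg.norm(reachable[:, None] - reachable[None], axis=-1)
    knn = np.sort(d, axis=1)[:, 1:k + 1]  # skip the zero self-distance
    return reachable[knn.mean(axis=1).argmax()]

rng = np.random.default_rng(0)
reachable = rng.normal(size=(200, 2))
print("frontier goal:", frontier_goal(reachable))
```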
arXiv Detail & Related papers (2020-05-21T22:46:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.