Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon
Reasoning
- URL: http://arxiv.org/abs/2111.03189v1
- Date: Thu, 4 Nov 2021 22:46:16 GMT
- Title: Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon
Reasoning
- Authors: Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey
Levine, Brian Ichter
- Abstract summary: Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a representation by using the value functions corresponding to each lower-level skill.
- Score: 120.38381203153159
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning can train policies that effectively perform complex
tasks. However, for long-horizon tasks, the performance of these methods
degrades with horizon, often necessitating reasoning over and composing
lower-level skills. Hierarchical reinforcement learning aims to enable this by
providing a bank of low-level skills as action abstractions. Hierarchies can
further improve on this by abstracting the state space as well. We posit that
a suitable state abstraction should depend on the capabilities of the available
lower-level policies. We propose Value Function Spaces: a simple approach that
produces such a representation by using the value functions corresponding to
each lower-level skill. These value functions capture the affordances of the
scene, thus forming a representation that compactly abstracts task relevant
information and robustly ignores distractors. Empirical evaluations for
maze-solving and robotic manipulation tasks demonstrate that our approach
improves long-horizon performance and enables better zero-shot generalization
than alternative model-free and model-based methods.
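The construction described in the abstract can be illustrated with a short sketch. The snippet below is a hedged illustration, not code from the paper: it stacks the value estimates of the available low-level skills into a single vector that serves as the abstract state for high-level reasoning. The function and variable names (`vfs_embedding`, `skill_value_fns`, the toy value functions) are assumptions introduced here for clarity.

```python
import numpy as np

# Minimal sketch of a skill-centric state abstraction: the representation
# is the vector of value estimates of each low-level skill at the current
# observation, z = [V_1(s), ..., V_K(s)]. Names are illustrative only.

def vfs_embedding(observation, skill_value_fns):
    """Stack each skill's value estimate into the abstract state vector."""
    return np.array([v(observation) for v in skill_value_fns])

if __name__ == "__main__":
    obs = np.array([0.2, 0.7])
    # Toy stand-ins for the value functions of two pre-trained skills.
    pick_value = lambda s: float(1.0 - np.linalg.norm(s))
    place_value = lambda s: float(s.mean())
    z = vfs_embedding(obs, [pick_value, place_value])
    print(z)  # the high-level policy plans over z rather than the raw observation
```

Because each entry of z reflects how achievable one skill is from the current state, the vector compactly encodes scene affordances while ignoring observation details that no skill's value depends on.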
Related papers
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction [19.59151245929067]
We study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning.
We find this problem is best solved hierarchically by modelling items at a higher level of state abstraction than pixels.
We make use of this to propose a fully model-based algorithm that learns a discriminative world model.
arXiv Detail & Related papers (2024-08-21T17:59:31Z)
- Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z)
- Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming the state of the art.
arXiv Detail & Related papers (2023-01-30T15:04:39Z)
- Possibility Before Utility: Learning And Using Hierarchical Affordances [21.556661319375255]
Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures.
We present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning.
arXiv Detail & Related papers (2022-03-23T19:17:22Z)
- Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL [140.12803111221206]
In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting.
We propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation.
We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments.
arXiv Detail & Related papers (2022-03-21T22:07:48Z)
- Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies [37.09286945259353]
We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model.
We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours.
arXiv Detail & Related papers (2021-12-09T17:37:14Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies the object goal navigation task, which involves navigating to the closest object related to a given semantic category in unseen environments.
Recent works have shown significant achievements in both end-to-end Reinforcement Learning approaches and modular systems, but these still need a big step forward to become robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z)
- Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics [1.5293427903448022]
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations.
In end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data.
We propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one.
arXiv Detail & Related papers (2021-07-04T16:26:04Z)