Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon
Reasoning
- URL: http://arxiv.org/abs/2111.03189v1
- Date: Thu, 4 Nov 2021 22:46:16 GMT
- Title: Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon
Reasoning
- Authors: Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey
Levine, Brian Ichter
- Abstract summary: Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a representation by using the value functions corresponding to each lower-level skill.
- Score: 120.38381203153159
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning can train policies that effectively perform complex
tasks. However, for long-horizon tasks, the performance of these methods
degrades with horizon, often necessitating reasoning over and composing
lower-level skills. Hierarchical reinforcement learning aims to enable this by
providing a bank of low-level skills as action abstractions. Hierarchies can
further improve on this by abstracting the state space as well. We posit that
a suitable state abstraction should depend on the capabilities of the available
lower-level policies. We propose Value Function Spaces: a simple approach that
produces such a representation by using the value functions corresponding to
each lower-level skill. These value functions capture the affordances of the
scene, thus forming a representation that compactly abstracts task relevant
information and robustly ignores distractors. Empirical evaluations for
maze-solving and robotic manipulation tasks demonstrate that our approach
improves long-horizon performance and enables better zero-shot generalization
than alternative model-free and model-based methods.
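The construction described in the abstract can be illustrated with a short sketch. The snippet below is a hedged illustration, not code from the paper: it stacks the value estimates of the available low-level skills into a single vector that serves as the abstract state for high-level reasoning. The function and variable names (`vfs_embedding`, `skill_value_fns`, the toy value functions) are assumptions introduced here for clarity.

```python
import numpy as np

# Minimal sketch of a skill-centric state abstraction: the representation
# is the vector of value estimates of each low-level skill at the current
# observation, z = [V_1(s), ..., V_K(s)]. Names are illustrative only.

def vfs_embedding(observation, skill_value_fns):
    """Stack each skill's value estimate into the abstract state vector."""
    return np.array([v(observation) for v in skill_value_fns])

if __name__ == "__main__":
    obs = np.array([0.2, 0.7])
    # Toy stand-ins for the value functions of two pre-trained skills.
    pick_value = lambda s: float(1.0 - np.linalg.norm(s))
    place_value = lambda s: float(s.mean())
    z = vfs_embedding(obs, [pick_value, place_value])
    print(z)  # the high-level policy plans over z rather than the raw observation
```

Because each entry of z reflects how achievable one skill is from the current state, the vector compactly encodes scene affordances while ignoring observation details that no skill's value depends on.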
Related papers
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction [19.59151245929067]
We study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning.
We find this problem is best solved hierarchically by modelling items at a higher level of state abstraction than pixels.
We make use of this to propose a fully model-based algorithm that learns a discriminative world model.
arXiv Detail & Related papers (2024-08-21T17:59:31Z)
- Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z)
- Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming the state of the art.
arXiv Detail & Related papers (2023-01-30T15:04:39Z)
- Possibility Before Utility: Learning And Using Hierarchical Affordances [21.556661319375255]
Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures.
We present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning.
arXiv Detail & Related papers (2022-03-23T19:17:22Z)
- Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL [140.12803111221206]
In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting.
We propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation.
We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments.
arXiv Detail & Related papers (2022-03-21T22:07:48Z)
- Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies [37.09286945259353]
We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model.
We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours.
arXiv Detail & Related papers (2021-12-09T17:37:14Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies the object goal navigation task, which involves navigating to the closest object related to a given semantic category in unseen environments.
Recent works have shown significant achievements in both end-to-end Reinforcement Learning approaches and modular systems, but these still need a big step forward to become robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z)
- Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics [1.5293427903448022]
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations.
In end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data.
We propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one.
arXiv Detail & Related papers (2021-07-04T16:26:04Z)