LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of
Iterative Tasks
- URL: http://arxiv.org/abs/2107.04775v1
- Date: Sat, 10 Jul 2021 06:46:10 GMT
- Title: LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of
Iterative Tasks
- Authors: Albert Wilcox and Ashwin Balakrishna and Brijen Thananjeyan and Joseph
E. Gonzalez and Ken Goldberg
- Abstract summary: Reinforcement learning algorithms have shown impressive success in exploring high-dimensional environments to learn complex, long-horizon tasks.
A promising strategy for safe learning in dynamically uncertain environments is requiring that the agent can robustly return to states where task success can be guaranteed.
We present Latent Space Safe Sets (LS3), which extends this strategy to iterative, long-horizon tasks with image observations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) algorithms have shown impressive success in
exploring high-dimensional environments to learn complex, long-horizon tasks,
but can often exhibit unsafe behaviors and require extensive environment
interaction when exploration is unconstrained. A promising strategy for safe
learning in dynamically uncertain environments is requiring that the agent can
robustly return to states where task success (and therefore safety) can be
guaranteed. While this approach has been successful in low-dimensional settings,
enforcing this constraint in environments with high-dimensional state spaces,
such as images, is challenging. We present Latent Space Safe Sets (LS3), which
extends this strategy to iterative, long-horizon tasks with image observations
by using suboptimal demonstrations and a learned dynamics model to restrict
exploration to the neighborhood of a learned Safe Set where task completion is
likely. We evaluate LS3 on 4 domains, including a challenging sequential
pushing task in simulation and a physical cable routing task. We find that LS3
can use prior task successes to restrict exploration and learn more efficiently
than prior algorithms while satisfying constraints. See
https://tinyurl.com/latent-ss for code and supplementary material.
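As a concrete illustration of the constraint described above, here is a minimal sketch of an LS3-style planning step. It is not the authors' implementation: `encoder`, `dynamics`, `safe_set`, and `value` are hypothetical stand-ins for the components the paper learns from suboptimal demonstrations, and a simple random-shooting MPC planner is assumed.

```python
# Hedged sketch of an LS3-style planning step (not the authors' code).
# Assumes: encoder embeds image observations into a latent space,
# dynamics is a learned latent transition model, safe_set is a learned
# classifier over latent states, and value scores latent states.
import numpy as np

def plan_action(obs, encoder, dynamics, safe_set, value,
                n_samples=512, horizon=5, action_dim=2):
    """Random-shooting MPC: sample action sequences, roll them out in
    latent space, and keep only those ending in the learned Safe Set."""
    z0 = encoder(obs)
    best_action, best_score = None, -np.inf
    for _ in range(n_samples):
        actions = np.random.uniform(-1.0, 1.0, size=(horizon, action_dim))
        z = z0
        for a in actions:
            z = dynamics(z, a)  # roll out the learned latent dynamics
        # Restrict exploration: the terminal latent state must lie in
        # the learned Safe Set, where task completion is likely.
        if safe_set(z) and value(z) > best_score:
            best_score, best_action = value(z), actions[0]
    return best_action  # execute, observe, and replan at each step
```

Executing only the first action and replanning from each new observation is the usual MPC pattern a constraint of this kind relies on.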
Related papers
- Spatial Reasoning and Planning for Deep Embodied Agents
This thesis explores the development of data-driven techniques for spatial reasoning and planning tasks.
It focuses on enhancing learning efficiency, interpretability, and transferability across novel scenarios.
arXiv Detail & Related papers (2024-09-28T23:05:56Z)
- Safe Guaranteed Exploration for Non-linear Systems
We propose a novel safe guaranteed exploration framework using optimal control, which achieves first-of-its-kind results.
Based on this framework we propose an efficient algorithm, SageMPC, SAfe Guaranteed Exploration using Model Predictive Control.
We demonstrate safe efficient exploration in challenging unknown environments using SageMPC with a car model.
arXiv Detail & Related papers (2024-02-09T17:26:26Z)
- Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications
We consider robots with unknown dynamics operating in environments with unknown structure.
Our goal is to synthesize a control policy that maximizes the probability of satisfying an automaton-encoded task.
We propose a novel DRL algorithm that learns control policies notably faster than similar methods.
arXiv Detail & Related papers (2023-11-28T18:59:58Z)
- Generalizable Long-Horizon Manipulations with Large Language Models
This work introduces a framework harnessing the capabilities of Large Language Models (LLMs) to generate primitive task conditions for generalizable long-horizon manipulations.
We create a challenging robotic manipulation task suite based on PyBullet for long-horizon task evaluation.
arXiv Detail & Related papers (2023-10-03T17:59:46Z)
- Latent Exploration for Reinforcement Learning
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network (a minimal sketch appears after this list).
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- Skill-based Meta-Reinforcement Learning
We devise a method that enables meta-learning on long-horizon, sparse-reward tasks.
Our core idea is to leverage prior experience extracted from offline datasets during meta-learning.
arXiv Detail & Related papers (2022-04-25T17:58:19Z)
- Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
We introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments.
SFL drives exploration by estimating state-novelty and enables high-level planning by abstracting the state-space as a non-parametric landmark-based graph.
We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces.
arXiv Detail & Related papers (2021-11-18T18:36:05Z)
- Discovering and Exploiting Sparse Rewards in a Learned Behavior Space
Learning optimal policies in sparse-reward settings is difficult, as the learning agent has little to no feedback on the quality of its actions.
We introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while efficiently optimizing any reward discovered.
arXiv Detail & Related papers (2021-11-02T22:21:11Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
Batch Exploration with Examples (BEE) explores relevant regions of the state space, guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning
Long-horizon planning in realistic environments requires the ability to reason over sequential tasks in high-dimensional state spaces.
We present Broadly-Exploring-Local-policy Trees (BELT), a task-conditioned, model-based tree search.
BELT is demonstrated experimentally to plan long-horizon, sequential tasks with a goal-conditioned policy and to generate robust plans.
arXiv Detail & Related papers (2020-10-13T15:51:24Z)
- Weakly-Supervised Reinforcement Learning for Controllable Behavior
Reinforcement learning (RL) is a powerful framework for learning to take actions to solve tasks.
In many settings, an agent must winnow down the inconceivably large space of all possible tasks to the single task that it is currently being asked to solve.
We introduce a framework for using weak supervision to automatically disentangle this semantically meaningful subspace of tasks from the enormous space of nonsensical "chaff" tasks.
arXiv Detail & Related papers (2020-04-06T17:50:28Z)
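The sketch promised in the Latent Exploration entry above: a minimal illustration of temporally-correlated noise injected into a policy's latent state. The two-layer policy and Ornstein-Uhlenbeck-style noise process are assumptions made for illustration, not the paper's exact perturbation scheme.

```python
# Hedged sketch in the spirit of Lattice: persist an Ornstein-Uhlenbeck
# noise process and add it to the policy's hidden (latent) activations,
# so exploration noise is correlated across consecutive time steps.
import numpy as np

class NoisyLatentPolicy:
    def __init__(self, obs_dim, hidden_dim, action_dim,
                 theta=0.15, sigma=0.2, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(size=(hidden_dim, obs_dim)) * 0.1
        self.w_out = rng.normal(size=(action_dim, hidden_dim)) * 0.1
        self.theta, self.sigma = theta, sigma  # OU mean-reversion / scale
        self.noise = np.zeros(hidden_dim)      # persistent latent noise

    def act(self, obs):
        # OU update: the noise decays toward zero but carries over
        # between steps, giving temporally-correlated perturbations.
        self.noise += (-self.theta * self.noise
                       + self.sigma * np.random.randn(len(self.noise)))
        h = np.tanh(self.w_in @ obs) + self.noise  # perturbed latent state
        return np.tanh(self.w_out @ h)             # action in [-1, 1]
```

Calling `NoisyLatentPolicy(obs_dim=4, hidden_dim=32, action_dim=2).act(obs)` at successive steps yields smoothly varying exploratory actions, in contrast to independent per-step action noise.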