SAFARI: Safe and Active Robot Imitation Learning with Imagination
- URL: http://arxiv.org/abs/2011.09586v1
- Date: Wed, 18 Nov 2020 23:43:59 GMT
- Title: SAFARI: Safe and Active Robot Imitation Learning with Imagination
- Authors: Norman Di Palo, Edward Johns
- Abstract summary: SAFARI is a novel active learning and control algorithm.
It allows an agent to request further human demonstrations when out-of-distribution situations are encountered.
We show how this method enables the agent to autonomously predict failure rapidly and safely.
- Score: 16.967930721746676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the main issues in Imitation Learning is the erroneous behavior of an
agent when facing out-of-distribution situations, not covered by the set of
demonstrations given by the expert. In this work, we tackle this problem by
introducing a novel active learning and control algorithm, SAFARI. During
training, it allows an agent to request further human demonstrations when these
out-of-distribution situations are met. At deployment, it combines model-free
acting using behavioural cloning with model-based planning to reduce
state-distribution shift, using future state reconstruction as a test for state
familiarity. We empirically demonstrate how this method increases the
performance on a set of manipulation tasks with respect to passive Imitation
Learning, by gathering more informative demonstrations and by minimizing
state-distribution shift at test time. We also show how this method enables the
agent to autonomously predict failure rapidly and safely.
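The deployment-time behaviour described in the abstract lends itself to a short illustration. Below is a minimal Python sketch, under loose assumptions, of using future state reconstruction error as a familiarity test to arbitrate between model-free behavioural-cloning actions and a model-based planning fallback; every component name and the threshold are hypothetical stand-ins, not SAFARI's actual models.

```python
import numpy as np

# Hypothetical stand-ins for SAFARI's learned components:
#   bc_policy(state)              -> model-free behavioural-cloning action
#   dynamics_model(state, action) -> imagined next state
#   reconstruct(state)            -> autoencoder-style state reconstruction
#   planner(state)                -> model-based planning fallback
RECONSTRUCTION_THRESHOLD = 0.05  # assumed value; tuned on demonstration data


def is_familiar(state, reconstruct):
    """Reconstruction error as a proxy for state familiarity: states far
    from the demonstration distribution tend to reconstruct poorly."""
    error = np.linalg.norm(reconstruct(state) - state)
    return error < RECONSTRUCTION_THRESHOLD


def act(state, bc_policy, dynamics_model, reconstruct, planner):
    """Deployment-time loop: act with behavioural cloning while the
    imagined future state stays familiar; otherwise fall back to planning
    (during training, this is where further demonstrations are requested)."""
    action = bc_policy(state)
    imagined_next = dynamics_model(state, action)
    if is_familiar(imagined_next, reconstruct):
        return action  # in-distribution: trust the cloned policy
    return planner(state)  # out-of-distribution: plan to reduce shift
```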
Related papers
- Diffusion Imitation from Observation [4.205946699819021]
Adversarial imitation learning approaches learn a generator agent policy that produces state transitions a discriminator cannot distinguish from the expert's.
Motivated by the recent success of diffusion models in generative modeling, we propose to integrate a diffusion model into the adversarial imitation learning from observation framework (a minimal sketch of this adversarial setup appears after this list).
arXiv Detail & Related papers (2024-10-07T18:49:55Z)
- Self-Supervised Adversarial Imitation Learning [20.248498544165184]
Behavioural cloning teaches an agent how to behave via expert demonstrations.
Recent approaches use self-supervision over fully-observable, unlabelled snapshots of states to decode state pairs into actions.
Previous work uses goal-aware strategies to address the shortcomings of this scheme.
We address this limitation by incorporating a discriminator into the original framework.
arXiv Detail & Related papers (2023-04-21T12:12:33Z)
- Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
IMitation with PLANning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
- Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment for learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- Feature-Based Interpretable Reinforcement Learning based on State-Transition Models [3.883460584034766]
Growing concerns regarding the operational use of AI models in the real world have caused a surge of interest in explaining AI models' decisions to humans.
We propose a method for offering local explanations on risk in reinforcement learning.
arXiv Detail & Related papers (2021-05-14T23:43:11Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called ΨΦ-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- Learn to Exceed: Stereo Inverse Reinforcement Learning with Concurrent Policy Optimization [1.0965065178451106]
We study the problem of obtaining a control policy that can mimic and then outperform expert demonstrations in Markov decision processes.
One main relevant approach is inverse reinforcement learning (IRL), which focuses on inferring a reward function from expert demonstrations.
We propose a novel method that enables the learning agent to outperform the demonstrator via a new concurrent reward and action policy learning approach.
arXiv Detail & Related papers (2020-09-21T02:16:21Z)
- Interactive Imitation Learning in State-Space [5.672132510411464]
We propose a novel Interactive Learning technique that uses human feedback in state-space to train and improve agent behavior.
Our method, titled Teaching Imitative Policies in State-space (TIPS), enables providing guidance to the agent in terms of changing its state.
arXiv Detail & Related papers (2020-08-02T17:23:54Z)
- State-Only Imitation Learning for Dexterous Manipulation [63.03621861920732]
In this paper, we explore state-only imitation learning.
We train an inverse dynamics model and use it to predict actions for state-only demonstrations (see the second sketch after this list).
Our method performs on par with state-action approaches and considerably outperforms RL alone.
arXiv Detail & Related papers (2020-04-07T17:57:20Z)
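For the "Diffusion Imitation from Observation" entry above, here is a minimal PyTorch sketch of the adversarial learning-from-observation setup it builds on: a discriminator scores state transitions (s, s'), so no expert actions are needed. The architecture, dimensions, and names are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

STATE_DIM = 16  # assumed state dimensionality


class TransitionDiscriminator(nn.Module):
    """Scores (state, next_state) pairs rather than (state, action) pairs,
    which is what makes the setup action-free (learning from observation)."""

    def __init__(self, state_dim=STATE_DIM, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s, s_next):
        # Concatenate the transition and output a logit: high for
        # expert-like transitions, low for the agent's own.
        return self.net(torch.cat([s, s_next], dim=-1))


def discriminator_loss(disc, expert_s, expert_s_next, agent_s, agent_s_next):
    """GAN-style objective: classify expert transitions as real (1) and
    agent transitions as fake (0). The generator policy (or, in the paper
    above, a diffusion model) is then trained to fool this classifier."""
    bce = nn.BCEWithLogitsLoss()
    real = disc(expert_s, expert_s_next)
    fake = disc(agent_s, agent_s_next)
    return bce(real, torch.ones_like(real)) + bce(fake, torch.zeros_like(fake))
```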
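And for the "State-Only Imitation Learning for Dexterous Manipulation" entry, a sketch of the described recipe: train an inverse dynamics model on the agent's own transitions, then use it to label state-only demonstrations with predicted actions so standard behavioural cloning can follow. Shapes and the architecture are assumptions.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 4  # assumed dimensionalities

# Inverse dynamics model: predicts the action that links two consecutive
# states, a = f(s_t, s_{t+1}); trained on the agent's own interaction data.
inverse_dynamics = nn.Sequential(
    nn.Linear(2 * STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, ACTION_DIM),
)


def label_demonstration(states):
    """Given a state-only demonstration (tensor of shape [T, STATE_DIM]),
    predict an action for each consecutive state pair, yielding
    [T-1, ACTION_DIM] pseudo-labels for behavioural cloning."""
    pairs = torch.cat([states[:-1], states[1:]], dim=-1)
    with torch.no_grad():
        return inverse_dynamics(pairs)
```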
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.