Robust Maximum Entropy Behavior Cloning
- URL: http://arxiv.org/abs/2101.01251v1
- Date: Mon, 4 Jan 2021 22:08:46 GMT
- Title: Robust Maximum Entropy Behavior Cloning
- Authors: Mostafa Hussein, Brendan Crowe, Marek Petrik and Momotaz Begum
- Abstract summary: Imitation learning (IL) algorithms use expert demonstrations to learn a specific task.
Most of the existing approaches assume that all expert demonstrations are reliable and trustworthy, but what if there exist some adversarial demonstrations among the given data set?
We propose a novel general framework to directly generate a policy from demonstrations that autonomously detects the adversarial demonstrations and excludes them from the data set.
- Score: 15.713997170792842
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning (IL) algorithms use expert demonstrations to learn a
specific task. Most of the existing approaches assume that all expert
demonstrations are reliable and trustworthy, but what if there exist some
adversarial demonstrations among the given data set? This may result in poor
decision-making performance. We propose a novel general framework to directly
generate a policy from demonstrations that autonomously detects the adversarial
demonstrations and excludes them from the data set. At the same time, it is
sample- and time-efficient and does not require a simulator. To model such
adversarial demonstrations, we propose a min-max problem that leverages the
entropy of the model to assign weights for each demonstration. This allows us
to learn the behavior using only the correct demonstrations or a mixture of
correct demonstrations.
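The idea of assigning a weight to each demonstration can be illustrated with a small sketch. This is not the paper's min-max formulation: it swaps in a simpler likelihood-based reweighting heuristic, uses a tabular policy, and assumes that most demonstrations are correct. All names here (`fit_policy`, `reweight`, `robust_bc`, `temperature`, `smoothing`) are hypothetical and chosen only for illustration.

```python
import math

def fit_policy(demos, weights, n_actions, smoothing=1e-2):
    """Weighted maximum-likelihood tabular policy built from weighted action counts."""
    counts = {}
    for demo, w in zip(demos, weights):
        for state, action in demo:
            row = counts.setdefault(state, [smoothing] * n_actions)
            row[action] += w
    return {s: [c / sum(row) for c in row] for s, row in counts.items()}

def avg_log_likelihood(demo, policy):
    """Average log-probability the policy assigns to one demonstration."""
    return sum(math.log(policy[s][a]) for s, a in demo) / len(demo)

def reweight(demos, policy, temperature):
    """Softmax over per-demonstration likelihoods (assumed heuristic, not the paper's rule)."""
    scores = [avg_log_likelihood(d, policy) / temperature for d in demos]
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

def robust_bc(demos, n_actions, iterations=20, temperature=0.5):
    """Alternate between fitting the policy and reweighting the demonstrations."""
    weights = [1.0 / len(demos)] * len(demos)
    for _ in range(iterations):
        policy = fit_policy(demos, weights, n_actions)
        weights = reweight(demos, policy, temperature)
    return fit_policy(demos, weights, n_actions), weights

# Toy data: three consistent demonstrations and one that contradicts them.
expert = [(0, 1), (1, 0), (2, 1)]
adversarial = [(0, 0), (1, 1), (2, 0)]
demos = [expert, expert, expert, adversarial]
policy, weights = robust_bc(demos, n_actions=2)
```

On this toy data the weight of the contradicting demonstration shrinks toward zero, so the final policy follows the three consistent ones. The heuristic fails if adversarial demonstrations form the majority, which is one reason a principled formulation such as the min-max problem described in the abstract is needed.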
Related papers
- Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning [48.595574101874575]
In the real world, expert demonstrations are more likely to be imperfect.
A positive-unlabeled adversarial imitation learning algorithm is developed.
Agent policy will be optimized to cheat the discriminator and produce trajectories similar to those optimal expert demonstrations.
arXiv Detail & Related papers (2023-02-13T11:26:44Z)
- Program Generation from Diverse Video Demonstrations [49.202289347899836]
Generalising over multiple observations is a task that has historically presented difficulties for machines to grasp.
We propose a model that can extract general rules from video demonstrations by simultaneously performing summarisation and translation.
arXiv Detail & Related papers (2023-02-01T01:51:45Z)
- Out-of-Dynamics Imitation Learning from Multimodal Demonstrations [68.46458026983409]
We study out-of-dynamics imitation learning (OOD-IL), which relaxes the assumption that the demonstrator and the imitator have the same state spaces.
OOD-IL enables imitation learning to utilize demonstrations from a wide range of demonstrators but introduces a new challenge.
We develop a better transferability measurement to tackle this newly-emerged challenge.
arXiv Detail & Related papers (2022-11-13T07:45:06Z)
- Robustness of Demonstration-based Learning Under Limited Data Scenario [54.912936555876826]
Demonstration-based learning has shown great potential in stimulating pretrained language models' ability in limited-data scenarios.
Why such demonstrations are beneficial for the learning process remains unclear since there is no explicit alignment between the demonstrations and the predictions.
In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive into the robustness of demonstration-based sequence labeling.
arXiv Detail & Related papers (2022-10-19T16:15:04Z)
- Extraneousness-Aware Imitation Learning [25.60384350984274]
Extraneousness-Aware Learning (EIL) learns visuomotor policies from third-person demonstrations with extraneous subsequences.
EIL learns action-conditioned observation embeddings in a self-supervised manner and retrieves task-relevant observations across visual demonstrations.
Experimental results show that EIL outperforms strong baselines and achieves policies comparable to those trained with perfect demonstrations.
arXiv Detail & Related papers (2022-10-04T04:42:26Z)
- Evaluating the Effectiveness of Corrective Demonstrations and a Low-Cost Sensor for Dexterous Manipulation [0.5669790037378094]
Imitation learning is a promising approach to help robots acquire dexterous manipulation capabilities.
We investigate characteristics of such additional demonstrations and their impact on performance.
We show that inexpensive vision-based sensors, such as LeapMotion, can be used to dramatically reduce the cost of providing demonstrations.
arXiv Detail & Related papers (2022-04-15T19:55:46Z)
- Contrastive Demonstration Tuning for Pre-trained Language Models [59.90340768724675]
Demonstration examples are crucial for an excellent final performance of prompt-tuning.
The proposed approach can be (i) plugged into any previous prompt-tuning approach and (ii) extended to widespread classification tasks with a large number of categories.
Experimental results on 16 datasets illustrate that our method integrated with previous approaches LM-BFF and P-tuning can yield better performance.
arXiv Detail & Related papers (2022-04-09T05:30:48Z)
- Learning from Imperfect Demonstrations from Agents with Varying Dynamics [29.94164262533282]
We develop a metric composed of a feasibility score and an optimality score to measure how useful a demonstration is for imitation learning.
Our experiments on four environments in simulation and on a real robot show improved learned policies with higher expected return.
arXiv Detail & Related papers (2021-03-10T07:39:38Z)
- Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
- Reinforcement Learning with Supervision from Noisy Demonstrations [38.00968774243178]
We propose a novel framework to adaptively learn the policy by jointly interacting with the environment and exploiting the expert demonstrations.
Experimental results in various environments with multiple popular reinforcement learning algorithms show that the proposed approach can learn robustly with noisy demonstrations.
arXiv Detail & Related papers (2020-06-14T06:03:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.