Feedback in Imitation Learning: Confusion on Causality and Covariate Shift
- URL: http://arxiv.org/abs/2102.02872v1
- Date: Thu, 4 Feb 2021 20:18:56 GMT
- Title: Feedback in Imitation Learning: Confusion on Causality and Covariate Shift
- Authors: Jonathan Spencer, Sanjiban Choudhury, Arun Venkatraman, Brian Ziebart, J. Andrew Bagnell
- Abstract summary: We argue that conditioning policies on previous actions leads to a dramatic divergence between "held out" error and performance of the learner in situ.
We analyze existing benchmarks used to test imitation learning approaches.
We find, in surprising contrast to previous literature, that naive behavioral cloning provides excellent results.
- Score: 12.93527098342393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning practitioners have often noted that conditioning policies
on previous actions leads to a dramatic divergence between "held out" error and
performance of the learner in situ. Interactive approaches can provably address
this divergence but require repeated querying of a demonstrator. Recent work
identifies this divergence as stemming from a "causal confound" in predicting
the current action, and seeks to ablate causal aspects of the current state using
tools from causal inference. In this work, we argue instead that this
divergence is simply another manifestation of covariate shift, exacerbated
particularly by settings of feedback between decisions and input features. The
learner often comes to rely on features that are strongly predictive of
decisions, but are subject to strong covariate shift.
Our work demonstrates a broad class of problems where this shift can be
mitigated, both theoretically and practically, by taking advantage of a
simulator but without any further querying of expert demonstration. We analyze
existing benchmarks used to test imitation learning approaches and find that
these benchmarks are realizable and simple, and thus insufficient for capturing
the harder regimes of error compounding seen in real-world decision making
problems. We find, in surprising contrast to previous literature but
consistent with our theory, that naive behavioral cloning provides excellent
results. We detail the need for new standardized benchmarks that capture the
phenomena seen in robotics problems.
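The feedback mechanism described above is easy to reproduce in miniature. The sketch below (all dynamics, gains, and noise levels are invented for illustration) has an expert stabilize a scalar system while the learner records noisy observations together with the expert's previous action. Least-squares behavioral cloning latches onto the previous action because it is the most predictive feature, so held-out error improves, yet the policy stabilizes worse once its own actions feed back into the state:

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_episodes(n_eps=200, T=40, obs_noise=0.1):
    """Expert stabilizes s toward 0; log (noisy observation, previous action) -> expert action."""
    X, y = [], []
    for _ in range(n_eps):
        s, a_prev = rng.normal(0.0, 1.0), 0.0
        for _ in range(T):
            a = -0.1 * s                              # expert reads the true state
            X.append([s + obs_noise * rng.standard_normal(), a_prev])
            y.append(a)
            s = s + a + 0.02 * rng.standard_normal()  # feedback: the action shapes the next state
            a_prev = a
    return np.asarray(X), np.asarray(y)

X, y = expert_episodes()
split = len(y) // 2
w_fb, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)       # BC on [obs, prev action]
w_obs, *_ = np.linalg.lstsq(X[:split, :1], y[:split], rcond=None)  # BC on obs alone

mse = lambda w, F: np.mean((F @ w - y[split:]) ** 2)
print("held-out MSE [obs, prev_a]:", mse(w_fb, X[split:]))      # lower: prev action predicts well
print("held-out MSE [obs only]   :", mse(w_obs, X[split:, :1]))

def in_situ(policy, T=200, obs_noise=0.1):
    """Run a learned policy in closed loop; mean |s| measures stabilization quality."""
    s, a_prev, total = 1.5, 0.0, 0.0
    for _ in range(T):
        a = policy(s + obs_noise * rng.standard_normal(), a_prev)
        s = s + a + 0.02 * rng.standard_normal()
        a_prev = a
        total += abs(s)
    return total / T

print("in-situ mean |s| [obs, prev_a]:", in_situ(lambda o, ap: w_fb @ np.array([o, ap])))
print("in-situ mean |s| [obs only]   :", in_situ(lambda o, ap: w_obs @ np.array([o])))
```

In this toy setting no causal machinery is needed to diagnose the problem: dropping the previous-action feature, or evaluating the policy in closed loop via a simulator as the paper suggests, exposes the gap between held-out error and in-situ performance.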
Related papers
- Towards Non-Adversarial Algorithmic Recourse [20.819764720587646]
It has been argued that what distinguishes adversarial examples from counterfactual explanations is that they lead to a misclassification relative to the ground truth.
We introduce non-adversarial algorithmic recourse and outline why in high-stakes situations, it is imperative to obtain counterfactual explanations that do not exhibit adversarial characteristics.
arXiv Detail & Related papers (2024-03-15T14:18:21Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
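For reference, if the unhinged loss here matches the usual definition (van Rooyen et al., 2015), it is simply a linear function of the margin, which is what makes closed-form analysis tractable; a minimal sketch:

```python
import numpy as np

def unhinged_loss(y, score):
    """Unhinged (linear) loss: ell(y, f(x)) = 1 - y * f(x).
    Unbounded and linear in the score, hence 'unhinged'; its gradient is
    constant in the margin, which simplifies training dynamics."""
    return 1.0 - y * score

print(unhinged_loss(np.array([+1, -1]), np.array([0.7, 0.2])))  # -> [0.3 1.2]
```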
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [62.48505112245388]
We take an in-depth look at the causal awareness of modern representations of agent interactions.
We show that recent representations are already partially resilient to perturbations of non-causal agents.
We propose a metric learning approach that regularizes latent representations with causal annotations.
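A hypothetical sketch of such a regularizer (the loss form and margin are illustrative, not the paper's exact objective): latent embeddings should move when a causally relevant agent is perturbed and stay put when a non-causal one is.

```python
import torch
import torch.nn.functional as F

def causal_metric_loss(z, z_perturbed, causal_effect, margin=1.0):
    """z, z_perturbed: (batch, dim) scene embeddings before/after perturbing one agent.
    causal_effect: (batch,) annotation, 1.0 if that agent causally affects the ego agent."""
    d = F.pairwise_distance(z, z_perturbed)
    pull = (1.0 - causal_effect) * d.pow(2)           # non-causal perturbation: stay close
    push = causal_effect * F.relu(margin - d).pow(2)  # causal perturbation: move at least `margin`
    return (pull + push).mean()

# toy usage with random embeddings and annotations
z = torch.randn(8, 32)
z_perturbed = z + 0.1 * torch.randn(8, 32)
annotations = torch.randint(0, 2, (8,)).float()
print(causal_metric_loss(z, z_perturbed, annotations))
```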
arXiv Detail & Related papers (2023-12-07T18:57:03Z)
- UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations [62.71847873326847]
We investigate the ability to model unusual, unexpected, and unlikely situations.
Given a piece of context with an unexpected outcome, this task requires reasoning abductively to generate an explanation.
We release a new English language corpus called UNcommonsense.
arXiv Detail & Related papers (2023-11-14T19:00:55Z)
- On Continuity of Robust and Accurate Classifiers [3.8673630752805437]
It has been shown that adversarial training can improve the robustness of the hypothesis.
It has been suggested that robustness and accuracy of a hypothesis are at odds with each other.
In this paper, we put forth the alternative proposal that it is the continuity of a hypothesis that is incompatible with its robustness and accuracy.
arXiv Detail & Related papers (2023-09-29T08:14:25Z)
- Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning [98.78136504619539]
Causal Triplet is a causal representation learning benchmark featuring visually more complex scenes.
We show that models built with the knowledge of disentangled or object-centric representations significantly outperform their distributed counterparts.
arXiv Detail & Related papers (2023-01-12T17:43:38Z)
- Deconfounding Imitation Learning with Variational Inference [19.99248795957195]
Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent.
This is because partial observability gives rise to hidden confounders in the causal graph.
We propose to train a variational inference model to infer the expert's latent information and use it to train a latent-conditional policy.
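A minimal sketch of this recipe, assuming a VAE-style encoder over demonstrations (all module names, sizes, and the GRU choice are invented for illustration):

```python
import torch
import torch.nn as nn

class LatentConditionalPolicy(nn.Module):
    """Encode a demonstration into a latent z that stands in for the expert's
    unobserved (confounding) information; condition the policy on (obs, z)."""
    def __init__(self, obs_dim=8, act_dim=2, latent_dim=4):
        super().__init__()
        self.encoder = nn.GRU(obs_dim + act_dim, 32, batch_first=True)
        self.to_mu = nn.Linear(32, latent_dim)
        self.to_logvar = nn.Linear(32, latent_dim)
        self.policy = nn.Sequential(nn.Linear(obs_dim + latent_dim, 64), nn.Tanh(),
                                    nn.Linear(64, act_dim))

    def forward(self, demo, obs):
        _, h = self.encoder(demo)                                # summarize the demonstration
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterized sample
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return self.policy(torch.cat([obs, z], dim=-1)), kl

model = LatentConditionalPolicy()
demo = torch.randn(16, 20, 10)          # batch of demos: 20 steps of concatenated (obs, act)
obs = torch.randn(16, 8)
expert_act = torch.randn(16, 2)
action, kl = model(demo, obs)
loss = ((action - expert_act) ** 2).mean() + 1e-3 * kl   # BC term plus a KL regularizer
```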
arXiv Detail & Related papers (2022-11-04T18:00:02Z)
- Covariate Shift in High-Dimensional Random Feature Regression [44.13449065077103]
Covariate shift is a significant obstacle in the development of robust machine learning models.
We present a theoretical understanding in the context of modern machine learning.
arXiv Detail & Related papers (2021-11-16T05:23:28Z)
- Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The sensitivity of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
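A sketch of what Langevin-style perturbation sampling inside adversarial training can look like (the update rule and step sizes are illustrative, not the paper's exact algorithm): like PGD, but each ascent step adds Gaussian noise, so perturbations are sampled from a distribution rather than driven to a single worst case.

```python
import torch

def langevin_perturb(model, x, y, loss_fn, eps=0.1, step=0.01, noise=0.01, n_steps=10):
    """Sample an adversarial perturbation with noisy gradient ascent."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(n_steps):
        loss = loss_fn(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step * grad.sign() + noise * torch.randn_like(delta)
            delta.clamp_(-eps, eps)                   # stay inside the threat model
    return delta.detach()

# usage inside a training loop (model, x, y assumed):
#   delta = langevin_perturb(model, x, y, torch.nn.functional.cross_entropy)
#   loss = torch.nn.functional.cross_entropy(model(x + delta), y)
```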
arXiv Detail & Related papers (2021-10-29T13:30:42Z)
- Fighting Copycat Agents in Behavioral Cloning from Observation Histories [85.404120663644]
Imitation learning trains policies to map from input observations to the actions that an expert would choose.
We propose an adversarial approach to learn a feature representation that removes excess information about the nuisance correlate, the previous expert action.
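One common instantiation of such an adversarial feature remover is a gradient-reversal layer: an adversary head is trained to recover the previous action from the features while the encoder receives the negated gradient and unlearns it. The sketch below is a generic version, not necessarily the authors' exact architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, g):
        return -g

obs_hist_dim, act_dim, feat_dim = 16, 2, 32
encoder = nn.Sequential(nn.Linear(obs_hist_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
policy_head = nn.Linear(feat_dim, act_dim)   # predicts the current expert action
adversary = nn.Linear(feat_dim, act_dim)     # tries to recover the previous action

obs_hist = torch.randn(32, obs_hist_dim)
a_now, a_prev = torch.randn(32, act_dim), torch.randn(32, act_dim)

z = encoder(obs_hist)
bc_loss = ((policy_head(z) - a_now) ** 2).mean()
adv_loss = ((adversary(GradReverse.apply(z)) - a_prev) ** 2).mean()
(bc_loss + adv_loss).backward()   # adversary improves; encoder is pushed to discard a_prev
```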
arXiv Detail & Related papers (2020-10-28T10:52:10Z)