Robust Imitation Learning from Corrupted Demonstrations
- URL: http://arxiv.org/abs/2201.12594v1
- Date: Sat, 29 Jan 2022 14:21:28 GMT
- Title: Robust Imitation Learning from Corrupted Demonstrations
- Authors: Liu Liu, Ziyang Tang, Lanqing Li, Dijun Luo
- Abstract summary: We consider offline Imitation Learning from corrupted demonstrations where a constant fraction of data can be noise or even arbitrary outliers.
We propose a novel robust algorithm that minimizes a Median-of-Means (MOM) objective, which guarantees accurate estimation of the policy.
Our experiments on continuous-control benchmarks validate that our method exhibits the predicted robustness and effectiveness.
- Score: 15.872598211059403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider offline Imitation Learning from corrupted demonstrations where a
constant fraction of data can be noise or even arbitrary outliers. Classical
approaches such as Behavior Cloning assume that demonstrations are collected
by a presumably optimal expert, and hence may fail drastically when learning from
corrupted demonstrations. We propose a novel robust algorithm by minimizing a
Median-of-Means (MOM) objective, which guarantees accurate estimation of the
policy even in the presence of a constant fraction of outliers. Our theoretical
analysis shows that our robust method in the corrupted setting enjoys nearly
the same error scaling and sample complexity guarantees as the classical
Behavior Cloning in the expert demonstration setting. Our experiments on
continuous-control benchmarks validate that our method exhibits the predicted
robustness and effectiveness, and achieves competitive results compared to
existing imitation learning methods.
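A minimal sketch can make the Median-of-Means idea concrete: shuffle each demonstration batch, split it into k blocks, compute the average imitation loss within each block, and take a gradient step on the median of the block averages, so that a constant fraction of corrupted samples can contaminate only a minority of blocks. The PyTorch snippet below is an illustrative sketch, not the authors' implementation; the block count k, the squared-error imitation loss, and the name mom_bc_loss are assumptions.

```python
import torch

def mom_bc_loss(policy, states, actions, k=10):
    """Median-of-Means behavior-cloning loss (illustrative sketch).

    Shuffles the batch, splits it into k blocks, computes the mean
    squared imitation error per block, and returns the median block
    loss. Outliers confined to a minority of blocks cannot move the
    median, which keeps the loss estimate close to the clean one.
    """
    n = states.shape[0]
    perm = torch.randperm(n)
    block_losses = []
    for idx in perm.chunk(k):
        pred = policy(states[idx])  # predicted actions for this block
        block_losses.append(((pred - actions[idx]) ** 2).mean())
    # The median selects one block; gradients flow through that block's loss.
    return torch.stack(block_losses).median()
```

In a training loop one would repeatedly sample a batch of (state, action) pairs from the corrupted demonstrations and step an optimizer on this loss; setting k=1 recovers ordinary Behavior Cloning with a mean-squared-error objective.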
Related papers
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM)
Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain.
We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z)
- Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
Imitation with Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
- Consistency Training with Virtual Adversarial Discrete Perturbation [17.311821099484987]
We propose an effective consistency training framework that encourages a training model's predictions on original and perturbed inputs to be similar.
This virtual adversarial discrete noise, obtained by replacing a small portion of tokens, efficiently pushes the training model's decision boundary.
arXiv Detail & Related papers (2021-04-15T07:49:43Z)
- Robust Imitation Learning from Noisy Demonstrations [81.67837507534001]
We show that robust imitation learning can be achieved by optimizing a classification risk with a symmetric loss.
We propose a new imitation learning method that effectively combines pseudo-labeling with co-training.
Experimental results on continuous-control benchmarks show that our method is more robust compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-20T10:41:37Z)
- Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution.
We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z)