How To Guide Your Learner: Imitation Learning with Active Adaptive
Expert Involvement
- URL: http://arxiv.org/abs/2303.02073v1
- Date: Fri, 3 Mar 2023 16:44:33 GMT
- Title: How To Guide Your Learner: Imitation Learning with Active Adaptive
Expert Involvement
- Authors: Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng
Chen, Zongzhang Zhang, Yang Yu
- Abstract summary: We propose a novel active imitation learning framework based on a teacher-student interaction model.
We show that AdapMen can improve the error bound and avoid compounding error under mild conditions.
- Score: 20.91491585498749
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning aims to mimic the behavior of experts without explicit
reward signals. Passive imitation learning methods which use static expert
datasets typically suffer from compounding error, low sample efficiency, and
high hyper-parameter sensitivity. In contrast, active imitation learning
methods solicit expert interventions to address these limitations. However,
recent active imitation learning methods are designed from human intuition
or empirical experience, without theoretical guarantees. In this paper, we
propose a novel active imitation learning framework based on a teacher-student
interaction model, in which the teacher's goal is to identify the best teaching
behavior and actively affect the student's learning process. By solving the
optimization objective of this framework, we derive a practical
implementation, named AdapMen. Theoretical analysis shows that AdapMen
improves the error bound and avoids compounding error under mild conditions.
Experiments on the MetaDrive benchmark and Atari 2600 games validate our
theoretical analysis and show that our method achieves near-expert performance
with much less expert involvement and total sampling steps than previous
methods. The code is available at https://github.com/liuxhym/AdapMen.
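The teacher-student interaction the abstract describes can be illustrated as a generic intervention loop: the student acts, the teacher steps in when the student strays, and the student retrains on the aggregated expert labels. The sketch below is an assumption-laden toy, not the AdapMen algorithm: it uses a simple action-discrepancy intervention criterion (the paper derives its own criterion), and the toy 1-D task, `expert_policy`, `LinearStudent`, and the threshold are all illustrative inventions.

```python
# Hedged sketch of active imitation learning with expert intervention.
# NOT the AdapMen algorithm; it only illustrates the generic loop
# (student acts -> teacher intervenes on large deviation -> retrain).
import random

random.seed(0)

def expert_policy(state):
    # Toy expert: drive the state toward 0 with a proportional controller.
    return -0.5 * state

class LinearStudent:
    """Student policy a = k * s, fit by least squares on (state, action) pairs."""
    def __init__(self):
        self.k = 0.0    # initial gain: the untrained student does nothing
        self.data = []  # aggregated (state, expert_action) dataset

    def act(self, state):
        return self.k * state

    def fit(self):
        # Closed-form least squares for the 1-D model a = k * s.
        num = sum(s * a for s, a in self.data)
        den = sum(s * s for s, _ in self.data)
        if den > 0:
            self.k = num / den

def rollout(student, threshold=0.2, horizon=20):
    """One episode; the teacher takes over whenever the student's action
    deviates from the expert's by more than `threshold`."""
    state = random.uniform(-1.0, 1.0)
    interventions = 0
    for _ in range(horizon):
        a_student = student.act(state)
        a_expert = expert_policy(state)
        if abs(a_student - a_expert) > threshold:
            action = a_expert        # teacher intervenes
            interventions += 1
        else:
            action = a_student       # student keeps control
        student.data.append((state, a_expert))  # always log the expert label
        state = state + action
    return interventions

student = LinearStudent()
counts = []
for episode in range(10):
    counts.append(rollout(student))
    student.fit()  # retrain on the aggregated expert-labeled data
print(counts)      # intervention counts should shrink as the student learns
```

Because the expert labels are logged on the states the student actually visits, the dataset covers the student's own state distribution, which is the intuition behind how active methods avoid compounding error.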
Related papers
- Unlearning with Control: Assessing Real-world Utility for Large Language Model Unlearning [97.2995389188179]
Recent research has begun to approach large language models (LLMs) unlearning via gradient ascent (GA)
Despite their simplicity and efficiency, we find that GA-based methods are prone to excessive unlearning.
We propose several controlling methods that can regulate the extent of excessive unlearning.
arXiv Detail & Related papers (2024-06-13T14:41:00Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Sample-efficient Adversarial Imitation Learning [45.400080101596956]
We propose a self-supervised representation-based adversarial imitation learning method to learn state and action representations.
We show a 39% relative improvement over existing adversarial imitation learning methods on MuJoCo in a setting limited to 100 expert state-action pairs.
arXiv Detail & Related papers (2023-03-14T12:36:01Z) - Causal Imitation Learning with Unobserved Confounders [82.22545916247269]
We study imitation learning when sensory inputs of the learner and the expert differ.
We show that imitation could still be feasible by exploiting quantitative knowledge of the expert trajectories.
arXiv Detail & Related papers (2022-08-12T13:29:53Z) - Imitating, Fast and Slow: Robust learning from demonstrations via
decision-time planning [96.72185761508668]
IMitation with PLANning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z) - TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z) - RLTutor: Reinforcement Learning Based Adaptive Tutoring System by
Modeling Virtual Student with Fewer Interactions [10.34673089426247]
We propose a framework for optimizing teaching strategies by constructing a virtual model of the student.
Our results can serve as a buffer between theoretical instructional optimization and practical applications in e-learning systems.
arXiv Detail & Related papers (2021-07-31T15:42:03Z) - Robust Imitation Learning from Noisy Demonstrations [81.67837507534001]
We show that robust imitation learning can be achieved by optimizing a classification risk with a symmetric loss.
We propose a new imitation learning method that effectively combines pseudo-labeling with co-training.
Experimental results on continuous-control benchmarks show that our method is more robust compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-20T10:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.