InterPReT: Interactive Policy Restructuring and Training Enable Effective Imitation Learning from Laypersons
- URL: http://arxiv.org/abs/2602.04213v1
- Date: Wed, 04 Feb 2026 04:55:10 GMT
- Title: InterPReT: Interactive Policy Restructuring and Training Enable Effective Imitation Learning from Laypersons
- Authors: Feiyu Gavin Zhu, Jean Oh, Reid Simmons
- Abstract summary: We propose Interactive Policy Restructuring and Training (InterPReT), which takes user instructions to continually update the policy structure. This enables end-users to interactively give instructions and demonstrations, monitor the agent's performance, and review the agent's decision-making strategies.
- Score: 10.214431946148162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning has shown success in many tasks by learning from expert demonstrations. However, most existing work relies on large-scale demonstrations from technical professionals and close monitoring of the training process. These are challenging for a layperson when they want to teach the agent new skills. To lower the barrier of teaching AI agents, we propose Interactive Policy Restructuring and Training (InterPReT), which takes user instructions to continually update the policy structure and optimize its parameters to fit user demonstrations. This enables end-users to interactively give instructions and demonstrations, monitor the agent's performance, and review the agent's decision-making strategies. A user study (N=34) on teaching an AI agent to drive in a racing game confirms that our approach yields more robust policies without impairing system usability, compared to a generic imitation learning baseline, when a layperson is responsible for both giving demonstrations and determining when to stop. This shows that our method is more suitable for end-users without much technical background in machine learning to train a dependable policy.
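The abstract describes two alternating steps: restructuring the policy from user instructions, then fitting its parameters to demonstrations. The sketch below illustrates only that general idea and is not the authors' implementation. It assumes a hypothetical linear steering policy whose "structure" is just a list of input features, with feature names (`lane_offset`, `curvature`), synthetic demonstrations, and the gradient-descent fit all invented for illustration.

```python
import random

def predict(weights, state):
    # Linear policy: the action is a weighted sum of the features in the structure.
    return sum(w * state[name] for name, w in weights.items())

def fit(feature_names, demos, lr=0.1, epochs=500):
    # Behavior cloning: minimize mean squared error between the policy's
    # action and the demonstrated action with plain gradient descent.
    weights = {name: 0.0 for name in feature_names}
    n = len(demos)
    for _ in range(epochs):
        grads = {name: 0.0 for name in feature_names}
        for state, action in demos:
            err = predict(weights, state) - action
            for name in feature_names:
                grads[name] += 2.0 * err * state[name] / n
        for name in feature_names:
            weights[name] -= lr * grads[name]
    return weights

def mse(weights, demos):
    return sum((predict(weights, s) - a) ** 2 for s, a in demos) / len(demos)

# Synthetic demonstrations (assumption): the demonstrator steers against the
# lane offset and into the upcoming curve.
rng = random.Random(0)
demos = []
for _ in range(200):
    state = {"lane_offset": rng.uniform(-1, 1), "curvature": rng.uniform(-1, 1)}
    action = -0.8 * state["lane_offset"] + 0.5 * state["curvature"]
    demos.append((state, action))

# Round 1: a hypothetical instruction such as "stay in the middle of the road"
# maps to a structure that uses only the lane offset.
w1 = fit(["lane_offset"], demos)

# Round 2: a follow-up instruction such as "turn earlier before curves"
# restructures the policy to also use curvature; parameters are then refit.
w2 = fit(["lane_offset", "curvature"], demos)

print("error with initial structure:", mse(w1, demos))
print("error after restructuring:  ", mse(w2, demos))
```

In this toy setting the restructured policy can represent the demonstrator exactly, so its fit error drops to near zero, while the initial structure cannot explain the curvature-dependent part of the steering. How InterPReT actually maps instructions to policy structure is not specified in the abstract.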
Related papers
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - A Survey of Demonstration Learning [0.0]
Demonstration Learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations.
It is gaining significant traction due to having tremendous potential for learning complex behaviors from demonstrations.
Due to learning without interacting with the environment, demonstration learning would allow the automation of a wide range of real world applications such as robotics and healthcare.
arXiv Detail & Related papers (2023-03-20T15:22:10Z) - Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble [6.861783783234304]
We propose a learning from demonstrations method named A-SILfD.
A-SILfD treats expert demonstrations as the agent's successful experiences and uses experiences to constrain policy improvement.
In four MuJoCo continuous control tasks, A-SILfD can significantly outperform baseline methods after 150,000 steps of online training.
arXiv Detail & Related papers (2022-12-07T10:29:13Z) - PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi\Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z) - Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
arXiv Detail & Related papers (2020-09-21T12:04:18Z) - Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z) - Interactive Imitation Learning in State-Space [5.672132510411464]
We propose a novel Interactive Learning technique that uses human feedback in state-space to train and improve agent behavior.
Our method, titled Teaching Imitative Policies in State-space (TIPS), enables providing guidance to the agent in terms of changing its state.
arXiv Detail & Related papers (2020-08-02T17:23:54Z) - Constrained-Space Optimization and Reinforcement Learning for Complex Tasks [42.648636742651185]
Learning from Demonstration is increasingly used for transferring operator manipulation skills to robots.
This paper presents a constrained-space optimization and reinforcement learning scheme for managing complex tasks.
arXiv Detail & Related papers (2020-04-01T21:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.