Interaction-limited Inverse Reinforcement Learning
- URL: http://arxiv.org/abs/2007.00425v1
- Date: Wed, 1 Jul 2020 12:31:52 GMT
- Title: Interaction-limited Inverse Reinforcement Learning
- Authors: Martin Troussard, Emmanuel Pignat, Parameswaran Kamalaruban, Sylvain
Calinon, Volkan Cevher
- Abstract summary: We present two different training strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL) focusing on the learner's perspective.
Using experiments in simulations and experiments with a real robot learning a task from a human demonstrator, we show that our training strategies can allow a faster training than a random teacher for CIRL and than a batch learner for SPIRL.
- Score: 50.201765937436654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes an inverse reinforcement learning (IRL) framework to
accelerate learning when the learner-teacher \textit{interaction} is
\textit{limited} during training. Our setting is motivated by the realistic
scenarios where a helpful teacher is not available or when the teacher cannot
access the learning dynamics of the student. We present two different training
strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the
teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL)
focusing on the learner's perspective. Using experiments in simulations and
experiments with a real robot learning a task from a human demonstrator, we
show that our training strategies can allow a faster training than a random
teacher for CIRL and than a batch learner for SPIRL.
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose spire, a system that decomposes tasks into smaller learning subproblems and second combines imitation and reinforcement learning to maximize their strengths.
We find that spire outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z) - RILe: Reinforced Imitation Learning [60.63173816209543]
RILe is a framework that combines the strengths of imitation learning and inverse reinforcement learning to learn a dense reward function efficiently.
Our framework produces high-performing policies in high-dimensional tasks where direct imitation fails to replicate complex behaviors.
arXiv Detail & Related papers (2024-06-12T17:56:31Z) - Learn to Teach: Improve Sample Efficiency in Teacher-student Learning
for Sim-to-Real Transfer [5.731477362725785]
We propose a sample efficient learning framework termed Learn to Teach (L2T) that recycles experience collected by the teacher agent.
We show that a single-loop algorithm can train both the teacher and student agents under both Reinforcement Learning and Inverse Reinforcement Learning contexts.
arXiv Detail & Related papers (2024-02-09T21:16:43Z) - YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves SFT with significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z) - Teacher-student curriculum learning for reinforcement learning [1.7259824817932292]
Reinforcement learning (rl) is a popular paradigm for sequential decision making problems.
The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying rl to real-world problems.
We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task.
arXiv Detail & Related papers (2022-10-31T14:45:39Z) - Reinforcement Teaching [40.231724440690776]
We develop a unifying meta-learning framework, called Reinforcement Teaching, to improve the learning process of machine learning algorithms.
Under Reinforcement Teaching, a teaching policy is learned, through reinforcement, to improve a student's learning algorithm.
To demonstrate the generality of Reinforcement Teaching, we conduct experiments in which a teacher learns to significantly improve both reinforcement and supervised learning algorithms.
arXiv Detail & Related papers (2022-04-25T18:04:17Z) - Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient optimization based teacher-aware learner who can incorporate teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z) - Learning by Teaching, with Application to Neural Architecture Search [10.426533624387305]
We propose a novel ML framework referred to as learning by teaching (LBT)
In LBT, a teacher model improves itself by teaching a student model to learn well.
Based on how the student performs on a validation dataset, the teacher re-learns its model and re-teaches the student until the student achieves great validation performance.
arXiv Detail & Related papers (2021-03-11T23:50:38Z) - Emergent Real-World Robotic Skills via Unsupervised Off-Policy
Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.