Interaction-limited Inverse Reinforcement Learning
- URL: http://arxiv.org/abs/2007.00425v1
- Date: Wed, 1 Jul 2020 12:31:52 GMT
- Title: Interaction-limited Inverse Reinforcement Learning
- Authors: Martin Troussard, Emmanuel Pignat, Parameswaran Kamalaruban, Sylvain Calinon, Volkan Cevher
- Abstract summary: We present two different training strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL) focusing on the learner's perspective.
Using experiments in simulation and with a real robot learning a task from a human demonstrator, we show that our training strategies enable faster training than a random teacher for CIRL and than a batch learner for SPIRL.
- Score: 50.201765937436654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes an inverse reinforcement learning (IRL) framework to
accelerate learning when the learner-teacher interaction is limited during
training. Our setting is motivated by realistic scenarios in which a helpful
teacher is not available or the teacher cannot access the learning dynamics of
the student. We present two different training strategies: Curriculum Inverse
Reinforcement Learning (CIRL), covering the teacher's perspective, and
Self-Paced Inverse Reinforcement Learning (SPIRL), focusing on the learner's
perspective. Using experiments in simulation and with a real robot learning a
task from a human demonstrator, we show that our training strategies enable
faster training than a random teacher for CIRL and than a batch learner for
SPIRL.
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that decomposes tasks into smaller learning subproblems and combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z)
- Learn to Teach: Improve Sample Efficiency in Teacher-student Learning for Sim-to-Real Transfer [5.731477362725785]
We propose a sample efficient learning framework termed Learn to Teach (L2T) that recycles experience collected by the teacher agent.
We show that a single-loop algorithm can train both the teacher and student agents under both Reinforcement Learning and Inverse Reinforcement Learning contexts.
arXiv Detail & Related papers (2024-02-09T21:16:43Z)
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves on SFT with a significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- Teacher-student curriculum learning for reinforcement learning [1.7259824817932292]
Reinforcement learning (RL) is a popular paradigm for sequential decision-making problems.
The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying RL to real-world problems.
We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task.
arXiv Detail & Related papers (2022-10-31T14:45:39Z)
- Reinforcement Teaching [43.80089037901853]
We propose Reinforcement Teaching: a framework for meta-learning in which a teaching policy is learned, through reinforcement, to control a student's learning process.
The student's learning process is modelled as a Markov reward process and the teacher, with its action-space, interacts with the induced Markov decision process.
We show that, for many learning processes, the student's learnable parameters form a Markov state. To avoid having the teacher learn directly from parameters, we propose the Embedder that learns a representation of a student's state from its input/output behaviour.
arXiv Detail & Related papers (2022-04-25T18:04:17Z)
- Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient-optimization-based, teacher-aware learner that can incorporate the teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
- Learning by Teaching, with Application to Neural Architecture Search [10.426533624387305]
We propose a novel ML framework referred to as learning by teaching (LBT).
In LBT, a teacher model improves itself by teaching a student model to learn well.
Based on how the student performs on a validation dataset, the teacher re-learns its model and re-teaches the student until the student achieves great validation performance.
arXiv Detail & Related papers (2021-03-11T23:50:38Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
- Learning from Learners: Adapting Reinforcement Learning Agents to be Competitive in a Card Game [71.24825724518847]
We present a study on how popular reinforcement learning algorithms can be adapted to learn and to play a real-world implementation of a competitive multiplayer card game.
We propose specific training and validation routines for the learning agents in order to evaluate how the agents learn to be competitive and to explain how they adapt to each other's playing styles.
arXiv Detail & Related papers (2020-04-08T14:11:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.