Reinforcement Teaching
- URL: http://arxiv.org/abs/2204.11897v1
- Date: Mon, 25 Apr 2022 18:04:17 GMT
- Title: Reinforcement Teaching
- Authors: Alex Lewandowski, Calarina Muslimani, Matthew E. Taylor, Jun Luo, Dale Schuurmans
- Abstract summary: We propose Reinforcement Teaching: a framework for meta-learning in which a teaching policy is learned, through reinforcement, to control a student's learning process.
The student's learning process is modelled as a Markov reward process and the teacher, with its action-space, interacts with the induced Markov decision process.
We show that, for many learning processes, the student's learnable parameters form a Markov state. To avoid having the teacher learn directly from parameters, we propose the Parameter Embedder, which learns a representation of a student's state from its input/output behaviour.
- Score: 43.80089037901853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Reinforcement Teaching: a framework for meta-learning in which a
teaching policy is learned, through reinforcement, to control a student's
learning process. The student's learning process is modelled as a Markov reward
process and the teacher, with its action-space, interacts with the induced
Markov decision process. We show that, for many learning processes, the
student's learnable parameters form a Markov state. To avoid having the teacher
learn directly from parameters, we propose the Parameter Embedder that learns a
representation of a student's state from its input/output behaviour. Next, we
use learning progress to shape the teacher's reward towards maximizing the
student's performance. To demonstrate the generality of Reinforcement Teaching,
we conducted experiments in which a teacher learns to significantly improve
supervised and reinforcement learners by using a combination of learning
progress reward and a Parameter Embedded state. These results show that
Reinforcement Teaching is not only an expressive framework capable of unifying
different approaches, but also provides meta-learning with the plethora of
tools from reinforcement learning.
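
The setup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the student (a one-dimensional linear regressor trained by SGD), the synthetic task, the step-size action space, and the names `StudentEnv` and `run_episode` are all hypothetical. The reward is taken to be the drop in validation loss between successive steps, which is one simple reading of "learning progress", and the state is the student's outputs on a few fixed probe inputs, a crude stand-in for a learned Parameter Embedder.

```python
import random


class StudentEnv:
    """Teacher's view of a student's learning process (illustrative only)."""

    STEP_SIZES = [0.001, 0.01, 0.1]  # hypothetical teacher action space

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        # Hypothetical noiseless task: y = 3x + 1.
        self.train = [(x, 3.0 * x + 1.0) for x in self._xs(64)]
        self.valid = [(x, 3.0 * x + 1.0) for x in self._xs(32)]
        # Fixed probe inputs used to observe input/output behaviour.
        self.probe = self._xs(4)
        self.reset()

    def _xs(self, n):
        return [self.rng.uniform(-1.0, 1.0) for _ in range(n)]

    def _loss(self):
        # Mean squared error of the student on the validation set.
        return sum((self.w * x + self.b - y) ** 2
                   for x, y in self.valid) / len(self.valid)

    def _state(self):
        # State = student outputs on the probe inputs, not raw parameters.
        return [self.w * x + self.b for x in self.probe]

    def reset(self):
        self.w, self.b = 0.0, 0.0
        self.prev_loss = self._loss()
        return self._state()

    def step(self, action):
        # The teacher's action selects the step size for one SGD update.
        lr = self.STEP_SIZES[action]
        x, y = self.rng.choice(self.train)
        err = self.w * x + self.b - y
        self.w -= lr * 2.0 * err * x
        self.b -= lr * 2.0 * err
        loss = self._loss()
        # Assumed learning-progress reward: improvement in validation loss.
        reward = self.prev_loss - loss
        self.prev_loss = loss
        return self._state(), reward, loss


def run_episode(env, policy, steps=200):
    """Roll out a teaching policy; return (total reward, final loss)."""
    state = env.reset()
    total, loss = 0.0, env.prev_loss
    for _ in range(steps):
        state, reward, loss = env.step(policy(state))
        total += reward
    return total, loss


if __name__ == "__main__":
    env = StudentEnv()
    total, final_loss = run_episode(env, lambda state: 2)  # always pick 0.1
    print(total, final_loss)
```

With this reward, the return of an episode telescopes to the initial loss minus the final loss, so a teacher that maximizes undiscounted return is also minimizing the student's final validation loss. The constant policy in the usage example is only a placeholder; any reinforcement learning algorithm that maps the state to an action index could take its place.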
Related papers
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves over SFT with a significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- A Machine Learning system to monitor student progress in educational institutes [0.0]
We propose a data-driven approach that uses Machine Learning techniques to generate a classifier called credit score.
Using the credit score as a progress indicator is well suited to a Learning Management System.
arXiv Detail & Related papers (2022-11-02T08:24:08Z)
- Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient-optimization-based teacher-aware learner that can incorporate the teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
- Distribution Matching for Machine Teaching [64.39292542263286]
Machine teaching is an inverse problem of machine learning that aims at steering the student learner towards its target hypothesis.
Previous studies on machine teaching focused on balancing teaching risk and cost to find the best teaching examples.
This paper presents a distribution matching-based machine teaching strategy.
arXiv Detail & Related papers (2021-05-06T09:32:57Z)
- Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
arXiv Detail & Related papers (2020-09-21T12:04:18Z)
- Teaching to Learn: Sequential Teaching of Agents with Inner States [20.556373950863247]
We introduce a multi-agent formulation in which learners' inner state may change with the teaching interaction.
In order to teach such learners, we propose an optimal control approach that takes the future performance of the learner after teaching into account.
arXiv Detail & Related papers (2020-09-14T07:03:15Z)
- Interaction-limited Inverse Reinforcement Learning [50.201765937436654]
We present two different training strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL) focusing on the learner's perspective.
Using experiments in simulation and with a real robot learning a task from a human demonstrator, we show that our training strategies allow faster training than a random teacher for CIRL and than a batch learner for SPIRL.
arXiv Detail & Related papers (2020-07-01T12:31:52Z)