Distribution Matching for Machine Teaching
- URL: http://arxiv.org/abs/2105.13809v1
- Date: Thu, 6 May 2021 09:32:57 GMT
- Title: Distribution Matching for Machine Teaching
- Authors: Xiaofeng Cao and Ivor W. Tsang
- Abstract summary: Machine teaching is an inverse problem of machine learning that aims at steering the student learner towards its target hypothesis.
Previous studies on machine teaching focused on balancing the teaching risk and cost to find the best teaching examples.
This paper presents a distribution matching-based machine teaching strategy.
- Score: 64.39292542263286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine teaching is an inverse problem of machine learning that aims at
steering the student learner towards its target hypothesis, in which the
teacher already knows the student's learning parameters. Previous studies on
machine teaching focused on balancing the teaching risk and cost to find the
best teaching examples that derive the student model. Such an optimization
solver is generally ineffective when the student learner does not disclose any
cue about its learning parameters. To supervise such a teaching scenario, this
paper presents a distribution matching-based machine teaching strategy.
Specifically, this strategy backwardly and iteratively performs a halving
operation on the teaching cost to find a desired teaching set. Technically, the
strategy can be expressed as a cost-controlled optimization process that finds
the optimal teaching examples without further exploring the parameter
distribution of the student learner. Given any limited teaching cost, the
teaching examples can then be derived in closed form. Theoretical analysis and
experimental results demonstrate the effectiveness of this strategy.
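The abstract does not spell out the selection rule, so the following Python sketch is only a hedged illustration of what "distribution matching plus backward halving of the teaching cost" could look like: it repeatedly halves a candidate pool, each round keeping the examples whose empirical distribution best matches the full pool. The function names (`rbf_mmd2`, `halving_teaching_set`), the choice of an RBF-kernel MMD as the matching criterion, and the greedy leave-one-out scoring are all assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Squared maximum mean discrepancy with an RBF kernel -- one
    possible distribution-matching criterion (an assumption here)."""
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

def halving_teaching_set(pool, budget):
    """Backwardly halve the teaching cost: starting from the full pool,
    each round drops the candidates whose removal least degrades the
    match to the pool distribution, until the budget is reached.
    The greedy leave-one-out scoring is a stand-in for the paper's
    cost-controlled optimization."""
    cand = list(range(len(pool)))
    while len(cand) > budget:
        keep = max(budget, len(cand) // 2)            # halve the cost per round
        scores = []
        for i in cand:
            rest = pool[[j for j in cand if j != i]]
            scores.append((rbf_mmd2(rest, pool), i))  # small MMD = safe to drop i
        scores.sort(key=lambda s: s[0])
        dropped = {i for _, i in scores[: len(cand) - keep]}
        cand = [i for i in cand if i not in dropped]
    return pool[cand]

# Toy usage: pick 8 teaching examples from a pool of 64 2-D points
# (three halving rounds: 64 -> 32 -> 16 -> 8).
pool = np.random.default_rng(0).normal(size=(64, 2))
teaching_set = halving_teaching_set(pool, budget=8)
```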
Related papers
- Toward In-Context Teaching: Adapting Examples to Students' Misconceptions [54.82965010592045]
We introduce a suite of models and evaluation methods we call AdapT.
AToM is a new probabilistic model for adaptive teaching that jointly infers students' past beliefs and optimizes for the correctness of future beliefs.
Our results highlight both the difficulty of the adaptive teaching task and the potential of learned adaptive models for solving it.
arXiv Detail & Related papers (2024-05-07T17:05:27Z)
- One-shot Machine Teaching: Cost Very Few Examples to Converge Faster [45.96956111867065]
We consider a more intelligent teaching paradigm named one-shot machine teaching.
It establishes a tractable mapping from the teaching set to the model parameter.
We prove that this mapping is surjective, which serves as an existence guarantee for the optimal teaching set.
arXiv Detail & Related papers (2022-12-13T07:51:17Z)
- Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient-optimization-based teacher-aware learner that incorporates the teacher's cooperative intention into its likelihood function (a toy sketch of this idea appears after the list below).
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
- Teaching to Learn: Sequential Teaching of Agents with Inner States [20.556373950863247]
We introduce a multi-agent formulation in which learners' inner states may change with the teaching interaction.
In order to teach such learners, we propose an optimal control approach that takes the future performance of the learner after teaching into account.
arXiv Detail & Related papers (2020-09-14T07:03:15Z)
- Iterative Machine Teaching without Teachers [12.239246363539634]
Existing studies on iterative machine teaching assume that there are teachers who know the true answers of all teaching examples.
In this study, we consider an unsupervised case where such teachers do not exist.
Students are given a teaching example at each iteration, but there is no guarantee that the corresponding label is correct.
arXiv Detail & Related papers (2020-06-27T11:21:57Z)
- The Sample Complexity of Teaching-by-Reinforcement on Q-Learning [40.37954633873304]
We study the sample complexity of teaching, termed the "teaching dimension" (TDim) in the literature, for the teaching-by-reinforcement paradigm.
In this paper, we focus on a specific family of reinforcement learning algorithms, Q-learning, and characterize the TDim under different teachers with varying control power over the environment.
Our TDim results provide the minimum number of samples needed for reinforcement learning, and we discuss their connections to standard PAC-style RL sample complexity and teaching-by-demonstration sample complexity results.
arXiv Detail & Related papers (2020-06-16T17:06:04Z)
- Dual Policy Distillation [58.43610940026261]
Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning.
In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment.
The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-06-07T06:49:47Z)
- Provable Representation Learning for Imitation Learning via Bi-level Optimization [60.059520774789654]
A common strategy in modern learning systems is to learn a representation that is useful for many tasks.
We study this strategy in the imitation learning setting for Markov decision processes (MDPs) where multiple experts' trajectories are available.
We instantiate this framework for the imitation learning settings of behavior cloning and observation-alone.
arXiv Detail & Related papers (2020-02-24T21:03:52Z)
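The teacher-aware likelihood idea from "Iterative Teacher-Aware Learning" above can be made concrete with a toy Bayesian learner over a finite hypothesis class. Everything below (the 0.95/0.05 label-noise model, the softmax teacher model, `beta`, and the function name `teacher_aware_update`) is a hypothetical illustration of the general idea; the paper's actual formulation is gradient-based rather than this tabular Bayesian one.

```python
import numpy as np

def teacher_aware_update(posterior, labels, shown_idx, shown_label, beta=5.0):
    """One Bayesian belief update in which the likelihood models not only
    the shown label but also the teacher's cooperative choice of example.

    posterior : (H,) current belief over a finite hypothesis set
    labels    : (H, N) array; labels[h, x] is the label hypothesis h
                assigns to pool example x
    """
    H, N = labels.shape

    # Ordinary label likelihood: consistent hypotheses keep most mass.
    label_lik = np.where(labels[:, shown_idx] == shown_label, 0.95, 0.05)

    # Teacher-awareness term: under each hypothesis h, a cooperative
    # teacher prefers the example that most raises the learner's belief
    # in h (modelled here as a softmax over one-step belief gains).
    choice_lik = np.empty(H)
    for h in range(H):
        gains = np.empty(N)
        for x in range(N):
            lik_x = np.where(labels[:, x] == labels[h, x], 0.95, 0.05)
            post_x = posterior * lik_x
            gains[x] = post_x[h] / post_x.sum()   # belief in h after showing x
        soft = np.exp(beta * gains)
        choice_lik[h] = soft[shown_idx] / soft.sum()

    new_post = posterior * label_lik * choice_lik
    return new_post / new_post.sum()

# Toy usage: 4 hypotheses over 6 pool examples; the teacher shows example 2
# labeled according to hypothesis 0, and the learner's belief shifts toward it.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(4, 6))
belief = np.full(4, 0.25)
belief = teacher_aware_update(belief, labels, shown_idx=2, shown_label=labels[0, 2])
```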
This list is automatically generated from the titles and abstracts of the papers on this site.