YODA: Teacher-Student Progressive Learning for Language Models
- URL: http://arxiv.org/abs/2401.15670v1
- Date: Sun, 28 Jan 2024 14:32:15 GMT
- Title: YODA: Teacher-Student Progressive Learning for Language Models
- Authors: Jianqiao Lu, Wanjun Zhong, Yufei Wang, Zhijiang Guo, Qi Zhu, Wenyong
Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin
Jiang, Qun Liu
- Abstract summary: This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves SFT with significant performance gain.
- Score: 82.0172215948963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although large language models (LLMs) have demonstrated adeptness in a range
of tasks, they still lag behind human learning efficiency. This disparity is
often linked to the inherent human capacity to learn from basic examples,
gradually generalize and handle more complex problems, and refine their skills
with continuous feedback. Inspired by this, this paper introduces YODA, a novel
teacher-student progressive learning framework that emulates the
teacher-student education process to improve the efficacy of model fine-tuning.
The framework operates on an interactive "basic-generalized-harder"
loop. The teacher agent provides tailored feedback on the student's answers,
and systematically organizes the education process. This process unfolds by
teaching the student basic examples, reinforcing understanding through
generalized questions, and then enhancing learning by posing questions with
progressively enhanced complexity. With the teacher's guidance, the student
learns to iteratively refine its answer with feedback, and forms a robust and
comprehensive understanding of the posed questions. The systematic procedural
data, which reflects the progressive learning process of humans, is then
utilized for model training. Taking math reasoning as a testbed, experiments
show that training LLaMA2 with data from YODA improves SFT with significant
performance gain (+17.01% on GSM8K and +9.98% on MATH). In addition, we find
that training with curriculum learning further improves learning robustness.
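The abstract describes an interaction pattern that is easy to picture as a generate-and-refine loop: the teacher poses basic, then generalized, then harder questions, gives feedback on each student answer, and the refined traces become training data. Below is a minimal sketch of that loop structure, assuming hypothetical agent callables (teacher_ask, teacher_feedback, student_answer, student_refine) that stand in for LLM-backed agents; it illustrates the abstract's description, not the paper's actual implementation.

```python
# Minimal sketch of a basic-generalized-harder teacher-student loop.
# All agent callables are hypothetical placeholders for LLM-backed agents;
# this only illustrates the loop structure described in the abstract.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Interaction:
    question: str
    answer: str          # final, feedback-refined answer
    trace: List[str]     # intermediate answers and teacher feedback


def run_progressive_loop(
    seed_problem: str,
    teacher_ask: Callable[[str, str], List[str]],    # (problem, stage) -> questions
    teacher_feedback: Callable[[str, str], str],     # (question, answer) -> feedback or "OK"
    student_answer: Callable[[str], str],            # question -> first attempt
    student_refine: Callable[[str, str, str], str],  # (question, answer, feedback) -> revision
    max_refinements: int = 3,
) -> List[Interaction]:
    """Collect procedural interaction records for one seed problem."""
    records: List[Interaction] = []
    # Progress through the three stages named in the abstract.
    for stage in ("basic", "generalized", "harder"):
        for question in teacher_ask(seed_problem, stage):
            answer = student_answer(question)
            trace = [answer]
            # Iterative refinement guided by teacher feedback.
            for _ in range(max_refinements):
                feedback = teacher_feedback(question, answer)
                if feedback == "OK":
                    break
                answer = student_refine(question, answer, feedback)
                trace += [feedback, answer]
            records.append(Interaction(question, answer, trace))
    # The (question, refined answer) records are the kind of systematic
    # procedural data the paper says is then used for model training.
    return records
```

In the paper this procedural data is used to fine-tune LLaMA2; the sketch only shows how such interaction records could be gathered.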
Related papers
- When Babies Teach Babies: Can student knowledge sharing outperform Teacher-Guided Distillation on small datasets? [0.0]
We present our submission to the BabyLM challenge, aiming to push the boundaries of data-efficient language model pretraining.
We address the limitation of treating students equally by formulating weighted mutual learning as a bi-level optimization problem.
Our evaluations show that teacher-less methods can match or surpass teacher-supervised approaches.
arXiv Detail & Related papers (2024-11-25T15:25:31Z) - Toward In-Context Teaching: Adapting Examples to Students' Misconceptions [54.82965010592045]
We introduce a suite of models and evaluation methods we call AdapT.
AToM is a new probabilistic model for adaptive teaching that jointly infers students' past beliefs and optimizes for the correctness of future beliefs.
Our results highlight both the difficulty of the adaptive teaching task and the potential of learned adaptive models for solving it.
arXiv Detail & Related papers (2024-05-07T17:05:27Z) - Revealing Networks: Understanding Effective Teacher Practices in
AI-Supported Classrooms using Transmodal Ordered Network Analysis [0.9187505256430948]
The present study uses transmodal ordered network analysis to understand effective teacher practices in relationship to traditional metrics of in-system learning in a mathematics classroom working with AI tutors.
Comparing teacher practices by student learning rates, we find that students with low learning rates exhibited more hint use after monitoring.
Students with low learning rates showed learning behavior similar to their high learning rate peers, achieving repeated correct attempts in the tutor.
arXiv Detail & Related papers (2023-12-17T21:50:02Z) - Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs).
The system is organized into three inter-connected core processes: interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z) - Teacher-student curriculum learning for reinforcement learning [1.7259824817932292]
Reinforcement learning (RL) is a popular paradigm for sequential decision-making problems.
The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying RL to real-world problems.
We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task (a minimal scheduling sketch of this idea follows after the list below).
arXiv Detail & Related papers (2022-10-31T14:45:39Z) - Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis [87.75833205560406]
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system.
It does not require pooled data from all languages altogether, and thus alleviates the storage and computation burden.
arXiv Detail & Related papers (2021-10-09T07:00:38Z) - Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient optimization based teacher-aware learner who can incorporate teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z) - Learning by Teaching, with Application to Neural Architecture Search [10.426533624387305]
We propose a novel ML framework referred to as learning by teaching (LBT).
In LBT, a teacher model improves itself by teaching a student model to learn well.
Based on how the student performs on a validation dataset, the teacher re-learns its model and re-teaches the student until the student achieves great validation performance.
arXiv Detail & Related papers (2021-03-11T23:50:38Z) - Learning to Reweight with Deep Interactions [104.68509759134878]
We propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model.
Experiments on image classification with clean/noisy labels and neural machine translation empirically demonstrate that our algorithm makes significant improvement over previous methods.
arXiv Detail & Related papers (2020-07-09T09:06:31Z)
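As a companion to the teacher-student curriculum learning entry above, and to the abstract's note that curriculum-style training further improves robustness, here is a minimal, hypothetical sketch of a teacher that schedules task difficulty from the student's recent success rate. The callables train_student_on and evaluate_student, the difficulty levels, and the promotion threshold are all illustrative assumptions, not any paper's actual API.

```python
# Hypothetical sketch of teacher-driven curriculum scheduling: the teacher
# promotes the student to harder tasks once its recent success rate on the
# current level clears a threshold. Not taken from any paper listed above.

from collections import deque
from typing import Callable, Deque, List


def curriculum_schedule(
    levels: List[str],                          # e.g. ["basic", "generalized", "harder"]
    train_student_on: Callable[[str], None],    # one training step on tasks of a level
    evaluate_student: Callable[[str], float],   # success rate in [0, 1] on a level
    promote_at: float = 0.8,                    # promote when recent success is high enough
    window: int = 5,                            # size of the moving-average window
    max_steps: int = 100,
) -> List[str]:
    """Return the sequence of difficulty levels the teacher selected."""
    history: List[str] = []
    recent: Deque[float] = deque(maxlen=window)
    level_idx = 0
    for _ in range(max_steps):
        level = levels[level_idx]
        train_student_on(level)
        recent.append(evaluate_student(level))
        history.append(level)
        # Promote once the moving-average success rate clears the threshold.
        if len(recent) == window and sum(recent) / window >= promote_at:
            if level_idx + 1 < len(levels):
                level_idx += 1
                recent.clear()
            else:
                break  # hardest level reached and mastered
    return history
```

The design choice here mirrors the basic-generalized-harder progression from the abstract: the student only moves on once the current difficulty is handled reliably, which is one simple way to realize a curriculum over the generated data.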