Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization
- URL: http://arxiv.org/abs/2504.07316v2
- Date: Fri, 11 Apr 2025 20:13:23 GMT
- Title: Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization
- Authors: Shujin Wu, Cheng Qian, Yi R. Fung, Paul Pu Liang, Heng Ji
- Abstract summary: Weak-to-strong generalization (W2SG) offers a promising framework for supervising increasingly capable large language models (LLMs). Traditional W2SG methods rely on passive learning, where a weak teacher provides noisy demonstrations to train a strong student. We introduce Alice, a framework that leverages complementary knowledge between teacher and student to enhance the learning process.
- Score: 69.96794098855938
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The growing capabilities of large language models (LLMs) present a key challenge: maintaining effective human oversight. Weak-to-strong generalization (W2SG) offers a promising framework for supervising increasingly capable LLMs using weaker ones. Traditional W2SG methods rely on passive learning, where a weak teacher provides noisy demonstrations to train a strong student. This hinders the student from employing its own knowledge during training and from reaching its full potential. In this work, we introduce Alice (proActive learning wIth teaCher's dEmonstrations), a framework that leverages the complementary knowledge between teacher and student to enhance the learning process. We probe the knowledge base of the teacher model by eliciting its uncertainty, and then use these insights, together with the teacher's responses, as demonstrations to guide the student model in self-generating improved responses for supervision. In addition, for situations with a significant capability gap between teacher and student models, we introduce cascade Alice, a hierarchical training approach in which weak teachers initially supervise intermediate models, which then guide stronger models in sequence. Experimental results demonstrate that our method significantly enhances W2SG performance, yielding substantial improvements over the original W2SG setup on three key tasks: knowledge-based reasoning (+4.0%), mathematical reasoning (+22.62%), and logical reasoning (+12.11%). This highlights the effectiveness of our new W2SG paradigm, which enables more robust knowledge transfer and better supervision outcomes.
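For readers who want the shape of the training loop, here is a minimal, runnable Python sketch of the workflow the abstract describes. Every name in it (Model, alice_round, cascade_alice, the prompt format) is a hypothetical illustration inferred only from the abstract, not the authors' implementation.

```python
# Hypothetical sketch of the Alice loop, inferred only from the abstract.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Model:
    name: str
    answer: Callable[[str], str]          # toy stand-in for text generation
    uncertainty: Callable[[str], float]   # elicited uncertainty, 0 = confident
    corpus: List[str] = field(default_factory=list)

def alice_round(teacher: Model, student: Model, questions: List[str]) -> List[str]:
    """Probe the teacher's uncertainty, then let the student self-generate
    improved responses guided by the teacher's demonstration."""
    supervision = []
    for q in questions:
        demo = teacher.answer(q)        # noisy weak-teacher demonstration
        unc = teacher.uncertainty(q)    # elicited teacher uncertainty
        prompt = (f"{q}\nTeacher demonstration: {demo}\n"
                  f"Teacher uncertainty: {unc:.2f}\nImproved answer:")
        supervision.append(student.answer(prompt))  # student's own response
    return supervision

def cascade_alice(models: List[Model], questions: List[str]) -> None:
    """Cascade Alice: each model in the chain supervises the next, stronger one."""
    for teacher, student in zip(models, models[1:]):
        student.corpus.extend(alice_round(teacher, student, questions))
        # (stand-in for fine-tuning the student on its own improved responses)
```

The point of the sketch is the information flow: the teacher contributes a demonstration plus an uncertainty signal, and the supervision targets are the student's own improved generations rather than the teacher's noisy outputs.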
Related papers
- Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios [3.818273633647809]
We propose a three-component framework leveraging three signal types.
The first signal is the student's self-consistency (the agreement among multiple sampled student outputs), which serves as a proxy for the student's confidence (a minimal sketch follows this entry).
We show that our proposed two-stage framework brings a relative improvement of up to 20.79% across datasets compared to fine-tuning without any signals.
arXiv Detail & Related papers (2024-06-08T02:17:43Z)
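The self-consistency signal in the entry above is easy to make concrete. Below is a minimal sketch of one common way to compute it, as majority-vote agreement over repeated samples; this is an illustrative simplification, not necessarily the paper's exact signal.

```python
from collections import Counter

def self_consistency(samples: list[str]) -> float:
    """Fraction of sampled outputs agreeing with the majority answer;
    a simple proxy for the student's confidence."""
    if not samples:
        return 0.0
    _, majority_count = Counter(samples).most_common(1)[0]
    return majority_count / len(samples)

# Five sampled answers to the same question:
print(self_consistency(["42", "42", "41", "42", "7"]))  # 0.6
```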
- Bayesian WeakS-to-Strong from Text Classification to Generation [14.897191979004782]
This work extends Weak-to-Strong to WeakS-to-Strong by exploring an ensemble of weak models that simulates the variability in human opinions.
Confidence scores are estimated using a Bayesian approach to guide WeakS-to-Strong generalization (see the sketch after this entry).
Results demonstrate the effectiveness of the proposed approach in improving the reliability of a strong student model, showing potential for superalignment.
arXiv Detail & Related papers (2024-05-24T13:33:11Z)
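One simple way to turn an ensemble of weak models into a Bayesian confidence score is a Beta posterior over binary votes, sketched below. This illustrates the general idea only; the paper's actual estimator may differ substantially.

```python
def bayesian_confidence(votes: list[int], prior_a: float = 1.0,
                        prior_b: float = 1.0) -> float:
    """Posterior mean of P(label = 1) under a Beta(prior_a, prior_b) prior,
    treating each weak model's binary vote as a Bernoulli observation."""
    positives = sum(votes)
    return (prior_a + positives) / (prior_a + prior_b + len(votes))

# Three of four weak models vote "1"; the uniform prior pulls the score
# toward 0.5, reflecting the small ensemble size.
print(bayesian_confidence([1, 1, 0, 1]))  # ~0.667
```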
- Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts [81.37287967870589]
We propose to harness a diverse set of specialized teachers that collectively supervise the strong student, instead of a single generalist teacher.
Our approach resembles the classical hierarchical mixture of experts, with two components tailored for co-supervision (a toy weighted-vote sketch follows this entry).
We validate the proposed method through visual recognition tasks on the OpenAI weak-to-strong benchmark and additional multi-domain datasets.
arXiv Detail & Related papers (2024-02-23T18:56:11Z)
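As a toy rendering of many specialized teachers collectively supervising one student, the sketch below weights each teacher's vote by an assumed per-domain competence. The hierarchical mixture-of-experts machinery of the actual paper is not reproduced here; the gating weights are purely illustrative.

```python
from collections import defaultdict

def co_supervise(domain: str, teacher_preds: dict, competence: dict) -> str:
    """teacher_preds: {teacher: label}; competence: {(teacher, domain): weight}.
    Returns the label with the highest competence-weighted vote."""
    scores = defaultdict(float)
    for teacher, label in teacher_preds.items():
        scores[label] += competence.get((teacher, domain), 0.0)
    return max(scores, key=scores.get)

preds = {"t_vision": "cat", "t_text": "dog", "t_general": "cat"}
comp = {("t_vision", "images"): 0.9, ("t_text", "images"): 0.2,
        ("t_general", "images"): 0.5}
print(co_supervise("images", preds, comp))  # "cat"
```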
- Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning [21.401598876308345]
This paper presents a follow-up study to OpenAI's recent superalignment work on Weak-to-Strong Generalization (W2SG).
Superalignment focuses on ensuring that high-level AI systems remain consistent with human values and intentions when dealing with complex, high-risk tasks.
Our study simulates two phases of superalignment under the W2SG framework: the development of general superhuman models and the progression towards superintelligence.
arXiv Detail & Related papers (2024-02-01T15:30:19Z)
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves SFT with significant performance gains.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning method composed of two teacher-student networks.
A fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding student fragment, which encourages consistent predictions on each model fragment against noise (a minimal EMA sketch follows this entry).
arXiv Detail & Related papers (2022-12-13T12:14:09Z)
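The fine-grained temporal moving average is the most concrete piece of the entry above. A minimal sketch follows, with plain dicts standing in for model fragments; the fragment grouping and parameter names are hypothetical.

```python
def ema_update(teacher: dict, student: dict, fragments: dict,
               decay: float = 0.99) -> None:
    """fragments maps fragment name -> list of parameter keys; each fragment
    of the teacher is updated independently (the 'fine-grained' part) as an
    exponential moving average of the matching student fragment."""
    for keys in fragments.values():
        for k in keys:
            teacher[k] = decay * teacher[k] + (1.0 - decay) * student[k]

teacher = {"enc.w": 1.0, "head.w": 0.0}
student = {"enc.w": 2.0, "head.w": 1.0}
ema_update(teacher, student, {"encoder": ["enc.w"], "head": ["head.w"]})
print(teacher)  # {'enc.w': 1.01, 'head.w': 0.01}
```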
- Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation [70.92135839545314]
We propose dynamic prior knowledge (DPK), which integrates part of the teacher's features as prior knowledge before feature distillation (an illustrative sketch follows this entry).
Our DPK makes the student's performance positively correlated with the teacher's, meaning that we can further boost student accuracy by applying larger teachers.
arXiv Detail & Related papers (2022-06-13T11:52:13Z)
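A hedged sketch of the "prior knowledge" idea: splice some teacher feature values into the student's features before computing a distillation loss on the remaining positions. The random masking rule and the MSE loss below are illustrative assumptions, not the paper's exact scheme.

```python
import random

def dpk_distill(student_feat: list[float], teacher_feat: list[float],
                prior_ratio: float = 0.3) -> tuple[list[float], float]:
    """Returns the feature vector with teacher values injected as priors,
    plus an MSE distillation loss over the positions left to the student."""
    mixed, pairs = [], []
    for s, t in zip(student_feat, teacher_feat):
        if random.random() < prior_ratio:
            mixed.append(t)        # teacher value injected as prior knowledge
        else:
            mixed.append(s)        # student must still match the teacher here
            pairs.append((s, t))
    loss = sum((s - t) ** 2 for s, t in pairs) / max(len(pairs), 1)
    return mixed, loss
```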
- Dual Policy Distillation [58.43610940026261]
Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging deep reinforcement learning tasks.
In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of it.
The key challenge in developing this dual learning framework is identifying the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms (a toy sketch follows this entry).
arXiv Detail & Related papers (2020-06-07T06:49:47Z)
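Finally, a toy rendering of the student-student idea: two peers act on the same environment, and each copies the other's action on states where the peer's value estimate is higher. The selection rule is an assumption made for illustration, not necessarily the DPD objective.

```python
def dual_distill(policy_a: dict, value_a: dict,
                 policy_b: dict, value_b: dict) -> None:
    """Toy dicts: policy maps state -> action, value maps state -> estimate.
    Each learner imitates its peer only where the peer looks more competent."""
    for s in set(policy_a) & set(policy_b):
        if value_b.get(s, 0.0) > value_a.get(s, 0.0):
            policy_a[s] = policy_b[s]   # A borrows B's action
        elif value_a.get(s, 0.0) > value_b.get(s, 0.0):
            policy_b[s] = policy_a[s]   # B borrows A's action
```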
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.