On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
- URL: http://arxiv.org/abs/2505.18346v1
- Date: Fri, 23 May 2025 20:09:09 GMT
- Title: On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
- Authors: Behrad Moniri, Hamed Hassani
- Abstract summary: Weak-to-strong generalization, where a student model trained on imperfect labels generated by a weaker teacher surpasses that teacher, has been widely observed. In this paper, through a theoretical analysis of simple models, we uncover three core mechanisms that can drive this phenomenon.
- Score: 28.005935031887038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weak-to-strong generalization, where a student model trained on imperfect labels generated by a weaker teacher nonetheless surpasses that teacher, has been widely observed, but the mechanisms that enable it have remained poorly understood. In this paper, through a theoretical analysis of simple models, we uncover three core mechanisms that can drive this phenomenon. First, by analyzing ridge regression, we study the interplay between teacher and student regularization and prove that a student can compensate for a teacher's under-regularization and achieve lower test error. We also analyze the role of the parameterization regime of the models. Second, by analyzing weighted ridge regression, we show that a student model with a regularization structure more aligned with the target can outperform its teacher. Third, in a nonlinear multi-index setting, we demonstrate that a student can learn easy, task-specific features from the teacher while leveraging its own broader pre-training to learn hard-to-learn features that the teacher cannot capture.
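The first mechanism described in the abstract (a student ridge regression compensating for an under-regularized teacher) can be illustrated with a small simulation. The sketch below is a toy setup under stated assumptions, not the paper's exact model: the dimensions, sample sizes, and penalty values are illustrative choices.

```python
# Minimal numerical sketch: a student ridge regression trained only on
# pseudo-labels from an under-regularized weak teacher can reach lower test
# error than the teacher. All sizes and penalties are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
d, n_teacher, n_student, n_test = 200, 150, 2000, 5000
sigma = 1.0                                   # label-noise standard deviation
w_star = rng.normal(size=d) / np.sqrt(d)      # ground-truth coefficients

def sample(n, noisy=True):
    X = rng.normal(size=(n, d))
    y = X @ w_star + (sigma * rng.normal(size=n) if noisy else 0.0)
    return X, y

X_t, y_t = sample(n_teacher)                  # small labeled set seen by the teacher
X_s, _ = sample(n_student)                    # large unlabeled set for the student
X_test, y_test = sample(n_test, noisy=False)  # noiseless test set

# Weak teacher: barely regularized, so it fits much of the label noise.
teacher = Ridge(alpha=1e-3, fit_intercept=False).fit(X_t, y_t)
pseudo_labels = teacher.predict(X_s)

# Strong student: trained only on the teacher's pseudo-labels, but with a
# penalty that shrinks away much of the noise the teacher absorbed.
student = Ridge(alpha=8000.0, fit_intercept=False).fit(X_s, pseudo_labels)

mse = lambda model: np.mean((model.predict(X_test) - y_test) ** 2)
print(f"teacher test MSE: {mse(teacher):.3f}")  # typically around 3 in this toy setup
print(f"student test MSE: {mse(student):.3f}")  # typically well below the teacher's
```

In this toy setting the student never sees clean labels; its advantage comes entirely from applying the regularization the teacher lacked, mirroring the compensation effect described in the abstract.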
Related papers
- On the Emergence of Weak-to-Strong Generalization: A Bias-Variance Perspective [14.65315912348303]
Weak-to-strong generalization (W2SG) refers to the phenomenon where a strong student model, trained on a dataset labeled by a weak teacher, outperforms the teacher on the target task. Recent studies attribute this performance gain to the prediction misfit between the student and teacher models. We show that W2SG is more likely to emerge when the student model approximates its posterior mean teacher, rather than mimicking an individual teacher.
arXiv Detail & Related papers (2025-05-30T07:52:43Z) - Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization [69.96794098855938]
Weak-to-strong generalization (W2SG) offers a promising framework for supervising increasingly capable large language models (LLMs). Traditional W2SG methods rely on passive learning, where a weak teacher provides noisy demonstrations to train a strong student. We introduce Alice, a framework that leverages complementary knowledge between teacher and student to enhance the learning process.
arXiv Detail & Related papers (2025-04-09T22:33:06Z) - Understanding the Capabilities and Limitations of Weak-to-Strong Generalization [40.793180521446466]
We provide theoretical insights into weak-to-strong generalization. We show that the weak model should demonstrate strong generalization performance and maintain well-calibrated predictions. We extend the work of Charikar et al. (2024) to a loss function based on Kullback-Leibler divergence.
arXiv Detail & Related papers (2025-02-03T15:48:28Z) - Theoretical Analysis of Weak-to-Strong Generalization [23.235671743867492]
We show that existing weak supervision theory fails to account for pseudolabel correction and coverage expansion.
Our bounds capture the intuition that weak-to-strong generalization occurs when the strong model is unable to fit the mistakes of the weak teacher without incurring additional error.
arXiv Detail & Related papers (2024-05-25T03:48:12Z) - Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts [81.37287967870589]
We propose to harness a diverse set of specialized teachers that collectively supervise the strong student, instead of a single generalist one.
Our approach resembles the classical hierarchical mixture of experts, with two components tailored for co-supervision.
We validate the proposed method through visual recognition tasks on the OpenAI weak-to-strong benchmark and additional multi-domain datasets.
arXiv Detail & Related papers (2024-02-23T18:56:11Z) - On student-teacher deviations in distillation: does it pay to disobey? [54.908344098305804]
Knowledge distillation has been widely used to improve the test accuracy of a "student" network.
Despite being trained to fit the teacher's probabilities, the student may not only significantly deviate from the teacher probabilities, but may also outdo the teacher in performance.
arXiv Detail & Related papers (2023-01-30T14:25:02Z) - Supervision Complexity and its Role in Knowledge Distillation [65.07910515406209]
We study the generalization behavior of a distilled student.
The framework highlights a delicate interplay among the teacher's accuracy, the student's margin with respect to the teacher predictions, and the complexity of the teacher predictions.
We demonstrate the efficacy of online distillation and validate the theoretical findings on a range of image classification benchmarks and model architectures.
arXiv Detail & Related papers (2023-01-28T16:34:47Z) - Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks have been proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning method comprising two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise (a minimal sketch of this moving-average update appears after this list).
arXiv Detail & Related papers (2022-12-13T12:14:09Z) - From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models [55.137869702763375]
This paper explores a novel PLM reuse paradigm, Knowledge Integration (KI).
KI aims to merge the knowledge from different teacher-PLMs, each of which specializes in a different classification problem, into a versatile student model.
We then design a Model Uncertainty-aware Knowledge Integration (MUKI) framework to recover the golden supervision for the student.
arXiv Detail & Related papers (2022-10-11T07:59:08Z)
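Returning to the fine-grained student ensemble entry above (Distantly-Supervised Named Entity Recognition): the fragment-wise temporal moving average it describes can be sketched as below. The fragment names, shapes, and decay value are hypothetical placeholders, not details taken from that paper.

```python
# Minimal sketch of a fragment-wise exponential moving average (EMA) update:
# selected fragments of the teacher are moved toward the matching student
# fragments, leaving the other fragments untouched. Names and shapes are
# illustrative assumptions.
import numpy as np

def ema_update(teacher, student, fragments, decay=0.99):
    """Update only the listed teacher fragments with a temporal moving
    average of the corresponding student parameters."""
    for name in fragments:
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student[name]
    return teacher

# Hypothetical two-fragment model: an encoder and a classification head.
rng = np.random.default_rng(0)
teacher = {"encoder.w": rng.normal(size=(8, 8)), "head.w": rng.normal(size=(8, 3))}
student = {k: v + 0.1 * rng.normal(size=v.shape) for k, v in teacher.items()}

# Update only the head fragment; the encoder fragment is left as-is.
teacher = ema_update(teacher, student, fragments=["head.w"], decay=0.99)
```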