WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning
- URL: http://arxiv.org/abs/2508.16741v1
- Date: Fri, 22 Aug 2025 18:33:06 GMT
- Title: WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning
- Authors: Haosen Ge, Shuo Li, Lianghuan Huang
- Abstract summary: We introduce Weak-to-Strong Transfer (WST), an automatic prompt engineering framework where a small "Teacher" model generates instructions that enhance the performance of a much larger "Student" model. Using reinforcement learning, the Teacher Model's instructions are iteratively improved based on the Student Model's outcomes. These results demonstrate that small models can reliably scaffold larger ones, unlocking latent capabilities while avoiding misleading prompts that stronger teachers may introduce.
- Score: 4.795603034052733
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective prompt engineering remains a challenging task for many applications. We introduce Weak-to-Strong Transfer (WST), an automatic prompt engineering framework where a small "Teacher" model generates instructions that enhance the performance of a much larger "Student" model. Unlike prior work, WST requires only a weak teacher, making it efficient and broadly applicable in settings where large models are closed-source or difficult to fine-tune. Using reinforcement learning, the Teacher Model's instructions are iteratively improved based on the Student Model's outcomes, yielding substantial gains across reasoning (MATH-500, GSM8K) and alignment (HH-RLHF) benchmarks - 98% on MATH-500 and 134% on HH-RLHF - and surpassing baselines such as GPT-4o-mini and Llama-70B. These results demonstrate that small models can reliably scaffold larger ones, unlocking latent capabilities while avoiding misleading prompts that stronger teachers may introduce, establishing WST as a scalable solution for efficient and safe LLM prompt refinement.
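As a rough illustration of the loop the abstract describes, here is a minimal, self-contained sketch: a weak Teacher proposes instructions, the strong Student is scored with them, and the reward reinforces helpful instructions. All model calls are stubs, and the bandit-style update is an illustrative stand-in for whatever RL algorithm the paper actually uses.

```python
# Minimal sketch of a WST-style loop: a weak Teacher proposes instructions,
# the strong Student is scored with them, and the reward feeds back into the
# Teacher's choices. Model calls are stubs; names here are illustrative.
import random

def teacher_generate(pool, scores):
    """Stub Teacher: sample an instruction, biased toward past winners."""
    weights = [1.0 + scores.get(p, 0.0) for p in pool]
    return random.choices(pool, weights=weights, k=1)[0]

def student_score(instruction, task):
    """Stub Student: run the large model with the instruction, return a metric."""
    return random.random()  # placeholder for accuracy on a benchmark item

pool = ["Think step by step.", "Restate the problem first.", "Verify each step."]
scores = {}
for _ in range(20):
    instr = teacher_generate(pool, scores)
    reward = sum(student_score(instr, t) for t in ["q1", "q2", "q3"]) / 3
    scores[instr] = scores.get(instr, 0.0) + reward  # reinforce helpful prompts

print("best instruction so far:", max(scores, key=scores.get))
```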
Related papers
- Model Whisper: Steering Vectors Unlock Large Language Models' Potential in Test-time [6.741914038966904]
We introduce a lightweight component, Test-Time Steering Vectors (TTSV), which is prepended to the input while keeping the model's parameters entirely frozen. TTSV is both lightweight and highly efficient to optimize, making it a true plug-and-play enhancement. Our approach exhibits robust generalization, with its steering vectors proving highly transferable across diverse tasks.
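A rough PyTorch sketch of the steering-vector idea, assuming a toy GRU in place of the real LLM and a generic objective: only the prepended vector is trained, and the base model stays frozen.

```python
# Illustrative test-time steering vector: a small trainable embedding is
# prepended to the input while the model stays frozen. The tiny GRU is a
# stand-in for the real LLM; the objective is generic.
import torch
import torch.nn as nn

model = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
for p in model.parameters():
    p.requires_grad_(False)  # base model entirely frozen

steer = nn.Parameter(torch.zeros(1, 1, 16))  # the only trainable tensor
opt = torch.optim.Adam([steer], lr=1e-2)

x = torch.randn(8, 10, 16)    # a batch of input embeddings
target = torch.randn(8, 32)   # stand-in task objective

for _ in range(100):
    steered = torch.cat([steer.expand(8, 1, 16), x], dim=1)
    _, h = model(steered)
    loss = ((h[-1] - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```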
arXiv Detail & Related papers (2025-12-04T12:36:16Z)
- Learning to Rank Chain-of-Thought: Using a Small Model [77.75522308463667]
This paper introduces the Energy Outcome Reward Model (EORM), a highly efficient, lightweight verifier designed to address this challenge. EORM uses an energy-based framework to rank Chain-of-Thought (CoT) solutions, learning to distinguish correct from incorrect reasoning using only simple outcome labels. With only 55M parameters, over 127 times smaller than typical reward models, EORM boosts the accuracy of Llama 3 8B to 90.7% on GSM8K and 63.7% on MATH.
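The outcome-supervised ranking objective can be sketched generically: train a scorer so that correct chains get lower energy than incorrect ones, then pick the minimum-energy candidate at inference. The encoder below is a stub, not EORM's actual architecture.

```python
# Generic energy-based CoT ranker trained only from outcome labels:
# lower energy should mean "more likely correct".
import torch
import torch.nn as nn

encode = nn.EmbeddingBag(1000, 64)   # stub text encoder
energy_head = nn.Linear(64, 1)
params = list(encode.parameters()) + list(energy_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

def energy(token_ids):
    return energy_head(encode(token_ids)).squeeze(-1)

# toy batch: token ids for correct and incorrect CoTs on the same questions
correct = torch.randint(0, 1000, (4, 20))
incorrect = torch.randint(0, 1000, (4, 20))

for _ in range(50):
    # hinge-style objective pushing correct solutions to lower energy
    loss = torch.relu(1.0 + energy(correct) - energy(incorrect)).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# at inference: pick the candidate CoT with minimum energy
```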
arXiv Detail & Related papers (2025-05-21T01:06:29Z)
- Enhancing Knowledge Distillation for LLMs with Response-Priming Prompting [1.9461727843485295]
We propose a set of novel response-priming prompting strategies to enhance the performance of student models. Our approach fine-tunes a smaller Llama 3.1 8B Instruct model by distilling knowledge from a quantized Llama 3.1 405B Instruct teacher model. We find that Ground Truth prompting results in a 55% performance increase on GSM8K for a distilled Llama 3.1 8B Instruct.
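Ground Truth prompting is simple to illustrate: the teacher is primed with the known answer and asked to produce reasoning toward it, and the student is fine-tuned on the result. The prompt wording and helper names below are assumptions, not the paper's exact templates.

```python
# Illustrative ground-truth response priming for distillation.
# The teacher callable is a stub for a quantized Llama 3.1 405B endpoint.
def prime_with_ground_truth(question: str, answer: str) -> str:
    return (
        f"Question: {question}\n"
        f"The correct answer is {answer}. "
        "Explain step by step how to arrive at this answer."
    )

def build_distillation_example(teacher, question, answer):
    rationale = teacher(prime_with_ground_truth(question, answer))
    # the student fine-tunes on (question, rationale), never seeing the priming
    return {"prompt": f"Question: {question}", "completion": rationale}
```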
arXiv Detail & Related papers (2024-12-18T20:41:44Z)
- Stronger Models are NOT Stronger Teachers for Instruction Tuning [12.87887398974395]
We show that larger and stronger models are not necessarily stronger teachers of smaller models. We thus develop a novel metric, Compatibility-Adjusted Reward (CAR), to measure the effectiveness of response generators.
arXiv Detail & Related papers (2024-11-11T17:06:48Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
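A generic picture of what retrieval-adjusted decoding can look like, offered only as an assumption about the flavor of RTD rather than the paper's exact formulation: the frozen model's next-token distribution is interpolated with one derived from retrieved task references.

```python
# Loose sketch of reference-guided decoding: mix the frozen model's
# next-token distribution with one derived from retrieved references.
# Generic recipe; not the paper's exact method.
import torch
import torch.nn.functional as F

def reference_adjusted_step(model_logits, reference_logits, alpha=0.3):
    base = F.softmax(model_logits, dim=-1)
    ref = F.softmax(reference_logits, dim=-1)
    mixed = (1 - alpha) * base + alpha * ref   # no gradients, no fine-tuning
    return torch.argmax(mixed, dim=-1)

next_token = reference_adjusted_step(torch.randn(1, 50000), torch.randn(1, 50000))
```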
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models [62.5501109475725]
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
This paper introduces Online Knowledge Distillation (OKD), where the teacher network integrates small online modules to concurrently train with the student model.
OKD achieves or exceeds the performance of leading methods in various model architectures and sizes, reducing training time by up to fourfold.
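A toy version of the online-module idea, with linear layers standing in for both networks: the teacher backbone is frozen, a small adapter on its logits trains jointly with the student, and the exact losses here are illustrative, not the paper's.

```python
# Frozen teacher + small trainable online module, co-trained with the student
# so the distilled distribution tracks what the student can absorb.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(32, 100)
for p in teacher.parameters():
    p.requires_grad_(False)           # teacher backbone stays frozen
online = nn.Linear(100, 100)          # small online module on teacher logits
student = nn.Linear(32, 100)
opt = torch.optim.Adam(list(online.parameters()) + list(student.parameters()), lr=1e-3)

x = torch.randn(64, 32)
y = torch.randint(0, 100, (64,))      # toy labels

for _ in range(100):
    t_logits = online(teacher(x))     # adapted teacher distribution
    s_logits = student(x)
    kd = F.kl_div(F.log_softmax(s_logits, -1), F.softmax(t_logits, -1),
                  reduction="batchmean")
    loss = F.cross_entropy(s_logits, y) + kd
    opt.zero_grad(); loss.backward(); opt.step()
```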
arXiv Detail & Related papers (2024-09-19T07:05:26Z)
- GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment [74.40196814292426]
We introduce a novel and intuitive Guidance-based Knowledge Transfer (GKT) framework.
GKT uses a larger Large Language Model as a "teacher" to create guidance prompts, paired with a smaller "student" model to finalize responses.
It achieves a maximum accuracy improvement of 14.18% with a 10.72x speed-up on GSM8K, and an accuracy improvement of 14.00% with a 7.73x speed-up on CSQA.
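The mechanism compresses to a few lines: the large model drafts the opening tokens as guidance and the small model completes the answer. The generation callables below are stubs for whatever serving API is used.

```python
# Minimal sketch of guidance-based generation: the teacher drafts a short
# prefix, and the cheaper student finishes the response from that prefix.
def gkt_generate(question, teacher_generate, student_generate, guidance_tokens=32):
    guidance = teacher_generate(question, max_new_tokens=guidance_tokens)
    completion = student_generate(question + guidance)
    return guidance + completion
```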
arXiv Detail & Related papers (2024-05-30T02:37:35Z)
- Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
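In spirit (the details here are assumptions, not the paper's exact design), the transfer replaces weights with lookups: knowledge extracted from the LLM is stored offline, and the small model retrieves the most relevant entries at inference.

```python
# Toy retrieval over an offline store of LLM-generated knowledge entries.
# Embeddings are random stand-ins for a real encoder.
import numpy as np

store = {
    "GSM8K answers end with a number.": np.random.rand(64),
    "Chain-of-thought helps arithmetic.": np.random.rand(64),
}  # text -> embedding, built offline from teacher-LLM outputs

def retrieve(query_vec, k=1):
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sorted(store, key=lambda t: cos(query_vec, store[t]), reverse=True)[:k]

print(retrieve(np.random.rand(64)))
```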
arXiv Detail & Related papers (2023-10-24T07:58:20Z)
- Evaluating the Robustness to Instructions of Large Language Models [6.947956990248856]
Fine-tuning Large Language Models (LLMs) can boost their zero-shot capabilities on novel tasks.
We evaluate six models, including Alpaca, Vicuna, WizardLM, and traditional task-oriented models (Flan-T5-XL/XXL, T0++).
We find that, across scales, FLAN-T5 models are less robust to RE instructions than to QA instructions.
arXiv Detail & Related papers (2023-08-28T04:57:07Z)
- Lion: Adversarial Distillation of Proprietary Large Language Models [16.245052771463044]
We propose a novel adversarial distillation framework for more efficient knowledge transfer.
We successfully transfer knowledge from ChatGPT to a student model (named Lion) using a mere 70k training examples.
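The adversarial loop can be summarized in pseudocode-style Python, with every callable a stub: the teacher answers a batch of instructions, the student imitates, and the loop then hunts for instructions the student still handles poorly, mutating them into harder ones for the next round. Names and the 0.5 threshold are illustrative.

```python
# One adversarial distillation round in the spirit of Lion.
# teacher/student/referee/mutate/finetune are user-supplied stubs.
def adversarial_round(instructions, teacher, student, referee, mutate, finetune):
    pairs = [(inst, teacher(inst)) for inst in instructions]   # imitation data
    finetune(student, pairs)
    hard = [inst for inst in instructions
            if referee(student(inst), teacher(inst)) < 0.5]    # student still weak
    return [mutate(inst) for inst in hard]                     # harder next round
```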
arXiv Detail & Related papers (2023-05-22T09:49:16Z)
- LiST: Lite Self-training Makes Efficient Few-shot Learners [91.28065455714018]
LiST improves by 35% over classic fine-tuning methods and 6% over prompt-tuning, with a 96% reduction in the number of trainable parameters, when fine-tuned with no more than 30 labeled examples from each target domain.
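A condensed sketch of the self-training recipe, under the assumption that only a small prompt/adapter module is updated each round; the confidence threshold and helper names are illustrative.

```python
# One self-training round: tune the small prompt module on labeled data,
# pseudo-label the unlabeled pool, and keep only confident predictions.
def self_training_round(model, train_adapters, labeled, unlabeled, threshold=0.9):
    train_adapters(model, labeled)             # stub: updates ~4% of parameters
    pseudo = []
    for x in unlabeled:
        label, confidence = model.predict(x)   # stub returning (label, prob)
        if confidence >= threshold:
            pseudo.append((x, label))
    return labeled + pseudo                    # grown training set for next round
```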
arXiv Detail & Related papers (2021-10-12T18:47:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.