Dynamic Corrective Self-Distillation for Better Fine-Tuning of Pretrained Models
- URL: http://arxiv.org/abs/2312.07028v1
- Date: Tue, 12 Dec 2023 07:26:36 GMT
- Title: Dynamic Corrective Self-Distillation for Better Fine-Tuning of Pretrained Models
- Authors: Ibtihel Amara, Vinija Jain, and Aman Chadha
- Abstract summary: We tackle the issue of aggressive fine-tuning encountered during the process of transfer learning of pre-trained language models (PLMs).
Inspired by the adaptive boosting method in traditional machine learning, we present an effective dynamic corrective self-distillation approach to improve the fine-tuning of the PLMs.
Our technique involves performing a self-distillation mechanism where, at each iteration, the student model actively adapts and corrects itself by dynamically adjusting the weights assigned to individual data points.
- Score: 0.9217021281095907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We tackle the challenging issue of aggressive fine-tuning encountered during
the process of transfer learning of pre-trained language models (PLMs) with
limited labeled downstream data. This problem primarily results in a decline in
performance on the subsequent task. Inspired by the adaptive boosting method in
traditional machine learning, we present an effective dynamic corrective
self-distillation (DCS) approach to improve the fine-tuning of the PLMs. Our
technique involves performing a self-distillation mechanism where, at each
iteration, the student model actively adapts and corrects itself by dynamically
adjusting the weights assigned to individual data points. This iterative
self-correcting process significantly enhances the overall fine-tuning
capability of PLMs, leading to improved performance and robustness. We
conducted comprehensive evaluations using the GLUE benchmark, demonstrating the
efficacy of our method in enhancing the fine-tuning process for various PLMs
across diverse downstream tasks.
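
A minimal sketch of what one such corrective fine-tuning step could look like in PyTorch. The exponential, AdaBoost-style reweighting rule, the CE/KD mixing coefficient, and all names below (e.g. dcs_step) are our illustrative reading of the abstract, not the paper's exact algorithm; in self-distillation the teacher is typically a frozen snapshot of the student from an earlier iteration.

```python
import torch
import torch.nn.functional as F

def dcs_step(student, teacher, batch, sample_weights, optimizer,
             temperature=2.0, alpha=0.5, reweight_lr=0.5):
    """One weighted fine-tuning step (CE + self-distillation), followed by
    a boosting-style update of the per-sample weights. In practice the
    weights would be tracked per training example across epochs; a single
    batch is shown for brevity."""
    inputs, labels = batch
    student_logits = student(inputs)
    with torch.no_grad():
        teacher_logits = teacher(inputs)  # frozen earlier snapshot of the student

    # Per-sample supervised loss and distillation loss.
    ce = F.cross_entropy(student_logits, labels, reduction="none")
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="none",
    ).sum(-1) * temperature ** 2

    # Weight each example by its current importance before averaging.
    loss = (sample_weights * (alpha * ce + (1 - alpha) * kd)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # "Corrective" reweighting: up-weight examples the student still gets
    # wrong, down-weight the ones it already handles, then renormalize so
    # the average weight stays at 1 (illustrative assumption).
    with torch.no_grad():
        wrong = (student_logits.argmax(-1) != labels).float()
        new_weights = sample_weights * torch.exp(reweight_lr * (2 * wrong - 1))
        new_weights = new_weights * len(new_weights) / new_weights.sum()
    return loss.item(), new_weights
```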
Related papers
- Self-Consistent Model-based Adaptation for Visual Reinforcement Learning [27.701421196547674]
Visual reinforcement learning agents face serious performance declines in real-world applications caused by visual distractions.
Existing methods rely on fine-tuning the policy's representations with hand-crafted augmentations.
We propose Self-Consistent Model-based Adaptation (SCMA), a novel method that fosters robust adaptation without modifying the policy.
arXiv Detail & Related papers (2025-02-14T05:23:56Z)
- Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining [55.262510814326035]
Existing reweighting strategies primarily focus on group-level data importance.
We introduce novel algorithms for dynamic, instance-level data reweighting.
Our framework allows us to devise reweighting strategies deprioritizing redundant or uninformative data.
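As a hedged illustration of what such an instance-level rule might look like (the Gaussian-bump weighting below is our own illustrative choice, not one of the paper's algorithms):

```python
import torch

def instance_weights(per_sample_loss: torch.Tensor, width: float = 1.0):
    """Deprioritize samples whose loss is very low (likely redundant) or
    very high (likely noisy) by centering a Gaussian bump on the batch's
    mean loss, so typical examples get the largest weight."""
    centered = per_sample_loss - per_sample_loss.mean()
    weights = torch.exp(-(centered / width) ** 2)
    return weights * len(weights) / weights.sum()  # mean weight = 1
```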
arXiv Detail & Related papers (2025-02-10T17:57:15Z)
- AdaWM: Adaptive World Model based Planning for Autonomous Driving [34.57859869929471]
World model based reinforcement learning (RL) has emerged as a promising approach for autonomous driving.
The pretrain-finetune paradigm is often used, where online RL is performed with a pretrained world model and a policy learned offline.
We introduce AdaWM, an Adaptive World Model based planning method, featuring two key steps: (a) mismatch identification, which quantifies the mismatches and informs the finetuning strategy, and (b) alignment-driven finetuning, which selectively updates either the policy or the model as needed.
arXiv Detail & Related papers (2025-01-22T18:34:51Z)
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
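A hedged sketch of the normalize-then-combine idea suggested by the name; the paper's exact NGDiff update and its adaptive learning rate are not reproduced here, and the two-objective setup below is our assumption:

```python
import torch

def ngdiff_direction(grad_retain: torch.Tensor, grad_forget: torch.Tensor,
                     eps: float = 1e-12) -> torch.Tensor:
    """Combine flattened gradients of a retain objective (to minimize) and
    a forget objective (to maximize): normalize each so neither dominates,
    then take their difference as the update direction."""
    g_retain = grad_retain / (grad_retain.norm() + eps)
    g_forget = grad_forget / (grad_forget.norm() + eps)
    return g_retain - g_forget  # descend on retain, ascend on forget
```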
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- Reward-free World Models for Online Imitation Learning [25.304836126280424]
We propose a novel approach to online imitation learning that leverages reward-free world models.
Our method learns environmental dynamics entirely in latent spaces without reconstruction, enabling efficient and accurate modeling.
We evaluate our method on a diverse set of benchmarks, including DMControl, MyoSuite, and ManiSkill2, demonstrating superior empirical performance compared to existing approaches.
arXiv Detail & Related papers (2024-10-17T23:13:32Z)
- Training Language Models to Self-Correct via Reinforcement Learning [98.35197671595343]
Self-correction has been found to be largely ineffective in modern large language models (LLMs).
We develop a multi-turn online reinforcement learning approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data.
We find that SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% respectively on MATH and HumanEval.
arXiv Detail & Related papers (2024-09-19T17:16:21Z)
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability of the zero-shot generalization of VLMs; the method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in few-shot image classification scenarios.
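One standard way to realize orthogonal fine-tuning is to rotate a frozen pretrained weight with a learned orthogonal matrix via the Cayley transform; whether OrthSR uses exactly this parametrization is our assumption, and the sketch only illustrates the general idea:

```python
import torch
import torch.nn as nn

class OrthogonalAdapter(nn.Module):
    """Fine-tune a frozen pretrained linear weight w0 (out x in) by learning
    a rotation R = (I + A)^-1 (I - A) with A skew-symmetric, so R is exactly
    orthogonal. At initialization A = 0, hence R = I and the pretrained
    behavior is preserved."""
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        self.register_buffer("w0", weight)  # frozen pretrained weight
        self.s = nn.Parameter(torch.zeros(weight.shape[0], weight.shape[0]))

    def forward(self, x):
        a = self.s - self.s.T                                  # skew-symmetric
        eye = torch.eye(a.shape[0], device=a.device, dtype=a.dtype)
        r = torch.linalg.solve(eye + a, eye - a)               # Cayley transform
        return x @ (r @ self.w0).T
```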
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Learning to Modulate pre-trained Models in RL [22.812215561012874]
Fine-tuning a pre-trained model often suffers from catastrophic forgetting.
Our study shows that with most fine-tuning approaches, the performance on pre-training tasks deteriorates significantly.
We propose a novel method, Learning-to-Modulate (L2M), that avoids the degradation of learned skills by modulating the information flow of the frozen pre-trained model.
arXiv Detail & Related papers (2023-06-26T17:53:05Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting [66.45372974713189]
We propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks.
Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark.
We provide open-source RecAdam, which integrates the proposed mechanisms into Adam to facilitate the NLP community.
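The "recall" part can be illustrated by a quadratic penalty that pulls the fine-tuned weights back toward their pretrained values; RecAdam folds an annealed version of this into the Adam update itself, whereas the standalone sketch below (with an illustrative coefficient gamma) only shows the penalty term:

```python
import torch

def recall_penalty(model, pretrained_state, gamma: float = 1e-3):
    """Quadratic pull-back toward the pretrained weights, to be added to the
    downstream task loss. pretrained_state is a snapshot taken before
    fine-tuning, e.g. {n: p.detach().clone() for n, p in model.named_parameters()}."""
    penalty = sum(((p - pretrained_state[n]) ** 2).sum()
                  for n, p in model.named_parameters())
    return gamma * penalty
```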
arXiv Detail & Related papers (2020-04-27T08:59:57Z)