Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts
- URL: http://arxiv.org/abs/2403.12918v1
- Date: Tue, 19 Mar 2024 17:21:29 GMT
- Title: Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts
- Authors: Sai Ashish Somayajula, Youwei Liang, Abhishek Singh, Li Zhang, Pengtao Xie
- Abstract summary: We propose a regularization method based on attention-guided weight mixup for finetuning PLMs.
Our approach represents each network weight as a mixup of task-specific weight and pretrained weight, controlled by a learnable attention parameter.
We employ a bi-level optimization framework on two separate splits of the training dataset, improving generalization and combating overfitting.
- Score: 23.94064492903792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained Language Models (PLMs) have significantly advanced Natural Language Processing (NLP) tasks, but finetuning PLMs on low-resource datasets poses significant challenges such as instability and overfitting. Previous methods tackle these issues by finetuning a strategically chosen subnetwork on a downstream task, while keeping the remaining weights fixed to the pretrained weights. However, they rely on suboptimal criteria for subnetwork selection, leading to suboptimal solutions. To address these limitations, we propose a regularization method based on attention-guided weight mixup for finetuning PLMs. Our approach represents each network weight as a mixup of a task-specific weight and the pretrained weight, controlled by a learnable attention parameter, providing finer control over subnetwork selection. Furthermore, we employ a bi-level optimization (BLO) based framework on two separate splits of the training dataset, improving generalization and combating overfitting. We validate the efficacy of our proposed method through extensive experiments, demonstrating its superiority over previous methods, particularly in the context of finetuning PLMs on low-resource datasets.
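To make the abstract's core idea concrete, here is a minimal, hypothetical PyTorch sketch of attention-guided weight mixup; the layer name `MixupLinear`, the sigmoid gating, and the per-entry granularity of the attention parameter are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative sketch only (see assumptions above), not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixupLinear(nn.Module):
    """Linear layer whose effective weight is a mixup of a task-specific weight
    and the frozen pretrained weight, gated by a learnable attention parameter."""

    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        # Frozen copy of the pretrained weight (kept fixed during finetuning).
        self.register_buffer("w_pre", pretrained.weight.detach().clone())
        # Task-specific weight, initialized from the pretrained weight and finetuned.
        self.w_task = nn.Parameter(pretrained.weight.detach().clone())
        # Learnable attention logits; sigmoid maps them into (0, 1). Per-entry
        # granularity is an assumption; coarser parameterizations are plausible.
        self.alpha_logit = nn.Parameter(torch.zeros_like(pretrained.weight))
        self.bias = nn.Parameter(pretrained.bias.detach().clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.alpha_logit)
        w = alpha * self.w_task + (1.0 - alpha) * self.w_pre  # attention-guided mixup
        return F.linear(x, w, self.bias)
```

In the bi-level setup the abstract describes, the task-specific weights (`w_task`) would be updated on one split of the training data in the inner loop, while the attention parameters (`alpha_logit`) are updated on the other split in the outer loop, so the mixup coefficients are tuned for generalization rather than fit to the same data.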
Related papers
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
However, current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z) - TapWeight: Reweighting Pretraining Objectives for Task-Adaptive Pretraining [34.93043212352875]
TapWeight is a task-adaptive pretraining framework which automatically determines the optimal importance of each pretraining objective.
We applied TapWeight to both molecular property prediction and natural language understanding tasks, significantly surpassing baseline methods.
arXiv Detail & Related papers (2024-10-13T20:56:13Z) - Meta-TTT: A Meta-learning Minimax Framework For Test-Time Training [5.9631503543049895]
Test-time domain adaptation is a challenging task that aims to adapt a pre-trained model to limited, unlabeled target data during inference.
This paper introduces a meta-learning minimax framework for test-time training on batch normalization layers.
arXiv Detail & Related papers (2024-10-02T16:16:05Z) - Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability of the VLMs' zero-shot generalization; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the models in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z) - Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning [54.682106515794864]
Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected datasets.
This paper introduces Language Models for Motion Control (LaMo), a general framework based on Decision Transformers to use pre-trained Language Models (LMs) for offline RL.
Empirical results indicate LaMo achieves state-of-the-art performance in sparse-reward tasks.
arXiv Detail & Related papers (2023-10-31T16:24:17Z) - Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view.
arXiv Detail & Related papers (2022-08-05T01:23:54Z) - Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms, including L2-SP, Mixout and SMART (a hedged sketch of this noise-stability idea appears after the list below).
arXiv Detail & Related papers (2022-06-12T04:42:49Z) - CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models [59.49705076369856]
We introduce a novel framework to improve the fine-tuning phase of pre-trained language models (PLMs).
We retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to a task.
We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
arXiv Detail & Related papers (2021-02-07T09:27:26Z)
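As referenced in the LNSR entry above, the noise-stability idea lends itself to a short sketch. The following is a hedged illustration, assuming Gaussian noise is added to an intermediate representation and the change in the downstream output is penalized; the function name and signature are hypothetical, not the LNSR authors' API.

```python
# Hypothetical sketch of a noise-stability regularizer (not the LNSR authors' code).
import torch
import torch.nn.functional as F

def noise_stability_penalty(head, hidden: torch.Tensor, sigma: float = 0.01) -> torch.Tensor:
    """Penalize how much the remaining layers (`head`) change their output when an
    intermediate representation (`hidden`) is perturbed with Gaussian noise."""
    clean_out = head(hidden)
    noisy_out = head(hidden + sigma * torch.randn_like(hidden))
    return F.mse_loss(noisy_out, clean_out)
```

Such a penalty would typically be added to the task loss with a weighting coefficient during fine-tuning.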