Variational Information Bottleneck for Effective Low-Resource
Fine-Tuning
- URL: http://arxiv.org/abs/2106.05469v1
- Date: Thu, 10 Jun 2021 03:08:13 GMT
- Title: Variational Information Bottleneck for Effective Low-Resource
Fine-Tuning
- Authors: Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Abstract summary: We propose to use Variational Information Bottleneck (VIB) to suppress irrelevant features when fine-tuning on low-resource target tasks.
We show that our VIB model finds sentence representations that are more robust to biases in natural language inference datasets.
- Score: 40.66716433803935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While large-scale pretrained language models have obtained impressive results
when fine-tuned on a wide variety of tasks, they still often suffer from
overfitting in low-resource scenarios. Since such models are general-purpose
feature extractors, many of these features are inevitably irrelevant for a
given target task. We propose to use Variational Information Bottleneck (VIB)
to suppress irrelevant features when fine-tuning on low-resource target tasks,
and show that our method successfully reduces overfitting. Moreover, we show
that our VIB model finds sentence representations that are more robust to
biases in natural language inference datasets, and thereby obtains better
generalization to out-of-domain datasets. Evaluation on seven low-resource
datasets in different tasks shows that our method significantly improves
transfer learning in low-resource scenarios, surpassing prior work. Moreover,
it improves generalization on 13 out of 15 out-of-domain natural language
inference benchmarks. Our code is publicly available at
https://github.com/rabeehk/vibert.
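Below is a minimal sketch of the VIB idea described in the abstract, assuming a PyTorch setup: the encoder's sentence representation is compressed into a stochastic bottleneck z via the reparameterization trick, the classifier predicts from z, and a KL term to a standard Gaussian prior (weighted by beta) suppresses features not needed for the task. Module and parameter names (VIBHead, bottleneck_size, beta) are illustrative assumptions, not the authors' implementation; see the linked repository for that.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VIBHead(nn.Module):
    """Hypothetical VIB classification head on top of a sentence representation."""

    def __init__(self, hidden_size: int, bottleneck_size: int,
                 num_labels: int, beta: float = 1e-3):
        super().__init__()
        self.beta = beta
        # Parameters of the posterior q(z|x) = N(mu(x), diag(sigma(x)^2))
        self.mu = nn.Linear(hidden_size, bottleneck_size)
        self.logvar = nn.Linear(hidden_size, bottleneck_size)
        self.classifier = nn.Linear(bottleneck_size, num_labels)

    def forward(self, sentence_repr: torch.Tensor, labels: torch.Tensor):
        mu, logvar = self.mu(sentence_repr), self.logvar(sentence_repr)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        logits = self.classifier(z)
        # KL(q(z|x) || N(0, I)), averaged over the batch
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=-1).mean()
        loss = F.cross_entropy(logits, labels) + self.beta * kl
        return loss, logits


# Illustrative use on top of an encoder's pooled sentence output:
#   head = VIBHead(hidden_size=768, bottleneck_size=128, num_labels=3)
#   loss, logits = head(pooled_output, labels)
```
At inference time a common choice is to predict from the posterior mean mu instead of sampling; beta trades off compression against task accuracy.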
Related papers
- Controlled Randomness Improves the Performance of Transformer Models [4.678970068275123]
We introduce controlled randomness, i.e. noise, into the training process to improve the fine-tuning of language models.
We find that adding such noise can improve performance on our two downstream tasks: joint named entity recognition and relation extraction, and text summarization.
arXiv Detail & Related papers (2023-10-20T14:12:55Z)
- Combining Data Generation and Active Learning for Low-Resource Question Answering [23.755283239897132]
We propose a novel approach that combines data augmentation via question-answer generation with Active Learning to improve performance in low-resource settings.
Our findings show that this approach, which incorporates humans into the data generation process, boosts performance in the low-resource, domain-specific setting.
arXiv Detail & Related papers (2022-11-27T16:31:33Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model (a rough sketch of this idea appears at the end of this list).
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- A Little Pretraining Goes a Long Way: A Case Study on Dependency Parsing Task for Low-resource Morphologically Rich Languages [14.694800341598368]
We focus on dependency parsing for morphologically rich languages (MRLs) in a low-resource setting.
To address the challenges this setting poses, we propose simple auxiliary tasks for pretraining.
We perform experiments on 10 MRLs in low-resource settings to measure the efficacy of our proposed pretraining method.
arXiv Detail & Related papers (2021-02-12T14:26:58Z)
- Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning [30.5853328612593]
In this work, we explore fine-tuning methods for BERT, a pre-trained Transformer-based language model.
Our experimental results show that maximizing the approximate knowledge gain of the model improves performance.
We analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters.
arXiv Detail & Related papers (2020-12-04T08:34:39Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing [85.35582118010608]
Task-oriented semantic parsing is a critical component of virtual assistants.
Recent advances in deep learning have enabled several approaches to successfully parse more complex queries.
We propose a novel method that outperforms a supervised neural model at a 10-fold data reduction.
arXiv Detail & Related papers (2020-10-07T17:47:53Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing the challenges of building Named Entity Recognition (NER) models for low-resource languages.
We propose a complementary approach: building low-resource NER models using "non-speaker" (NS) annotations.
We show that using NS annotators produces results consistently on par with or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)
- Discrete Latent Variable Representations for Low-Resource Text Classification [47.936293924113855]
We consider approaches to learning discrete latent variable models for text.
We compare the performance of the learned representations as features for low-resource document and sentence classification.
An amortized variant of Hard EM performs particularly well in the lowest-resource regimes.
arXiv Detail & Related papers (2020-06-11T06:55:13Z)
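Returning to the Layerwise Noise Stability Regularization entry above, here is a rough, hypothetical sketch of the summarized idea: perturb a hidden representation with standard Gaussian noise and penalize how far the module's output drifts from the clean output. The layer interface, noise scale, and loss weighting are illustrative assumptions, not the paper's LNSR implementation.
```python
import torch
import torch.nn as nn


def noise_stability_penalty(layer: nn.Module, hidden: torch.Tensor,
                            noise_std: float = 1e-2) -> torch.Tensor:
    """Penalize the change in a layer's output when its input hidden
    representation is perturbed with standard Gaussian noise.

    `layer` is assumed to map a hidden tensor to a tensor; for modules that
    return tuples, wrap them to select the hidden-state output first.
    """
    clean_out = layer(hidden)
    noisy_out = layer(hidden + noise_std * torch.randn_like(hidden))
    return (clean_out - noisy_out).pow(2).mean()


# Illustrative use during fine-tuning (lam is a hypothetical weight):
#   loss = task_loss + lam * noise_stability_penalty(encoder_block, hidden_states)
```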