Entailment as Robust Self-Learner
- URL: http://arxiv.org/abs/2305.17197v1
- Date: Fri, 26 May 2023 18:41:23 GMT
- Title: Entailment as Robust Self-Learner
- Authors: Jiaxin Ge, Hongyin Luo, Yoon Kim, James Glass
- Abstract summary: We design a prompting strategy that formulates a number of different NLU tasks as contextual entailment.
We propose the Simple Pseudo-Label Editing (SimPLE) algorithm for better pseudo-labeling quality in self-training.
- Score: 14.86757876218415
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Entailment has been recognized as an important metric for evaluating natural language understanding (NLU) models, and recent studies have found that entailment pretraining benefits weakly supervised fine-tuning. In this work, we design a prompting strategy that formulates a number of different NLU tasks as contextual entailment. This approach improves the zero-shot adaptation of pretrained entailment models. Secondly, we notice that self-training entailment-based models with unlabeled data can significantly improve the adaptation performance on downstream tasks. To achieve more stable improvement, we propose the Simple Pseudo-Label Editing (SimPLE) algorithm for better pseudo-labeling quality in self-training. We also found that both pretrained entailment-based models and the self-trained models are robust against adversarial evaluation data. Experiments on binary and multi-class classification tasks show that SimPLE leads to more robust self-training results, indicating that the self-trained entailment models are more efficient and trustworthy than large language models on language understanding tasks.
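The abstract names two concrete techniques: casting NLU tasks as contextual entailment, and the SimPLE pseudo-label editing step for self-training. The two sketches below are minimal illustrations of those ideas, not the authors' released code; the NLI checkpoint, hypothesis templates, and thresholds are assumptions.

```python
# Minimal sketch: casting a classification task as contextual entailment with an
# off-the-shelf NLI model. The checkpoint name and hypothesis wordings are
# illustrative assumptions, not the paper's exact prompts.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"  # any pretrained entailment/NLI checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

# Hypothetical verbalizations of a task's labels as entailment hypotheses.
HYPOTHESES = {
    "positive": "This review expresses a positive opinion.",
    "negative": "This review expresses a negative opinion.",
}

def classify_as_entailment(premise: str) -> str:
    """Return the label whose hypothesis the premise most strongly entails."""
    scores = {}
    for label, hypothesis in HYPOTHESES.items():
        inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits[0]
        # roberta-large-mnli label order: 0=contradiction, 1=neutral, 2=entailment
        scores[label] = torch.softmax(logits, dim=-1)[2].item()
    return max(scores, key=scores.get)

print(classify_as_entailment("The movie was a complete waste of time."))
```

For the self-training side, the paper's exact SimPLE procedure is not reproduced here; the generic stand-in below keeps a pseudo-label only when several dropout-perturbed predictions agree and assigns the majority vote as the edited label.

```python
# Generic pseudo-label editing sketch (a stand-in, not the paper's SimPLE algorithm):
# run several stochastic forward passes with dropout enabled, keep only examples
# whose votes are stable, and use the majority vote as the pseudo-label.
import torch

def edit_pseudo_labels(model, tokenizer, texts, hypotheses, n_samples=5, agree=0.8):
    model.train()  # keep dropout active so repeated passes differ
    kept = []
    for text in texts:
        votes = []
        for _ in range(n_samples):
            scores = {}
            for label, hyp in hypotheses.items():
                inputs = tokenizer(text, hyp, return_tensors="pt", truncation=True)
                with torch.no_grad():
                    # index 2 = entailment probability, as in the sketch above
                    scores[label] = torch.softmax(model(**inputs).logits[0], -1)[2].item()
            votes.append(max(scores, key=scores.get))
        top = max(set(votes), key=votes.count)
        if votes.count(top) / n_samples >= agree:  # discard unstable examples
            kept.append((text, top))               # majority vote becomes the label
    model.eval()
    return kept  # (text, pseudo_label) pairs for the next self-training round
```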
Related papers
- Self-training Language Models for Arithmetic Reasoning [0.0]
We explore the potential of improving models' reasoning capabilities without new data.
We find that models can substantially improve in both single-round (offline) and online self-training.
arXiv Detail & Related papers (2024-07-11T11:06:05Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
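The UAL entry above describes adapting the label smoothing value per sample according to its uncertainty. A minimal sketch of that idea follows; the mapping from uncertainty to smoothing strength is an assumption, not the paper's schedule.

```python
# Sketch of uncertainty-adaptive label smoothing: each sample's smoothing epsilon
# grows with its uncertainty. The linear mapping and the 0.5 cap are assumptions.
import torch.nn.functional as F

def ual_loss(logits, targets, uncertainty, base_smooth=0.1):
    """Cross-entropy with per-sample label smoothing scaled by uncertainty.

    logits:      (batch, num_classes)
    targets:     (batch,) integer class ids
    uncertainty: (batch,) values in [0, 1], e.g. normalized predictive entropy
    """
    num_classes = logits.size(-1)
    smooth = (base_smooth * uncertainty).clamp(0.0, 0.5)  # per-sample epsilon
    one_hot = F.one_hot(targets, num_classes).float()
    soft_targets = one_hot * (1 - smooth.unsqueeze(1)) + smooth.unsqueeze(1) / num_classes
    log_probs = F.log_softmax(logits, dim=-1)
    return -(soft_targets * log_probs).sum(dim=-1).mean()
```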
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Uncertainty-aware Parameter-Efficient Self-training for Semi-supervised Language Understanding [38.11411155621616]
We study self-training as one of the predominant semi-supervised learning approaches.
We present UPET, a novel Uncertainty-aware self-Training framework.
We show that UPET achieves a substantial improvement in terms of performance and efficiency.
arXiv Detail & Related papers (2023-10-19T02:18:29Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new evaluation metric to evaluate and compare the effective robustness of models trained on different data.
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
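The entry above defines effective robustness as the OOD robustness left over after accounting for ID performance. One common way to operationalize this is to fit a baseline ID-to-OOD trend over reference models and report each model's residual; the linear fit on raw accuracies below is a simplifying assumption (published variants typically fit on transformed accuracies).

```python
# Sketch of an effective-robustness-style calculation: fit a baseline trend from
# in-distribution (ID) to out-of-distribution (OOD) accuracy over reference models,
# then report a candidate's OOD accuracy minus the trend's prediction.
import numpy as np

def effective_robustness(ref_id, ref_ood, model_id, model_ood):
    """ref_id/ref_ood: reference-model accuracies; model_*: the model under test."""
    slope, intercept = np.polyfit(ref_id, ref_ood, deg=1)  # baseline ID -> OOD trend
    predicted_ood = slope * model_id + intercept
    return model_ood - predicted_ood  # extra robustness beyond the trend

# Toy usage with made-up accuracies.
print(effective_robustness(
    ref_id=[0.70, 0.75, 0.80, 0.85], ref_ood=[0.50, 0.55, 0.61, 0.66],
    model_id=0.82, model_ood=0.70))
```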
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
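Self-distillation as a regularizer for further pre-training can be pictured as freezing a snapshot of the model and penalizing divergence from its predictions while training continues. The sketch below assumes a temperature-scaled KL penalty on output distributions; the paper's exact formulation may differ.

```python
# Sketch of self-distillation as a regularizer: a frozen snapshot of the model acts
# as teacher, and a temperature-scaled KL term keeps the student close to it while
# the usual training objective is optimized. Loss form and weighting are assumptions.
import copy
import torch.nn.functional as F

def make_teacher(model):
    """Freeze a snapshot of the current model to distill from."""
    teacher = copy.deepcopy(model).eval()
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

def self_distillation_loss(student_logits, teacher_logits, task_loss, alpha=0.5, tau=2.0):
    """Combine the task loss with a KL penalty toward the frozen teacher."""
    p_teacher = F.softmax(teacher_logits / tau, dim=-1)
    log_p_student = F.log_softmax(student_logits / tau, dim=-1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau * tau
    return task_loss + alpha * kl
```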
- Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
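Treating a prompted language model as labeling functions can be sketched as follows: each prompt becomes one noisy labeler that may abstain, and the votes are aggregated. The `query_llm` callable, the prompts, and the majority-vote aggregation are placeholders; a weak-supervision label model (e.g. Snorkel-style) would normally replace the simple vote.

```python
# Sketch: prompted language-model outputs as weak-supervision labeling functions.
# `query_llm` is a placeholder for whatever completion API or local model is used.
from collections import Counter
from typing import Callable, List, Optional

ABSTAIN = None

def make_labeling_function(prompt_template: str, query_llm: Callable[[str], str]):
    """Wrap one prompt as a labeling function that answers yes/no or abstains."""
    def lf(text: str) -> Optional[str]:
        answer = query_llm(prompt_template.format(text=text)).strip().lower()
        if answer.startswith("yes"):
            return "positive"
        if answer.startswith("no"):
            return "negative"
        return ABSTAIN  # unclear answers abstain
    return lf

def weak_label(text: str, lfs: List[Callable[[str], Optional[str]]]) -> Optional[str]:
    """Aggregate labeling-function votes with a simple majority (abstentions ignored)."""
    votes = [v for v in (lf(text) for lf in lfs) if v is not ABSTAIN]
    return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN
```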
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that addresses over-confidence on out-of-distribution inputs without auxiliary models or additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
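The NoiER entry describes regularizing fine-tuning with noise entropy so the model stays reliable on out-of-distribution inputs without auxiliary models or extra data. A generic reading of that idea is sketched below; how the noise inputs are generated and weighted is an assumption, and the paper's exact objective may differ.

```python
# Sketch of a noise-entropy-style regularizer: fit the labeled batch as usual while
# pushing predictions on synthetic noise inputs toward high entropy (uniform output).
# Noise generation and the weighting `lam` are assumptions, not the paper's recipe.
import torch.nn.functional as F

def noise_entropy_loss(model, inputs, targets, noise_inputs, lam=0.1):
    """Cross-entropy on real data plus an entropy bonus on noise inputs.

    Assumes `model(x)` returns classification logits.
    """
    ce = F.cross_entropy(model(inputs), targets)
    probs = F.softmax(model(noise_inputs), dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    return ce - lam * entropy  # maximizing entropy on noise = subtracting it
```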