Predicting Fine-Tuning Performance with Probing
- URL: http://arxiv.org/abs/2210.07352v1
- Date: Thu, 13 Oct 2022 20:58:14 GMT
- Title: Predicting Fine-Tuning Performance with Probing
- Authors: Zining Zhu, Soroosh Shahtalebi, Frank Rudzicz
- Abstract summary: This paper explores the utility of probing deep NLP models to extract a proxy signal widely used in model development.
We find that it is possible to use the accuracies of only three probing tests to predict the fine-tuning performance with errors $40\%$ - $80\%$ smaller than baselines.
- Score: 18.129450295108423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large NLP models have recently shown impressive performance in language
understanding tasks, typically evaluated by their fine-tuned performance.
Alternatively, probing has received increasing attention as a lightweight method
for interpreting the intrinsic mechanisms of large NLP models. In probing,
post-hoc classifiers are trained on "out-of-domain" datasets that diagnose
specific abilities. While probing language models has led to insightful
findings, those findings appear disjointed from the development of the models.
This paper explores the utility of probing deep NLP models to extract a proxy
signal widely used in model development -- the fine-tuning performance. We find
that it is possible to use the accuracies of only three probing tests to
predict the fine-tuning performance with errors $40\%$ - $80\%$ smaller than
baselines. We further discuss possible avenues where probing can empower the
development of deep NLP models.
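A minimal sketch of the premise, under stated assumptions: the paper only says that the accuracies of three probing tests suffice to predict fine-tuning performance, so the linear predictor, the probes, and all numbers below are illustrative placeholders rather than the authors' setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: each row is a candidate model, each column the accuracy of
# one of three probing tests (placeholders, not the probes used in the paper).
probe_acc = np.array([
    [0.72, 0.61, 0.55],
    [0.80, 0.66, 0.59],
    [0.85, 0.70, 0.64],
    [0.78, 0.68, 0.58],
])
# Measured fine-tuned performance of the same models on a downstream task.
finetune_perf = np.array([0.81, 0.86, 0.90, 0.84])

predictor = LinearRegression().fit(probe_acc, finetune_perf)

# Predict the fine-tuning performance of a new model from its probing
# accuracies alone, before running the much more expensive fine-tuning itself.
new_model = np.array([[0.83, 0.69, 0.62]])
print(predictor.predict(new_model))
```

The appeal is cost: the probes are lightweight post-hoc classifiers, whereas the quantity being predicted normally requires a full fine-tuning run.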
Related papers
- Large Language Models as Annotators: Enhancing Generalization of NLP
Models at Minimal Cost [6.662800021628275]
We study the use of large language models (LLMs) for annotating inputs and improving the generalization of NLP models.
We propose a sampling strategy based on the difference in prediction scores between the base model and the fine-tuned NLP model (a rough sketch of this idea follows below).
arXiv Detail & Related papers (2023-06-27T19:29:55Z)
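A hedged sketch of such a disagreement-based selection rule; the function name, the score format, and the top-k selection are assumptions for illustration, not details from the paper.

```python
import numpy as np

def select_for_llm_annotation(base_scores, finetuned_scores, budget):
    # Rank unlabeled inputs by how much the base model and the fine-tuned model
    # disagree, and send the most contested inputs to the LLM annotator.
    disagreement = np.abs(np.asarray(base_scores) - np.asarray(finetuned_scores))
    return np.argsort(-disagreement)[:budget]

# Example: prediction scores for five unlabeled inputs, annotation budget of two.
print(select_for_llm_annotation([0.9, 0.4, 0.7, 0.2, 0.6],
                                [0.8, 0.9, 0.6, 0.3, 0.1], budget=2))
```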
- SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations [17.972111965568384]
Fine-tuning pre-trained language models (PLMs) in conjunction with prompt-based learning has recently shown promising results.
We propose SparseFit, a few-shot fine-tuning strategy that leverages discrete prompts to jointly generate predictions and NLEs.
We find that fine-tuning only 6.8% of the model parameters leads to competitive results for both the task performance and the quality of the generated NLEs.
arXiv Detail & Related papers (2023-05-22T17:06:41Z)
- Prompt-Augmented Linear Probing: Scaling beyond the Limit of Few-shot In-Context Learners [25.262774179224945]
This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear probing and in-context learning (ICL).
PALP significantly enhances the input representations, closing the gap between ICL in the data-hungry scenario and fine-tuning in the data-abundant scenario with little training overhead.
arXiv Detail & Related papers (2022-12-21T09:37:05Z)
- A Kernel-Based View of Language Model Fine-Tuning [94.75146965041131]
We investigate whether the Neural Tangent Kernel (NTK) describes fine-tuning of pre-trained LMs.
We show that formulating the downstream task as a masked word prediction problem through prompting often induces kernel-based dynamics during fine-tuning (the standard NTK definition is recalled below).
arXiv Detail & Related papers (2022-10-11T17:34:32Z)
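For reference, the empirical neural tangent kernel of a network f(x; θ) evaluated at the pre-trained parameters θ₀; this is the textbook definition, with notation not taken from the paper.

```latex
% Empirical NTK around the pre-trained parameters \theta_0:
% an inner product of parameter gradients at two inputs.
K(x, x') = \nabla_\theta f(x; \theta_0)^\top \, \nabla_\theta f(x'; \theta_0)
```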
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model (a minimal sketch of such a penalty follows this entry).
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms, including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
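A minimal sketch of a noise-stability penalty in this spirit, assuming a PyTorch setting; `tail` stands in for the layers above the perturbed representation and is an assumption, not the authors' interface.

```python
import torch
import torch.nn as nn

def noise_stability_penalty(tail: nn.Module, hidden: torch.Tensor, sigma: float = 0.01):
    # Perturb an intermediate representation with standard Gaussian noise scaled
    # by sigma, then penalise how much the output of the remaining layers moves.
    clean = tail(hidden)
    noisy = tail(hidden + sigma * torch.randn_like(hidden))
    return ((clean - noisy) ** 2).mean()

# Toy example: the "tail" is a single linear layer over 8-dimensional states.
tail = nn.Linear(8, 8)
hidden = torch.randn(4, 8)
loss = noise_stability_penalty(tail, hidden)  # add, suitably weighted, to the task loss
print(loss.item())
```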
- Improving the Adversarial Robustness of NLP Models by Information Bottleneck [112.44039792098579]
Non-robust features can be easily manipulated by adversaries to fool NLP models.
In this study, we explore the feasibility of capturing task-specific robust features, while eliminating the non-robust ones by using the information bottleneck theory.
We show that the models trained with our information bottleneck-based method are able to achieve a significant improvement in robust accuracy (the generic IB objective is recalled below).
arXiv Detail & Related papers (2022-06-11T12:12:20Z)
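For reference, the generic information bottleneck objective in its textbook form; the paper's exact loss and variational approximation are not reproduced here.

```latex
% Learn a stochastic representation Z of the input X that stays predictive of
% the label Y while compressing away the rest (beta controls the trade-off):
\max_{p(z \mid x)} \; I(Z; Y) \;-\; \beta \, I(X; Z)
```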
- Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x inference speed-up while retaining comparable performance (the kNN-LM interpolation being accelerated is recalled below).
arXiv Detail & Related papers (2021-09-09T12:32:28Z)
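For context, the standard kNN-LM next-token interpolation (Khandelwal et al., 2020) that this efficiency work targets; the speed-up techniques themselves are not shown here.

```latex
% lambda interpolates between the datastore-based kNN distribution and the
% parametric language-model distribution over the next token y given context x:
p(y \mid x) = \lambda \, p_{\mathrm{kNN}}(y \mid x) + (1 - \lambda) \, p_{\mathrm{LM}}(y \mid x)
```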
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- Evaluating the Robustness of Neural Language Models to Input Perturbations [7.064032374579076]
In this study, we design and implement various types of character-level and word-level perturbation methods to simulate noisy input texts.
We investigate the ability of high-performance language models such as BERT, XLNet, RoBERTa, and ELMo to handle different types of input perturbations.
The results suggest that language models are sensitive to input perturbations and that their performance can decrease even when small changes are introduced (a toy character-level perturbation of this kind is sketched below).
arXiv Detail & Related papers (2021-08-27T12:31:17Z)
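A toy illustration of character-level noise of the kind such studies apply; the exact perturbation operators, rates, and sampling scheme in the paper may differ.

```python
import random

def perturb_chars(text: str, p: float = 0.05, seed: int = 0) -> str:
    # Randomly swap adjacent characters or delete a character, each with
    # probability roughly p per position (illustrative only).
    rng = random.Random(seed)
    chars, out, i = list(text), [], 0
    while i < len(chars):
        r = rng.random()
        if r < p and i + 1 < len(chars):      # swap with the next character
            out.extend([chars[i + 1], chars[i]])
            i += 2
        elif r < 2 * p:                       # delete this character
            i += 1
        else:                                 # keep the character unchanged
            out.append(chars[i])
            i += 1
    return "".join(out)

print(perturb_chars("language models are sensitive to noise", p=0.1))
```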
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the fine-tuned model more robust to adversarial inputs (a minimal sketch of the representation-based lookup follows this entry).
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
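A minimal sketch of retrieving the training examples whose hidden representations are closest to a test example's; cosine similarity and the retrieval details below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def nearest_training_examples(test_vec: np.ndarray, train_vecs: np.ndarray, k: int = 5):
    # Rank training examples by cosine similarity between their representations
    # and the test example's representation; the closest ones are candidate
    # explanations for the model's prediction on that example.
    train = train_vecs / np.linalg.norm(train_vecs, axis=1, keepdims=True)
    test = test_vec / np.linalg.norm(test_vec)
    sims = train @ test
    return np.argsort(-sims)[:k]

# Example with random 16-dimensional representations for 100 training examples.
rng = np.random.default_rng(0)
print(nearest_training_examples(rng.normal(size=16), rng.normal(size=(100, 16)), k=3))
```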
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated list (including all information) and is not responsible for any consequences of its use.