On the Relation between Sensitivity and Accuracy in In-context Learning
- URL: http://arxiv.org/abs/2209.07661v3
- Date: Sat, 27 Jan 2024 08:07:34 GMT
- Title: On the Relation between Sensitivity and Accuracy in In-context Learning
- Authors: Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
- Abstract summary: In-context learning (ICL) suffers from oversensitivity to the prompt, making it unreliable in real-world scenarios.
We study the sensitivity of ICL with respect to multiple perturbation types.
We propose \textsc{SenSel}, a few-shot selective prediction method that abstains from sensitive predictions.
- Score: 41.27837171531926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-context learning (ICL) suffers from oversensitivity to the prompt, making
it unreliable in real-world scenarios. We study the sensitivity of ICL with
respect to multiple perturbation types. First, we find that label bias obscures
the true sensitivity, and therefore prior work may have significantly
underestimated ICL sensitivity. Second, we observe a strong negative
correlation between ICL sensitivity and accuracy: predictions sensitive to
perturbations are less likely to be correct. Motivated by these findings, we
propose \textsc{SenSel}, a few-shot selective prediction method that abstains
from sensitive predictions. Experiments on ten classification datasets show
that \textsc{SenSel} consistently outperforms two commonly used
confidence-based and entropy-based baselines on abstention decisions.
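The abstention idea in the abstract can be sketched as follows. This is a minimal illustration of sensitivity-based selective prediction, not the paper's exact procedure: `model_predict`, `perturb`, the number of perturbations `n`, and the abstention `threshold` are all hypothetical placeholders.

```python
def sensitivity(model_predict, prompt, perturb, n=8):
    """Fraction of perturbed prompts whose prediction differs from the original."""
    base = model_predict(prompt)
    flips = sum(model_predict(perturb(prompt)) != base for _ in range(n))
    return flips / n

def selective_predict(model_predict, prompt, perturb, threshold=0.25):
    """Abstain (return None) when the prediction is too sensitive to perturbations."""
    if sensitivity(model_predict, prompt, perturb) > threshold:
        return None  # abstain rather than risk a likely-incorrect prediction
    return model_predict(prompt)
```

The design choice mirrors the paper's finding: since sensitive predictions are less likely to be correct, a sensitivity score can replace model confidence or entropy as the abstention signal.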
Related papers
- ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs [72.13489820420726] (2024-10-16)
  ProSA is a framework designed to evaluate and understand prompt sensitivity in large language models.
  The study finds that prompt sensitivity fluctuates across datasets and models, with larger models exhibiting greater robustness.
- A Neural Framework for Generalized Causal Sensitivity Analysis [78.71545648682705] (2023-11-27)
  The authors propose NeuralCSA, a neural framework for causal sensitivity analysis, with theoretical guarantees that it can infer valid bounds on the causal query of interest.
- How are Prompts Different in Terms of Sensitivity? [50.67313477651395] (2023-11-13)
  A comprehensive prompt analysis based on the sensitivity of a function.
  Gradient-based saliency scores empirically demonstrate how different prompts affect the relevance of input tokens to the output.
  The paper introduces sensitivity-aware decoding, which incorporates a sensitivity estimate as a penalty term in standard greedy decoding.
- The Memory Perturbation Equation: Understanding Model's Sensitivity to Data [16.98312108418346] (2023-10-30)
  The Memory-Perturbation Equation (MPE) relates a model's sensitivity to perturbations of its training data.
  Empirical results show that sensitivity estimates obtained during training can faithfully predict generalization on unseen test data.
- Sharp Bounds for Generalized Causal Sensitivity Analysis [30.77874108094485] (2023-05-26)
  A unified framework for causal sensitivity analysis under unobserved confounding, covering (conditional) average treatment effects, effects for mediation and path analysis, and distributional effects.
  The bounds for (conditional) average treatment effects coincide with recent optimality results for causal sensitivity analysis.
- Language Model Classifier Aligns Better with Physician Word Sensitivity than XGBoost on Readmission Prediction [86.15787587540132] (2022-11-13)
  Introduces a sensitivity score, a metric that scrutinizes model behavior at the vocabulary level.
  Experiments compare the decision-making logic of clinicians and classifiers based on rank correlations of sensitivity scores.
- Balancing Robustness and Sensitivity using Feature Contrastive Learning [95.86909855412601] (2021-05-19)
  Methods that promote robustness can hurt the model's sensitivity to rare or underrepresented patterns.
  Feature Contrastive Learning (FCL) encourages a model to be more sensitive to features with higher contextual utility.
- Sensitivity as a Complexity Measure for Sequence Classification Tasks [24.246784593571626] (2021-04-21)
  Argues that standard sequence classification methods are biased towards learning low-sensitivity functions, so that tasks requiring high sensitivity are more difficult.
  Sensitivity estimates on 15 NLP tasks show that sensitivity is higher on challenging GLUE tasks than on simple text classification tasks.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.