Sensitivity as a Complexity Measure for Sequence Classification Tasks
- URL: http://arxiv.org/abs/2104.10343v1
- Date: Wed, 21 Apr 2021 03:56:59 GMT
- Title: Sensitivity as a Complexity Measure for Sequence Classification Tasks
- Authors: Michael Hahn, Dan Jurafsky, Richard Futrell
- Abstract summary: We argue that standard sequence classification methods are biased towards learning low-sensitivity functions, so that tasks requiring high sensitivity are more difficult.
We estimate sensitivity on 15 NLP tasks, finding that sensitivity is higher on challenging tasks collected in GLUE than on simple text classification tasks.
- Score: 24.246784593571626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a theoretical framework for understanding and predicting the
complexity of sequence classification tasks, using a novel extension of the
theory of Boolean function sensitivity. The sensitivity of a function, given a
distribution over input sequences, quantifies the number of disjoint subsets of
the input sequence that can each be individually changed to change the output.
We argue that standard sequence classification methods are biased towards
learning low-sensitivity functions, so that tasks requiring high sensitivity
are more difficult. To that end, we show analytically that simple lexical
classifiers can only express functions of bounded sensitivity, and we show
empirically that low-sensitivity functions are easier to learn for LSTMs. We
then estimate sensitivity on 15 NLP tasks, finding that sensitivity is higher
on challenging tasks collected in GLUE than on simple text classification
tasks, and that sensitivity predicts the performance both of simple lexical
classifiers and of vanilla BiLSTMs without pretrained contextualized
embeddings. Within a task, sensitivity predicts which inputs are hard for such
simple models. Our results suggest that the success of massively pretrained
contextual representations stems in part because they provide representations
from which information can be extracted by low-sensitivity decoders.
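As a rough, non-authoritative illustration of the definition above (not the authors' estimator), the sketch below treats every token position as its own disjoint block and counts how many positions can individually be substituted so that a classifier's prediction flips. `classify_fn`, `substitute_fn`, the vocabulary, and the toy keyword classifier are all hypothetical placeholders.

```python
import random
from typing import Callable, List, Sequence

def block_sensitivity_estimate(
    tokens: Sequence[str],
    classify_fn: Callable[[List[str]], int],
    substitute_fn: Callable[[str], str],
    n_samples: int = 20,
) -> int:
    """Crude lower bound on block sensitivity at `tokens`: treat each position
    as its own disjoint block and count how many positions can individually be
    substituted so that the predicted label changes."""
    base_label = classify_fn(list(tokens))
    flipping_blocks = 0
    for i in range(len(tokens)):
        for _ in range(n_samples):
            perturbed = list(tokens)
            perturbed[i] = substitute_fn(tokens[i])  # sample a replacement for this block
            if classify_fn(perturbed) != base_label:
                flipping_blocks += 1
                break  # this block can change the output; move to the next block
    return flipping_blocks

# Toy usage: a one-keyword "lexical" classifier is low-sensitivity, because only
# the position holding the keyword can ever change its output.
vocab = ["good", "bad", "movie", "plot", "acting", "great"]
keyword_clf = lambda toks: int("not" in toks)
random_sub = lambda tok: random.choice(vocab)

print(block_sensitivity_estimate("this movie was not good".split(), keyword_clf, random_sub))
# prints 1: only the position of "not" can flip the prediction
```

The toy keyword classifier can never score above 1 here, loosely mirroring the abstract's claim that simple lexical classifiers can only express functions of bounded sensitivity.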
Related papers
- ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs [72.13489820420726]
ProSA is a framework designed to evaluate and comprehend prompt sensitivity in large language models.
Our study uncovers that prompt sensitivity fluctuates across datasets and models, with larger models exhibiting enhanced robustness.
arXiv Detail & Related papers (2024-10-16T09:38:13Z)
- POSIX: A Prompt Sensitivity Index For Large Language Models [22.288479270814484]
Large Language Models (LLMs) are surprisingly sensitive to minor variations in prompts.
POSIX is a novel PrOmpt Sensitivity IndeX as a reliable measure of prompt sensitivity.
arXiv Detail & Related papers (2024-10-03T04:01:14Z)
- How are Prompts Different in Terms of Sensitivity? [50.67313477651395]
We present a comprehensive prompt analysis based on the sensitivity of a function.
We use gradient-based saliency scores to empirically demonstrate how different prompts affect the relevance of input tokens to the output.
We introduce sensitivity-aware decoding, which incorporates sensitivity estimation as a penalty term in standard greedy decoding (see the sketch after this list).
arXiv Detail & Related papers (2023-11-13T10:52:01Z)
- RaSa: Relation and Sensitivity Aware Representation Learning for Text-based Person Search [51.09723403468361]
We propose a Relation and Sensitivity aware representation learning method (RaSa).
RaSa includes two novel tasks: Relation-Aware learning (RA) and Sensitivity-Aware learning (SA).
Experiments demonstrate that RaSa outperforms existing state-of-the-art methods by 6.94%, 4.45% and 15.35% in Rank@1 on the respective benchmark datasets.
arXiv Detail & Related papers (2023-05-23T03:53:57Z)
- On the Relation between Sensitivity and Accuracy in In-context Learning [41.27837171531926]
In-context learning (ICL) suffers from oversensitivity to the prompt, making it unreliable in real-world scenarios.
We study the sensitivity of ICL with respect to multiple perturbation types.
We propose SenSel, a few-shot selective prediction method that abstains from sensitive predictions.
arXiv Detail & Related papers (2022-09-16T00:52:34Z)
- Learning Disentangled Textual Representations via Statistical Measures of Similarity [35.74568888409149]
We introduce a family of regularizers for learning disentangled representations that do not require additional training.
Compared with existing approaches, these regularizers are faster and do not involve additional tuning.
arXiv Detail & Related papers (2022-05-07T08:06:22Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
- Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism exploiting the statistical connections between certain reconstruction-based pretext tasks that guarantees learning a good representation.
We prove that a linear layer yields a small approximation error even for complex ground-truth function classes.
arXiv Detail & Related papers (2020-08-03T17:56:13Z)
- SAM: The Sensitivity of Attribution Methods to Hyperparameters [13.145335512841557]
We argue that a key desideratum of an explanation method is its robustness to input hyperparameters which are often randomly set or empirically tuned.
In this paper, we provide a thorough empirical study on the sensitivity of existing attribution methods.
arXiv Detail & Related papers (2020-03-04T22:09:22Z)
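The entry "How are Prompts Different in Terms of Sensitivity?" above mentions sensitivity-aware decoding, which adds a sensitivity estimate as a penalty term to greedy decoding. The following is only a loose sketch of that general idea under assumed interfaces, not that paper's actual algorithm: `next_token_logprobs`, `sensitivity_fn`, `penalty_weight`, and `eos` are hypothetical placeholders.

```python
from typing import Callable, Dict, List

def sensitivity_penalized_greedy_decode(
    prompt: List[str],
    next_token_logprobs: Callable[[List[str]], Dict[str, float]],  # token -> log p(token | context)
    sensitivity_fn: Callable[[List[str], str], float],             # estimated sensitivity of appending a token
    penalty_weight: float = 1.0,
    max_steps: int = 20,
    eos: str = "</s>",
) -> List[str]:
    """Greedy decoding where each candidate token is scored by its
    log-probability minus a weighted sensitivity estimate."""
    output: List[str] = []
    for _ in range(max_steps):
        context = prompt + output
        candidates = next_token_logprobs(context)
        if not candidates:
            break
        # Standard greedy argmax, with the sensitivity estimate as a penalty term.
        best = max(
            candidates,
            key=lambda tok: candidates[tok] - penalty_weight * sensitivity_fn(context, tok),
        )
        output.append(best)
        if best == eos:
            break
    return output
```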
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.