Semantic Sensitivities and Inconsistent Predictions: Measuring the
Fragility of NLI Models
- URL: http://arxiv.org/abs/2401.14440v2
- Date: Wed, 31 Jan 2024 10:52:52 GMT
- Title: Semantic Sensitivities and Inconsistent Predictions: Measuring the
Fragility of NLI Models
- Authors: Erik Arakelyan, Zhaoqi Liu, Isabelle Augenstein
- Abstract summary: State-of-the-art Natural Language Inference (NLI) models are sensitive to minor semantics-preserving surface-form variations.
We show that semantic sensitivity causes performance degradations of 12.92% and 23.71% on average over in- and out-of-domain settings, respectively.
- Score: 44.56781176879151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies of the emergent capabilities of transformer-based Natural
Language Understanding (NLU) models have indicated that they have an
understanding of lexical and compositional semantics. We provide evidence that
suggests these claims should be taken with a grain of salt: we find that
state-of-the-art Natural Language Inference (NLI) models are sensitive to
minor semantics-preserving surface-form variations, which lead to sizable
inconsistent model decisions during inference. Notably, this behaviour differs
from valid and in-depth comprehension of compositional semantics, yet it
emerges neither when evaluating model accuracy on standard benchmarks nor when
probing for syntactic, monotonic, and logically robust reasoning. We propose a
novel framework to measure the extent of semantic sensitivity. To this end, we
evaluate NLI models on adversarially generated examples containing minor
semantics-preserving surface-form input noise. This is achieved using
conditional text generation, with the explicit condition that the NLI model
predicts the relationship between the original and adversarial inputs as a
symmetric equivalence entailment. We systematically study the effects of the
phenomenon across NLI models for $\textbf{in-}$ and $\textbf{out-of-}$ domain
settings. Our experiments show that semantic sensitivity causes performance
degradations of $12.92\%$ and $23.71\%$ on average over $\textbf{in-}$ and
$\textbf{out-of-}$ domain settings, respectively. We further perform ablation
studies, analysing this phenomenon across models, datasets, and variations in
inference and show that semantic sensitivity can lead to major inconsistency
within model predictions.
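A minimal sketch of the evaluation loop described above, assuming an off-the-shelf MNLI checkpoint (`roberta-large-mnli`) and taking the adversarially generated variant as given rather than producing it with the paper's conditional text generator:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed checkpoint; its labels are CONTRADICTION / NEUTRAL / ENTAILMENT.
MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME).eval()

def nli_label(premise: str, hypothesis: str) -> str:
    """Return the model's predicted relation for a (premise, hypothesis) pair."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[int(logits.argmax(dim=-1))]

def is_equivalent(original_hyp: str, variant_hyp: str) -> bool:
    # Accept the surface-form variant only if the model itself predicts
    # symmetric equivalence entailment between the two hypotheses.
    return (nli_label(original_hyp, variant_hyp) == "ENTAILMENT"
            and nli_label(variant_hyp, original_hyp) == "ENTAILMENT")

def is_inconsistent(premise: str, hypothesis: str, variant_hyp: str) -> bool:
    # Semantic sensitivity: the model treats the variant as equivalent to the
    # original hypothesis, yet flips its decision against the same premise.
    return (is_equivalent(hypothesis, variant_hyp)
            and nli_label(premise, hypothesis) != nli_label(premise, variant_hyp))
```

Averaging `is_inconsistent` over a set of accepted variants yields an inconsistency rate in the spirit of the degradations reported above.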
Related papers
- Estimating the Causal Effects of Natural Logic Features in Transformer-Based NLI Models [16.328341121232484]
We apply causal effect estimation strategies to measure the effect of context interventions.
We investigate robustness to irrelevant changes and sensitivity to impactful changes of Transformers.
arXiv Detail & Related papers (2024-04-03T10:22:35Z)
- Estimating the Causal Effects of Natural Logic Features in Neural NLI Models [2.363388546004777]
We zone in on specific patterns of reasoning with enough structure and regularity to be able to identify and quantify systematic reasoning failures in widely-used models.
We apply causal effect estimation strategies to measure the effect of context interventions.
Following related work on causal analysis of NLP models in different settings, we adapt the methodology for the NLI task to construct comparative model profiles.
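As a rough illustration of this style of analysis (not the papers' own estimator), the effect of a context intervention can be approximated as the average change in the model's entailment probability, where `entail_prob` and `intervene` are hypothetical callables:

```python
import numpy as np

def average_intervention_effect(entail_prob, pairs, intervene):
    # entail_prob(premise, hypothesis) -> float and intervene(premise) -> str
    # are hypothetical callables; pairs is an iterable of (premise, hypothesis).
    effects = [entail_prob(intervene(p), h) - entail_prob(p, h) for p, h in pairs]
    return float(np.mean(effects))
```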
arXiv Detail & Related papers (2023-05-15T12:01:09Z)
- In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
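For reference, one common formulation of the label-smoothed cross-entropy loss (a generic sketch, not the specific strategies compared in the paper):

```python
import torch.nn.functional as F

def label_smoothed_ce(logits, targets, epsilon=0.1):
    # Mix the one-hot target with a uniform distribution over classes,
    # which discourages over-confident predictions.
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    uniform = -log_probs.mean(dim=-1)
    return ((1.0 - epsilon) * nll + epsilon * uniform).mean()
```

PyTorch's nn.CrossEntropyLoss also exposes a label_smoothing argument implementing the same idea.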
arXiv Detail & Related papers (2022-12-20T14:06:50Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
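A skeletal version of that consistency check, where `make_counterfactual` is a hypothetical helper standing in for the paper's counterfactual generation step:

```python
def counterfactual_consistency(predict, premise, hypothesis, explanation,
                               make_counterfactual):
    # predict(premise, hypothesis) -> label and
    # make_counterfactual(hypothesis, explanation) -> (new_hypothesis, implied_label)
    # are hypothetical callables.
    counterfactual, implied_label = make_counterfactual(hypothesis, explanation)
    return predict(premise, counterfactual) == implied_label
```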
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict-based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
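Both ingredients can be sketched briefly: type-level vectors obtained by averaging a word's contextualized vectors, and RSA computed as the correlation between the pairwise-similarity structures of two semantic spaces (a generic sketch, not the paper's exact protocol):

```python
import numpy as np
from scipy.stats import spearmanr

def type_vector(contextual_vectors):
    # Average a word's contextualized vectors across occurrences to obtain
    # a single static, type-level embedding.
    return np.stack(contextual_vectors).mean(axis=0)

def rsa(space_a, space_b):
    # Representational Similarity Analysis: Spearman-correlate the upper
    # triangles of the two spaces' cosine-similarity matrices, with rows
    # corresponding to the same word list in both spaces.
    def upper_sims(vectors):
        v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
        sims = v @ v.T
        return sims[np.triu_indices(len(v), k=1)]
    return spearmanr(upper_sims(space_a), upper_sims(space_b)).correlation
```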
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed a strong preference of these models for patterns observed only in the hypothesis.
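A minimal hypothesis-only baseline of the kind probed here can be built by simply discarding the premise (an illustrative sketch, not the models analysed in the paper):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def hypothesis_only_baseline(hypotheses, labels):
    # Train on hypotheses alone; accuracy above the majority-class baseline
    # reflects annotation artefacts rather than genuine entailment reasoning.
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression(max_iter=1000))
    clf.fit(hypotheses, labels)
    return clf
```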
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs [0.0]
In RNNs, encoding information in a suboptimal way can impact the quality of representations based on later elements in the sequence.
I propose an augmentation to standard RNNs in the form of a gradient-based correction mechanism.
I conduct different experiments in the context of language modeling, where the impact of using such a mechanism is examined in detail.
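One way to picture such a correction mechanism (a loose sketch under my own assumptions, not the paper's exact recoding rule): take the gradient of an error signal, here the step's predictive entropy, with respect to the hidden state and nudge the state before decoding continues.

```python
import torch
import torch.nn.functional as F

def recode_step(rnn_cell, output_layer, x_t, hidden, step_size=0.1):
    # rnn_cell (e.g. nn.GRUCell) and output_layer (hidden -> vocabulary logits)
    # are assumed modules. The new hidden state is nudged along the negative
    # gradient of the predictive entropy before the next time step.
    hidden = hidden.detach().requires_grad_(True)
    new_hidden = rnn_cell(x_t, hidden)
    probs = F.softmax(output_layer(new_hidden), dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1).mean()
    grad = torch.autograd.grad(entropy, new_hidden)[0]
    return (new_hidden - step_size * grad).detach()
```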
arXiv Detail & Related papers (2021-01-03T17:54:17Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
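The retrieval step at the heart of this approach can be sketched with a plain nearest-neighbour index over the fine-tuned model's representations (an illustration, not the paper's full pipeline):

```python
from sklearn.neighbors import NearestNeighbors

def build_knn_index(train_reps, k=5):
    # train_reps: (num_train, dim) array of representations from the fine-tuned model.
    return NearestNeighbors(n_neighbors=k, metric="cosine").fit(train_reps)

def nearest_training_examples(index, test_rep, train_labels):
    # Retrieve the training examples whose representations lie closest to the
    # test example; their labels (and texts) serve as the explanation.
    _, idx = index.kneighbors(test_rep.reshape(1, -1))
    return [(int(i), train_labels[int(i)]) for i in idx[0]]
```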
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
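Token-level predictive entropy of the kind analysed here can be computed directly from the decoder's per-step distributions (a generic sketch):

```python
import torch.nn.functional as F

def token_entropies(step_logits):
    # step_logits: (seq_len, vocab_size) decoder logits for one generated summary.
    # Returns the entropy of the predictive distribution at each decoding step;
    # high values flag steps where the model is uncertain what to generate next.
    log_probs = F.log_softmax(step_logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1)
```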
arXiv Detail & Related papers (2020-10-15T16:57:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.