Neural Natural Language Inference Models Partially Embed Theories of
Lexical Entailment and Negation
- URL: http://arxiv.org/abs/2004.14623v4
- Date: Sat, 21 Nov 2020 01:53:49 GMT
- Title: Neural Natural Language Inference Models Partially Embed Theories of
Lexical Entailment and Negation
- Authors: Atticus Geiger, Kyle Richardson, and Christopher Potts
- Abstract summary: We present Monotonicity NLI (MoNLI), a new naturalistic dataset focused on lexical entailment and negation.
In behavioral evaluations, we find that models trained on general-purpose NLI datasets fail systematically on MoNLI examples containing negation.
In structural evaluations, we look for evidence that our top-performing BERT-based model has learned to implement the monotonicity algorithm behind MoNLI.
- Score: 14.431925736607043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address whether neural models for Natural Language Inference (NLI) can
learn the compositional interactions between lexical entailment and negation,
using four methods: the behavioral evaluation methods of (1) challenge test
sets and (2) systematic generalization tasks, and the structural evaluation
methods of (3) probes and (4) interventions. To facilitate this holistic
evaluation, we present Monotonicity NLI (MoNLI), a new naturalistic dataset
focused on lexical entailment and negation. In our behavioral evaluations, we
find that models trained on general-purpose NLI datasets fail systematically on
MoNLI examples containing negation, but that MoNLI fine-tuning addresses this
failure. In our structural evaluations, we look for evidence that our
top-performing BERT-based model has learned to implement the monotonicity
algorithm behind MoNLI. Probes yield evidence consistent with this conclusion,
and our intervention experiments bolster this, showing that the causal dynamics
of the model mirror the causal dynamics of this algorithm on subsets of MoNLI.
This suggests that the BERT model at least partially embeds a theory of lexical
entailment and negation at an algorithmic level.
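Concretely, the monotonicity algorithm referred to above reduces to a small decision rule once a MoNLI-style example is abstracted to (i) the lexical relation between the substituted word pair and (ii) whether the substitution sits in the scope of a negation. The Python sketch below is a minimal illustration under that assumption, not the authors' released code; the function name, relation labels, and two-way entailment/neutral label set are illustrative.

```python
# Minimal sketch of the monotonicity algorithm discussed in the abstract.
# Assumption: premise and hypothesis differ in exactly one word, as in MoNLI,
# and the gold label depends only on the word pair's lexical relation and on
# whether the substitution occurs under negation (a downward-monotone context).

def monotonicity_label(lexical_relation: str, under_negation: bool) -> str:
    """Return an illustrative NLI label for a one-word-substitution pair.

    lexical_relation: relation of the hypothesis word to the premise word,
                      "hypernym" (more general) or "hyponym" (more specific).
    under_negation:   True if the substituted word is in the scope of negation.
    """
    # Positive contexts are upward monotone: generalizing preserves truth.
    entailing_relation = "hypernym"
    # Negation flips monotonicity: only the more specific word is now entailed.
    if under_negation:
        entailing_relation = "hyponym"
    return "entailment" if lexical_relation == entailing_relation else "neutral"


# "holding a dog" -> "holding an animal": entailment in a positive context.
print(monotonicity_label("hypernym", under_negation=False))  # entailment
# "not holding a dog" -> "not holding an animal": the inference no longer goes through.
print(monotonicity_label("hypernym", under_negation=True))   # neutral
# "not holding an animal" -> "not holding a dog": negation licenses specialization.
print(monotonicity_label("hyponym", under_negation=True))    # entailment
```

The structural evaluations then ask whether intermediate BERT representations encode the inputs to a rule like this one, and whether intervening on those representations changes the model's prediction the way the rule predicts.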
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- Language models are not naysayers: An analysis of language models on negation benchmarks [58.32362243122714]
We evaluate the ability of current-generation auto-regressive language models to handle negation.
We show that LLMs have several limitations including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.
arXiv Detail & Related papers (2023-06-14T01:16:37Z)
- Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation [59.307534363825816]
Negation is poorly captured by current language models, although the extent of this problem is not widely understood.
We introduce a natural language inference (NLI) test suite for probing how well NLP methods handle sub-clausal negation.
arXiv Detail & Related papers (2022-10-06T23:39:01Z)
- Neural Causal Models for Counterfactual Identification and Estimation [62.30444687707919]
We study the evaluation of counterfactual statements through neural models.
First, we show that neural causal models (NCMs) are expressive enough to support counterfactual reasoning.
Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions.
arXiv Detail & Related papers (2022-09-30T18:29:09Z)
- Decomposing Natural Logic Inferences in Neural NLI [9.606462437067984]
We investigate whether neural NLI models capture the crucial semantic features central to natural logic: monotonicity and concept inclusion.
We find that monotonicity information is notably weak in the representations of popular NLI models which achieve high scores on benchmarks.
arXiv Detail & Related papers (2021-12-15T17:35:30Z)
- Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning [57.4036085386653]
We show that prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics based on lexical overlap.
We then show that adding a regularization that preserves pretraining weights is effective in mitigating this destructive tendency of few-shot finetuning.
arXiv Detail & Related papers (2021-09-09T10:10:29Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Causal Abstractions of Neural Networks [9.291492712301569]
We propose a new structural analysis method grounded in a formal theory of causal abstraction.
We apply this method to analyze neural models trained on the Multiply Quantified Natural Language Inference (MQNLI) corpus (an interchange-intervention sketch in this spirit appears after this list).
arXiv Detail & Related papers (2021-06-06T01:07:43Z)
- Exploring Transitivity in Neural NLI Models through Veridicality [39.845425535943534]
We focus on the transitivity of inference relations, a fundamental property for systematically drawing inferences.
A model capturing transitivity can compose basic inference patterns and draw new inferences.
We find that current NLI models do not perform consistently well on transitivity inference tasks.
arXiv Detail & Related papers (2021-01-26T11:18:35Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis.
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? [41.649440404203595]
We introduce a method for evaluating whether neural models can learn systematicity of monotonicity inference in natural language.
We consider four aspects of monotonicity inferences and test whether the models can systematically interpret lexical and logical phenomena on different training/test splits.
arXiv Detail & Related papers (2020-04-30T14:48:39Z)
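Relatedly, the interchange interventions behind the main paper's structural evaluations (and the causal-abstraction analysis cited above) can be sketched in a few lines. The sketch below is a hedged illustration, not the paper's code: the tiny model, tensor shapes, and probe site are placeholders standing in for a BERT-based NLI classifier and a chosen hidden location.

```python
# Hedged sketch of an interchange intervention: compute the hidden state at a
# chosen site for a "source" input, splice it into the forward pass of a "base"
# input, and check whether the resulting prediction matches what the high-level
# monotonicity algorithm predicts under the analogous swap of its variable.
import torch
import torch.nn as nn

class TinyNLIClassifier(nn.Module):
    """Placeholder for a BERT-based NLI model, split around one probe site."""
    def __init__(self, dim=16, n_labels=2):
        super().__init__()
        self.lower = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # below the site
        self.upper = nn.Linear(dim, n_labels)                       # above the site

    def forward(self, x, swap_in=None):
        h = self.lower(x)            # representation at the probe site
        if swap_in is not None:      # the interchange intervention itself
            h = swap_in
        return self.upper(h), h

model = TinyNLIClassifier()
base = torch.randn(1, 16)    # stand-in encoding of a base MoNLI example
source = torch.randn(1, 16)  # stand-in encoding of a source MoNLI example

with torch.no_grad():
    _, h_source = model(source)                # cache the source-side hidden state
    logits, _ = model(base, swap_in=h_source)  # rerun the base input with it spliced in
intervened_pred = logits.argmax(dim=-1).item()
```

The causal-abstraction test is then whether, over many base/source pairs, `intervened_pred` agrees with the label the monotonicity algorithm assigns when its corresponding intermediate variable is swapped in the same way.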