Neural Natural Language Inference Models Partially Embed Theories of
Lexical Entailment and Negation
- URL: http://arxiv.org/abs/2004.14623v4
- Date: Sat, 21 Nov 2020 01:53:49 GMT
- Title: Neural Natural Language Inference Models Partially Embed Theories of
Lexical Entailment and Negation
- Authors: Atticus Geiger, Kyle Richardson, and Christopher Potts
- Abstract summary: We present Monotonicity NLI (MoNLI), a new naturalistic dataset focused on lexical entailment and negation.
In behavioral evaluations, we find that models trained on general-purpose NLI datasets fail systematically on MoNLI examples containing negation.
In structural evaluations, we look for evidence that our top-performing BERT-based model has learned to implement the monotonicity algorithm behind MoNLI.
- Score: 14.431925736607043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address whether neural models for Natural Language Inference (NLI) can
learn the compositional interactions between lexical entailment and negation,
using four methods: the behavioral evaluation methods of (1) challenge test
sets and (2) systematic generalization tasks, and the structural evaluation
methods of (3) probes and (4) interventions. To facilitate this holistic
evaluation, we present Monotonicity NLI (MoNLI), a new naturalistic dataset
focused on lexical entailment and negation. In our behavioral evaluations, we
find that models trained on general-purpose NLI datasets fail systematically on
MoNLI examples containing negation, but that MoNLI fine-tuning addresses this
failure. In our structural evaluations, we look for evidence that our
top-performing BERT-based model has learned to implement the monotonicity
algorithm behind MoNLI. Probes yield evidence consistent with this conclusion,
and our intervention experiments bolster this, showing that the causal dynamics
of the model mirror the causal dynamics of this algorithm on subsets of MoNLI.
This suggests that the BERT model at least partially embeds a theory of lexical
entailment and negation at an algorithmic level.
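Concretely, the monotonicity algorithm referred to above reduces to a small decision rule once a MoNLI-style example is abstracted to (i) the lexical relation between the substituted word pair and (ii) whether the substitution sits in the scope of a negation. The Python sketch below is a minimal illustration under that assumption, not the authors' released code; the function name, relation labels, and two-way entailment/neutral label set are illustrative.

```python
# Minimal sketch of the monotonicity algorithm discussed in the abstract.
# Assumption: premise and hypothesis differ in exactly one word, as in MoNLI,
# and the gold label depends only on the word pair's lexical relation and on
# whether the substitution occurs under negation (a downward-monotone context).

def monotonicity_label(lexical_relation: str, under_negation: bool) -> str:
    """Return an illustrative NLI label for a one-word-substitution pair.

    lexical_relation: relation of the hypothesis word to the premise word,
                      "hypernym" (more general) or "hyponym" (more specific).
    under_negation:   True if the substituted word is in the scope of negation.
    """
    # Positive contexts are upward monotone: generalizing preserves truth.
    entailing_relation = "hypernym"
    # Negation flips monotonicity: only the more specific word is now entailed.
    if under_negation:
        entailing_relation = "hyponym"
    return "entailment" if lexical_relation == entailing_relation else "neutral"


# "holding a dog" -> "holding an animal": entailment in a positive context.
print(monotonicity_label("hypernym", under_negation=False))  # entailment
# "not holding a dog" -> "not holding an animal": the inference no longer goes through.
print(monotonicity_label("hypernym", under_negation=True))   # neutral
# "not holding an animal" -> "not holding a dog": negation licenses specialization.
print(monotonicity_label("hyponym", under_negation=True))    # entailment
```

The structural evaluations then ask whether intermediate BERT representations encode the inputs to a rule like this one, and whether intervening on those representations changes the model's prediction the way the rule predicts.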
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- Language models are not naysayers: An analysis of language models on negation benchmarks [58.32362243122714]
We evaluate the ability of current-generation auto-regressive language models to handle negation.
We show that LLMs have several limitations including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.
arXiv Detail & Related papers (2023-06-14T01:16:37Z)
- Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation [59.307534363825816]
Negation is poorly captured by current language models, although the extent of this problem is not widely understood.
We introduce a natural language inference (NLI) test suite for probing how well NLP methods handle sub-clausal negation.
arXiv Detail & Related papers (2022-10-06T23:39:01Z)
- Neural Causal Models for Counterfactual Identification and Estimation [62.30444687707919]
We study the evaluation of counterfactual statements through neural models.
First, we show that neural causal models (NCMs) are expressive enough to support counterfactual reasoning.
Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions.
arXiv Detail & Related papers (2022-09-30T18:29:09Z)
- Decomposing Natural Logic Inferences in Neural NLI [9.606462437067984]
We investigate whether neural NLI models capture the crucial semantic features central to natural logic: monotonicity and concept inclusion.
We find that monotonicity information is notably weak in the representations of popular NLI models which achieve high scores on benchmarks.
arXiv Detail & Related papers (2021-12-15T17:35:30Z)
- Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning [57.4036085386653]
We show that prompt-based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics based on lexical overlap.
We then show that adding a regularization that preserves pretraining weights is effective in mitigating this destructive tendency of few-shot finetuning.
arXiv Detail & Related papers (2021-09-09T10:10:29Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Causal Abstractions of Neural Networks [9.291492712301569]
We propose a new structural analysis method grounded in a formal theory of causal abstraction.
We apply this method to analyze neural models trained on the Multiply Quantified Natural Language Inference (MQNLI) corpus (an interchange-intervention sketch in this spirit appears after this list).
arXiv Detail & Related papers (2021-06-06T01:07:43Z)
- Exploring Transitivity in Neural NLI Models through Veridicality [39.845425535943534]
We focus on the transitivity of inference relations, a fundamental property for systematically drawing inferences.
A model capturing transitivity can compose basic inference patterns and draw new inferences.
We find that current NLI models do not perform consistently well on transitivity inference tasks.
arXiv Detail & Related papers (2021-01-26T11:18:35Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis.
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? [41.649440404203595]
We introduce a method for evaluating whether neural models can learn systematicity of monotonicity inference in natural language.
We consider four aspects of monotonicity inferences and test whether the models can systematically interpret lexical and logical phenomena on different training/test splits.
arXiv Detail & Related papers (2020-04-30T14:48:39Z)
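Relatedly, the interchange interventions behind the main paper's structural evaluations (and the causal-abstraction analysis cited above) can be sketched in a few lines. The sketch below is a hedged illustration, not the paper's code: the tiny model, tensor shapes, and probe site are placeholders standing in for a BERT-based NLI classifier and a chosen hidden location.

```python
# Hedged sketch of an interchange intervention: compute the hidden state at a
# chosen site for a "source" input, splice it into the forward pass of a "base"
# input, and check whether the resulting prediction matches what the high-level
# monotonicity algorithm predicts under the analogous swap of its variable.
import torch
import torch.nn as nn

class TinyNLIClassifier(nn.Module):
    """Placeholder for a BERT-based NLI model, split around one probe site."""
    def __init__(self, dim=16, n_labels=2):
        super().__init__()
        self.lower = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # below the site
        self.upper = nn.Linear(dim, n_labels)                       # above the site

    def forward(self, x, swap_in=None):
        h = self.lower(x)            # representation at the probe site
        if swap_in is not None:      # the interchange intervention itself
            h = swap_in
        return self.upper(h), h

model = TinyNLIClassifier()
base = torch.randn(1, 16)    # stand-in encoding of a base MoNLI example
source = torch.randn(1, 16)  # stand-in encoding of a source MoNLI example

with torch.no_grad():
    _, h_source = model(source)                # cache the source-side hidden state
    logits, _ = model(base, swap_in=h_source)  # rerun the base input with it spliced in
intervened_pred = logits.argmax(dim=-1).item()
```

The causal-abstraction test is then whether, over many base/source pairs, `intervened_pred` agrees with the label the monotonicity algorithm assigns when its corresponding intermediate variable is swapped in the same way.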