Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal
Negation
- URL: http://arxiv.org/abs/2210.03256v1
- Date: Thu, 6 Oct 2022 23:39:01 GMT
- Title: Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal
Negation
- Authors: Hung Thinh Truong, Yulia Otmakhova, Timothy Baldwin, Trevor Cohn,
Karin Verspoor, Jey Han Lau
- Abstract summary: Negation is poorly captured by current language models, although the extent of this problem is not widely understood.
We introduce a natural language inference (NLI) test suite to enable probing the capabilities of NLP methods.
- Score: 59.307534363825816
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Negation is poorly captured by current language models, although the extent
of this problem is not widely understood. We introduce a natural language
inference (NLI) test suite to enable probing the capabilities of NLP methods,
with the aim of understanding sub-clausal negation. The test suite contains
premise--hypothesis pairs where the premise contains sub-clausal negation and
the hypothesis is constructed by making minimal modifications to the premise in
order to reflect different possible interpretations. Aside from adopting
standard NLI labels, our test suite is systematically constructed under a
rigorous linguistic framework. It includes annotation of negation types and
constructions grounded in linguistic theory, as well as the operations used to
construct hypotheses. This facilitates fine-grained analysis of model
performance. We conduct experiments using pre-trained language models to
demonstrate that our test suite is more challenging than existing benchmarks
focused on negation, and show how our annotation supports a deeper
understanding of the current NLI capabilities in terms of negation and
quantification.
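The fine-grained analysis the abstract describes, where model performance is broken down by annotated negation type or by the operation used to construct the hypothesis, can be sketched as a per-category accuracy computation. This is a minimal sketch, not the authors' code: the `NLIItem` fields, the category names, the three example pairs and the predictions are all invented for illustration and are not taken from the NaN-NLI data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NLIItem:
    """One premise-hypothesis pair plus annotations (hypothetical schema)."""
    premise: str
    hypothesis: str
    label: str          # standard NLI label: entailment / neutral / contradiction
    negation_type: str  # hypothetical annotation of the negation type
    operation: str      # hypothetical name of the edit that built the hypothesis


def accuracy_by(items, predictions, field):
    """Return accuracy per distinct value of `field` (e.g. 'negation_type')."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        key = getattr(item, field)
        total[key] += 1
        correct[key] += int(pred == item.label)
    return {key: correct[key] / total[key] for key in total}


# Invented toy items; labels and category names are illustrative only.
items = [
    NLIItem("Not many people came.", "Many people came.",
            "contradiction", "quantifier", "remove_negator"),
    NLIItem("He left not long ago.", "He left long ago.",
            "contradiction", "adverbial", "remove_negator"),
    NLIItem("Not surprisingly, she won.", "She won.",
            "entailment", "adverbial", "remove_negated_phrase"),
]
# Invented model outputs, one per item.
predictions = ["contradiction", "entailment", "entailment"]

by_type = accuracy_by(items, predictions, "negation_type")
by_operation = accuracy_by(items, predictions, "operation")
```

Grouping by a single annotation field at a time keeps the breakdown easy to read; the same function can be called once per annotation layer to see where a model's errors concentrate.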
Related papers
- Generating Diverse Negations from Affirmative Sentences [0.999726509256195]
Negations are important in real-world applications as they encode negative polarity in verb phrases, clauses, or other expressions.
We propose NegVerse, a method that tackles the lack of negation datasets by producing a diverse range of negation types.
We provide new rules for masking parts of sentences where negations are most likely to occur, based on syntactic structure.
We also propose a filtering mechanism to identify negation cues and remove degenerate examples, producing a diverse range of meaningful perturbations.
arXiv Detail & Related papers (2024-10-30T21:25:02Z)
- The Self-Contained Negation Test Set [1.8749305679160366]
We build on Gubelmann and Handschuh (2022), which studies the modification of PLMs' predictions as a function of the polarity of inputs, in English.
This test uses "self-contained" inputs ending with a masked position.
We propose an improved version, the Self-Contained Neg Test, which is more controlled, more systematic, and entirely based on examples forming minimal pairs.
arXiv Detail & Related papers (2024-08-21T09:38:15Z)
- Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs).
We conduct experiments using LLMs with different subword tokenization methods.
We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z)
- Language models are not naysayers: An analysis of language models on negation benchmarks [58.32362243122714]
We evaluate the ability of current-generation auto-regressive language models to handle negation.
We show that LLMs have several limitations including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.
arXiv Detail & Related papers (2023-06-14T01:16:37Z)
- Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
- Improving negation detection with negation-focused pre-training [58.32362243122714]
Negation is a common linguistic feature that is crucial in many language understanding tasks.
Recent work has shown that state-of-the-art NLP models underperform on samples containing negation.
We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking.
arXiv Detail & Related papers (2022-05-09T02:41:11Z)
- Understanding by Understanding Not: Modeling Negation in Language Models [81.21351681735973]
Negation is a core construction in natural language.
We propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences.
We reduce the mean top-1 error rate to 4% on the negated LAMA dataset.
arXiv Detail & Related papers (2021-05-07T21:58:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.