CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
- URL: http://arxiv.org/abs/2211.00295v1
- Date: Tue, 1 Nov 2022 06:10:26 GMT
- Title: CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
- Authors: Abhilasha Ravichander, Matt Gardner, Ana Marasović
- Abstract summary: We present the first English reading comprehension dataset which requires reasoning about the implications of negated statements in paragraphs.
CONDAQA features 14,182 question-answer pairs with over 200 unique negation cues.
The best performing model on CONDAQA (UnifiedQA-v2-3b) achieves only 42% on our consistency metric, well below human performance, which is 81%.
- Score: 21.56001677478673
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The full power of human language-based communication cannot be realized
without negation. All human languages have some form of negation. Despite this,
negation remains a challenging phenomenon for current natural language
understanding systems. To facilitate the future development of models that can
process negation effectively, we present CONDAQA, the first English reading
comprehension dataset which requires reasoning about the implications of
negated statements in paragraphs. We collect paragraphs with diverse negation
cues, then have crowdworkers ask questions about the implications of the
negated statement in the passage. We also have workers make three kinds of
edits to the passage -- paraphrasing the negated statement, changing the scope
of the negation, and reversing the negation -- resulting in clusters of
question-answer pairs that are difficult for models to answer with spurious
shortcuts. CONDAQA features 14,182 question-answer pairs with over 200 unique
negation cues and is challenging for current state-of-the-art models. The best
performing model on CONDAQA (UnifiedQA-v2-3b) achieves only 42% on our
consistency metric, well below human performance, which is 81%. We release our
dataset, along with fully-finetuned, few-shot, and zero-shot evaluations, to
facilitate the development of future NLP methods that work on negated language.
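The abstract reports a cluster-level consistency metric over the contrastive question-answer groups. A minimal Python sketch of one plausible implementation follows, assuming a cluster counts as correct only when every question in it is answered correctly; the field names (cluster_id, passage, question, answer) and the predict callable are illustrative, not the dataset's actual schema.

```python
from collections import defaultdict

def cluster_consistency(examples, predict):
    """Fraction of contrastive clusters in which *every* question is
    answered correctly. The dict keys used here are assumed, not the
    official CONDAQA schema."""
    correct = defaultdict(list)
    for ex in examples:
        prediction = predict(ex["passage"], ex["question"])
        correct[ex["cluster_id"]].append(prediction == ex["answer"])
    return sum(all(flags) for flags in correct.values()) / len(correct)
```

Under a metric like this, a model can score well on individual questions yet poorly on clusters, which is what makes the 42% vs. 81% gap informative about shortcut reliance.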
Related papers
- Generating Diverse Negations from Affirmative Sentences [0.999726509256195]
Negations are important in real-world applications as they encode negative polarity in verb phrases, clauses, or other expressions.
We propose NegVerse, a method that tackles the lack of negation datasets by producing a diverse range of negation types.
We provide new rules for masking parts of sentences where negations are most likely to occur, based on syntactic structure.
We also propose a filtering mechanism to identify negation cues and remove degenerate examples, producing a diverse range of meaningful perturbations.
arXiv Detail & Related papers (2024-10-30T21:25:02Z)
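NegVerse's summary above mentions syntax-based rules for masking the sentence positions where negations are most likely to occur. The paper's actual rule set is not reproduced here; the sketch below is a simplified stand-in that uses spaCy part-of-speech tags to mask verbs and auxiliaries, two common attachment sites for English negation.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline

def mask_negation_sites(sentence, mask_token="[MASK]"):
    """Produce one masked variant per plausible negation site
    (verbs and auxiliaries). A simplified heuristic, not NegVerse's
    actual syntactic rules."""
    doc = nlp(sentence)
    variants = []
    for tok in doc:
        if tok.pos_ in ("VERB", "AUX"):
            words = [t.text_with_ws for t in doc]
            words[tok.i] = mask_token + tok.whitespace_
            variants.append("".join(words))
    return variants

print(mask_negation_sites("The committee has approved the proposal."))
```

Each masked variant could then be handed to an infilling model to propose a negated rewrite, with degenerate outputs removed by a filtering step of the kind the paper describes.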
- Paraphrasing in Affirmative Terms Improves Negation Understanding [9.818585902859363]
Negation is a common linguistic phenomenon.
We show improvements on CondaQA, a large corpus requiring reasoning with negation, and on five natural language understanding tasks.
arXiv Detail & Related papers (2024-06-11T17:30:03Z)
- Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs).
We conduct experiments using LLMs with different subword tokenization methods.
We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z)
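The affixal-negation study above turns on how words like "unhappy" or "impossible" are split into subwords. A quick way to inspect this with the Hugging Face transformers library is sketched below; the word list is illustrative, and whether the negation affix surfaces as its own subword depends entirely on each model's learned vocabulary.

```python
from transformers import AutoTokenizer

# Compare how different subword vocabularies segment affixally negated words.
words = ["unhappy", "impossible", "disagree", "nonexistent", "careless"]
for name in ("bert-base-uncased", "gpt2"):
    tokenizer = AutoTokenizer.from_pretrained(name)
    for word in words:
        print(f"{name:20s} {word:12s} -> {tokenizer.tokenize(word)}")
```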
- This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models [4.017326849033009]
We try to clarify the reasons for the sub-optimal performance of large language models in understanding negation.
We introduce a large, semi-automatically generated dataset of approximately 400,000 descriptive sentences about commonsense knowledge.
We have used our dataset with the largest available open LLMs in a zero-shot approach to assess their generalization and inference capabilities.
arXiv Detail & Related papers (2023-10-24T15:38:21Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
- Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation [59.307534363825816]
Negation is poorly captured by current language models, although the extent of this problem is not widely understood.
We introduce a natural language inference (NLI) test suite to enable probing the capabilities of NLP methods.
arXiv Detail & Related papers (2022-10-06T23:39:01Z)
- Improving negation detection with negation-focused pre-training [58.32362243122714]
Negation is a common linguistic feature that is crucial in many language understanding tasks.
Recent work has shown that state-of-the-art NLP models underperform on samples containing negation.
We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking.
arXiv Detail & Related papers (2022-05-09T02:41:11Z)
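The pre-training entry above combines targeted data augmentation with negation masking. One plausible reading of the masking step, sketched below, is an MLM-style corruption that masks negation cues at a higher rate than ordinary tokens; the cue list and both rates are assumptions for illustration, not the paper's configuration.

```python
import random

# Illustrative cue list; the paper's cue inventory is not reproduced here.
NEGATION_CUES = {"not", "no", "never", "nothing", "nobody", "none", "without", "n't"}

def negation_masking(tokens, mask_token="[MASK]", cue_rate=0.5, base_rate=0.15):
    """Mask negation cues with probability cue_rate and all other tokens
    with the usual MLM probability base_rate (both rates assumed)."""
    return [
        mask_token
        if random.random() < (cue_rate if tok.lower() in NEGATION_CUES else base_rate)
        else tok
        for tok in tokens
    ]

print(negation_masking("The results were not statistically significant".split()))
```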
- Understanding by Understanding Not: Modeling Negation in Language Models [81.21351681735973]
Negation is a core construction in natural language.
We propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences.
We reduce the mean top-1 error rate to 4% on the negated LAMA dataset.
arXiv Detail & Related papers (2021-05-07T21:58:35Z)
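The entry above augments the language modeling objective with an unlikelihood term over negated generic sentences. In its general token-level form (Welleck et al., 2020), unlikelihood training minimizes -log(1 - p(t)) for tokens t that the model should not predict; a minimal PyTorch sketch, with the shapes and clamping chosen here for illustration:

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, negative_targets):
    """-log(1 - p(t)) averaged over tokens the model should NOT predict.
    logits: (batch, seq, vocab); negative_targets: (batch, seq) token ids."""
    log_probs = F.log_softmax(logits, dim=-1)
    p_neg = log_probs.gather(-1, negative_targets.unsqueeze(-1)).squeeze(-1).exp()
    # Clamp keeps log1p(-p) finite when the model is confidently wrong.
    return -torch.log1p(-p_neg.clamp(max=1.0 - 1e-6)).mean()

# Toy usage: random logits and one "forbidden" token id per position.
logits = torch.randn(2, 5, 100)
negative_targets = torch.randint(0, 100, (2, 5))
print(unlikelihood_loss(logits, negative_targets))
```

In the paper's setup this term is added alongside the standard likelihood objective, pushing the model toward correct completions and away from completions contradicted by negation.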
- On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment [59.995385574274785]
We show that, contrary to previous belief, negative interference also impacts low-resource languages.
We present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference.
arXiv Detail & Related papers (2020-10-06T20:48:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.