DocNLI: A Large-scale Dataset for Document-level Natural Language
Inference
- URL: http://arxiv.org/abs/2106.09449v1
- Date: Thu, 17 Jun 2021 13:02:26 GMT
- Title: DocNLI: A Large-scale Dataset for Document-level Natural Language
Inference
- Authors: Wenpeng Yin, Dragomir Radev, Caiming Xiong
- Abstract summary: Natural language inference (NLI) is formulated as a unified framework for solving various NLP problems.
This work presents DocNLI -- a newly-constructed large-scale dataset for document-level NLI.
- Score: 55.868482696821815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural language inference (NLI) is formulated as a unified framework for
solving various NLP problems such as relation extraction, question answering,
summarization, etc. It has been studied intensively in the past few years
thanks to the availability of large-scale labeled datasets. However, most
existing studies focus on merely sentence-level inference, which limits the
scope of NLI's application in downstream NLP problems. This work presents
DocNLI -- a newly-constructed large-scale dataset for document-level NLI.
DocNLI is transformed from a broad range of NLP problems and covers multiple
genres of text. The premises always stay in the document granularity, whereas
the hypotheses vary in length from single sentences to passages with hundreds
of words. Additionally, DocNLI contains relatively few of the annotation
artifacts that unfortunately remain widespread in popular sentence-level NLI datasets. Our experiments
demonstrate that, even without fine-tuning, a model pretrained on DocNLI shows
promising performance on popular sentence-level benchmarks, and generalizes
well to out-of-domain NLP tasks that rely on inference at document granularity.
Task-specific fine-tuning can bring further improvements. Data, code, and
pretrained models can be found at https://github.com/salesforce/DocNLI.
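As a concrete illustration of the zero-shot usage described above, the sketch below queries a sequence-pair classifier on a document-level premise and a hypothesis via the Hugging Face transformers library. It is a minimal sketch, not the authors' released code: the local checkpoint path ./docnli-roberta, the [entailment, not_entailment] label order, and the example texts are assumptions, and premises longer than the encoder's limit are simply truncated.
```python
# Minimal sketch: document-level NLI inference with a sequence-pair classifier.
# Assumptions (not from the paper): a RoBERTa-style checkpoint fine-tuned on
# DocNLI has been saved locally at ./docnli-roberta, and its label order is
# [entailment, not_entailment]. Check the actual checkpoint before relying on this.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_DIR = "./docnli-roberta"  # hypothetical local path to a DocNLI-finetuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
model.eval()

premise = open("report.txt").read()  # a full document serves as the premise
hypothesis = "The company expects revenue to grow next quarter."  # sentence- or passage-length

# Encode the (premise, hypothesis) pair; long premises are truncated to the encoder's max length.
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

labels = ["entailment", "not_entailment"]  # DocNLI's binary scheme; index order is an assumption
print({label: round(p.item(), 3) for label, p in zip(labels, probs)})
```
The checkpoints released at the GitHub link above may use a different file format or label mapping, so consult the repository's README before adapting this snippet.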
Related papers
- MSciNLI: A Diverse Benchmark for Scientific Natural Language Inference [65.37685198688538]
This paper presents MSciNLI, a dataset containing 132,320 sentence pairs extracted from five new scientific domains.
We establish strong baselines on MSciNLI by fine-tuning Pre-trained Language Models (PLMs) and prompting Large Language Models (LLMs).
We show that domain shift degrades the performance of scientific NLI models, which demonstrates the diverse characteristics of the different domains in our dataset.
arXiv Detail & Related papers (2024-04-11T18:12:12Z)
- LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from Short to Long Contexts and for Implication-Based Retrieval [72.4859717204905]
LawngNLI is constructed from U.S. legal opinions and carries automatic labels with high human-validated accuracy.
It serves as a benchmark for in-domain generalization from short to long contexts.
LawngNLI can be used to train and test systems for implication-based case retrieval and argumentation.
arXiv Detail & Related papers (2022-12-06T18:42:39Z)
- Learning to Infer from Unlabeled Data: A Semi-supervised Learning Approach for Robust Natural Language Inference [47.293189105900524]
Natural Language Inference (NLI) aims at predicting the relation between a pair of sentences (premise and hypothesis) as entailment, contradiction or semantic independence.
Although deep learning models have shown promising performance for NLI in recent years, they rely on large-scale, expensive human-annotated datasets.
Semi-supervised learning (SSL) is a popular technique for reducing the reliance on human annotation by leveraging unlabeled data for training.
arXiv Detail & Related papers (2022-11-05T20:34:08Z)
- Few-Shot Document-Level Event Argument Extraction [2.680014762694412]
Event argument extraction (EAE) has been well studied at the sentence level but under-explored at the document level.
We present FewDocAE, a Few-Shot Document-Level Event Argument Extraction benchmark.
arXiv Detail & Related papers (2022-09-06T03:57:23Z)
- Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization [63.21819285337555]
We show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples.
We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries.
We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization.
arXiv Detail & Related papers (2022-05-12T10:43:42Z)
- Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters [35.103851212995046]
Natural Language Inference (NLI) has been extensively studied by the NLP community as a framework for estimating the semantic relation between sentence pairs.
We explore the direct zero-shot applicability of NLI models to real applications, beyond the sentence-pair setting they were trained on.
We develop new aggregation methods that allow sentence-pair models to operate over full documents, reaching state-of-the-art performance on the ContractNLI dataset; a rough aggregation sketch follows this entry.
arXiv Detail & Related papers (2022-04-15T12:56:39Z)
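The entry above turns on aggregating sentence-level predictions into a document-level decision. The sketch below is a rough illustration under stated assumptions, not the paper's actual aggregation methods: it scores each premise sentence against the hypothesis with the publicly available roberta-large-mnli checkpoint and takes the maximum entailment probability; the example document, hypothesis, and naive period-based sentence splitting are hypothetical.
```python
# Illustrative sketch (not the paper's method): run a sentence-pair NLI model
# over each premise sentence and aggregate entailment probabilities with a max.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "roberta-large-mnli"  # a public sentence-pair NLI checkpoint, used only for illustration
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Look up the entailment index from the checkpoint's own label map instead of hard-coding it.
label_to_idx = {name.lower(): idx for idx, name in model.config.id2label.items()}
ENT = label_to_idx["entailment"]

def entailment_prob(premise_sentence: str, hypothesis: str) -> float:
    inputs = tokenizer(premise_sentence, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    return probs[ENT].item()

document = ("The agreement was signed in 2019. It prohibits sharing client data. "
            "Either party may terminate with 30 days notice.")  # hypothetical example
hypothesis = "The contract restricts disclosure of client information."

# Naive period-based splitting; a real pipeline would use a proper sentence segmenter.
sentences = [s.strip() for s in document.split(".") if s.strip()]

# Max aggregation: the document counts as entailing the hypothesis if any one sentence does.
doc_score = max(entailment_prob(s, hypothesis) for s in sentences)
print(f"document-level entailment score (max aggregation): {doc_score:.3f}")
```
Max aggregation is only one simple choice; mean or noisy-OR pooling over the same per-sentence scores are equally easy alternatives.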
- Looking Beyond Sentence-Level Natural Language Inference for Downstream Tasks [15.624486319943015]
In recent years, the Natural Language Inference (NLI) task has garnered significant attention.
We study this unfulfilled promise from the lens of two downstream tasks: question answering (QA), and text summarization.
We conjecture that a key difference between the NLI datasets and these downstream tasks concerns the length of the premise.
arXiv Detail & Related papers (2020-09-18T21:44:35Z)
- Coreferential Reasoning Learning for Language Representation [88.14248323659267]
We present CorefBERT, a novel language representation model that can capture the coreferential relations in context.
The experimental results show that, compared with existing baseline models, CorefBERT can achieve significant improvements consistently on various downstream NLP tasks.
arXiv Detail & Related papers (2020-04-15T03:57:45Z)