Stretching Sentence-pair NLI Models to Reason over Long Documents and
Clusters
- URL: http://arxiv.org/abs/2204.07447v1
- Date: Fri, 15 Apr 2022 12:56:39 GMT
- Title: Stretching Sentence-pair NLI Models to Reason over Long Documents and
Clusters
- Authors: Tal Schuster, Sihao Chen, Senaka Buthpitiya, Alex Fabrikant, Donald
Metzler
- Abstract summary: Natural Language Inference (NLI) has been extensively studied by the NLP community as a framework for estimating the semantic relation between sentence pairs.
We explore the direct zero-shot applicability of NLI models to real applications, beyond the sentence-pair setting they were trained on.
We develop new aggregation methods to allow operating over full documents, reaching state-of-the-art performance on the ContractNLI dataset.
- Score: 35.103851212995046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Inference (NLI) has been extensively studied by the NLP
community as a framework for estimating the semantic relation between sentence
pairs. While early work identified certain biases in NLI models, recent
advancements in modeling and datasets demonstrated promising performance. In
this work, we further explore the direct zero-shot applicability of NLI models
to real applications, beyond the sentence-pair setting they were trained on.
First, we analyze the robustness of these models to longer and out-of-domain
inputs. Then, we develop new aggregation methods to allow operating over full
documents, reaching state-of-the-art performance on the ContractNLI dataset.
Interestingly, we find NLI scores to provide strong retrieval signals, leading
to more relevant evidence extractions compared to common similarity-based
methods. Finally, we go further and investigate whole document clusters to
identify both discrepancies and consensus among sources. In a test case, we
find real inconsistencies between Wikipedia pages in different languages about
the same topic.
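The abstract's core idea — running a sentence-pair NLI model over each sentence of a long document and aggregating the scores, with the top-scoring sentence doubling as retrieved evidence — can be sketched as below. This is a minimal illustration of max-aggregation only, not the paper's actual methods; the `toy_scorer` word-overlap function is a purely hypothetical stand-in for a real NLI entailment classifier.

```python
from typing import Callable, List, Tuple

def aggregate_nli_over_document(
    premise_sentences: List[str],
    hypothesis: str,
    nli_entail_prob: Callable[[str, str], float],
) -> Tuple[float, str]:
    """Score a hypothesis against a full document by scoring each
    (sentence, hypothesis) pair and max-aggregating. Returns the best
    score and the best-scoring sentence as retrieved evidence."""
    scores = [nli_entail_prob(s, hypothesis) for s in premise_sentences]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], premise_sentences[best]

# Toy stand-in scorer: word-overlap ratio instead of a real NLI model.
def toy_scorer(premise: str, hypothesis: str) -> float:
    tokenize = lambda text: set(text.lower().replace(".", "").split())
    p, h = tokenize(premise), tokenize(hypothesis)
    return len(p & h) / max(len(h), 1)

document = [
    "The contract lasts two years.",
    "Either party may terminate with 30 days notice.",
    "Payments are due monthly.",
]
score, evidence = aggregate_nli_over_document(
    document, "The agreement can be terminated with notice", toy_scorer
)
```

In practice the scorer would be the entailment probability from a trained sentence-pair NLI model, and the max over sentences yields both a document-level verdict and an evidence pointer, mirroring the retrieval behavior the abstract describes.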
Related papers
- Fast and Accurate Factual Inconsistency Detection Over Long Documents [19.86348214462828]
We introduce SCALE, a task-agnostic model for detecting factual inconsistencies using a novel chunking strategy.
This approach achieves state-of-the-art performance in factual inconsistency detection for diverse tasks and long inputs.
We have publicly released our code and data on GitHub.
arXiv Detail & Related papers (2023-10-19T22:55:39Z) - With a Little Push, NLI Models can Robustly and Efficiently Predict
Faithfulness [19.79160738554967]
Conditional language models still generate unfaithful output that is not supported by their input.
We show that pure NLI models can outperform more complex metrics when combining task-adaptive data augmentation with robust inference procedures.
arXiv Detail & Related papers (2023-05-26T11:00:04Z) - LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from
Short to Long Contexts and for Implication-Based Retrieval [72.4859717204905]
LawngNLI is constructed from U.S. legal opinions with automatic labels of high human-validated accuracy.
It can benchmark in-domain generalization from short to long contexts.
LawngNLI can train and test systems for implication-based case retrieval and argumentation.
arXiv Detail & Related papers (2022-12-06T18:42:39Z) - Falsesum: Generating Document-level NLI Examples for Recognizing Factual
Inconsistency in Summarization [63.21819285337555]
We show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples.
We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries.
We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization.
arXiv Detail & Related papers (2022-05-12T10:43:42Z) - SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in
Summarization [27.515873862013006]
A key requirement for summaries is to be factually consistent with the input document.
Previous work has found that natural language inference models do not perform competitively when applied to inconsistency detection.
We provide a highly effective and lightweight method called SummaCConv that enables NLI models to be successfully used for this task.
arXiv Detail & Related papers (2021-11-18T05:02:31Z) - Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x speed-up in inference while retaining comparable performance.
arXiv Detail & Related papers (2021-09-09T12:32:28Z) - DocNLI: A Large-scale Dataset for Document-level Natural Language
Inference [55.868482696821815]
Natural language inference (NLI) is formulated as a unified framework for solving various NLP problems.
This work presents DocNLI -- a newly-constructed large-scale dataset for document-level NLI.
arXiv Detail & Related papers (2021-06-17T13:02:26Z) - Reliable Evaluations for Natural Language Inference based on a Unified
Cross-dataset Benchmark [54.782397511033345]
Crowd-sourced Natural Language Inference (NLI) datasets may suffer from significant biases like annotation artifacts.
We present a new unified cross-dataset benchmark with 14 NLI datasets and re-evaluate 9 widely-used neural network-based NLI models.
Our proposed evaluation scheme and experimental baselines could provide a basis to inspire future reliable NLI research.
arXiv Detail & Related papers (2020-10-15T11:50:12Z) - Reading Comprehension as Natural Language Inference: A Semantic Analysis [15.624486319943015]
We explore the utility of Natural Language Inference (NLI) for Question Answering (QA).
We transform one of the largest available MRC datasets (RACE) to an NLI form and compare the performance of a state-of-the-art model (RoBERTa) on both forms.
We highlight clear categories for which the model performs better when the data is presented in a coherent entailment form rather than as a structured question-answer concatenation.
arXiv Detail & Related papers (2020-10-04T22:50:59Z) - Coreferential Reasoning Learning for Language Representation [88.14248323659267]
We present CorefBERT, a novel language representation model that can capture the coreferential relations in context.
The experimental results show that, compared with existing baseline models, CorefBERT can achieve significant improvements consistently on various downstream NLP tasks.
arXiv Detail & Related papers (2020-04-15T03:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.