Natural Language Inference in Context -- Investigating Contextual
Reasoning over Long Texts
- URL: http://arxiv.org/abs/2011.04864v1
- Date: Tue, 10 Nov 2020 02:31:31 GMT
- Title: Natural Language Inference in Context -- Investigating Contextual
Reasoning over Long Texts
- Authors: Hanmeng Liu, Leyang Cui, Jian Liu, Yue Zhang
- Abstract summary: ConTRoL is a new dataset for ConTextual Reasoning over Long texts.
It consists of 8,325 expert-designed "context-hypothesis" pairs with gold labels.
It is derived from verbal reasoning tests used in competitive police selection and recruitment, with expert-level quality.
- Score: 19.894104911338353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language inference (NLI) is a fundamental NLP task that
investigates the entailment relationship between two texts. Popular NLI
datasets present the task at the sentence level. While adequate for testing
semantic representations, they fall short of testing contextual reasoning
over long texts, which is a natural part of the human inference process. We
introduce ConTRoL, a new dataset for ConTextual Reasoning over Long texts.
Consisting of 8,325 expert-designed "context-hypothesis" pairs with gold
labels, ConTRoL is a passage-level NLI dataset with a focus on complex
contextual reasoning types such as logical reasoning. It is derived from
verbal reasoning tests used in competitive police selection and recruitment,
and is of expert-level quality. Compared with previous NLI benchmarks, the
materials in ConTRoL are much more challenging, involving a range of
reasoning types. Empirical results show that state-of-the-art language models
perform far worse than educated humans. Our dataset can also serve as a test
set for downstream tasks such as checking the factual correctness of
summaries.
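As a minimal illustration of how such passage-level "context-hypothesis" pairs might be scored, the hedged sketch below runs an off-the-shelf MNLI-style classifier over a hypothetical record; the field names and example text are assumptions, not ConTRoL's actual schema.

```python
# Hedged sketch: scoring a passage-level "context-hypothesis" pair with an
# off-the-shelf MNLI classifier. The record layout below is illustrative,
# not ConTRoL's actual schema.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"  # a public NLI model, used here as a stand-in
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

example = {  # hypothetical record
    "context": "All applicants who passed the written exam were interviewed. "
               "Dana passed the written exam.",
    "hypothesis": "Dana was interviewed.",
    "label": "entailment",
}

inputs = tokenizer(example["context"], example["hypothesis"],
                   truncation=True, return_tensors="pt")
pred_id = model(**inputs).logits.argmax(-1).item()
print(model.config.id2label[pred_id])  # e.g. ENTAILMENT
```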
Related papers
- Surveying the Landscape of Text Summarization with Deep Learning: A
Comprehensive Review [2.4185510826808487]
Deep learning has revolutionized natural language processing (NLP) by enabling the development of models that can learn complex representations of language data.
Deep learning models for NLP typically use large amounts of data to train deep neural networks, allowing them to learn the patterns and relationships in language data.
Applied to text summarization, deep neural networks learn to produce summaries directly from large collections of document-summary pairs.
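As a hedged sketch of this idea, the snippet below summarizes a short passage with a generic pretrained model via the Hugging Face pipeline API; the checkpoint and length limits are illustrative choices, not ones taken from the survey.

```python
# Hedged sketch: abstractive summarization with a generic pretrained model.
# The checkpoint and length limits are illustrative assumptions.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
article = ("Deep learning has enabled models that learn complex "
           "representations of language data from large corpora, and these "
           "models can be trained end-to-end to produce summaries.")
print(summarizer(article, max_length=40, min_length=10)[0]["summary_text"])
```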
arXiv Detail & Related papers (2023-10-13T21:24:37Z)
- An Investigation of LLMs' Inefficacy in Understanding Converse Relations [30.94718664430869]
We introduce ConvRe, a new benchmark focusing on converse relations, which contains 17 relations and 1,240 triples extracted from knowledge graph completion datasets.
ConvRe features two tasks, Re2Text and Text2Re, formulated as multiple-choice question answering to evaluate LLMs' ability to match relations with their associated text.
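The sketch below illustrates, under assumptions, how a converse-relation triple might be turned into the two multiple-choice formats; the prompt templates and the example triple are hypothetical, not taken from ConvRe.

```python
# Hedged sketch: building two multiple-choice formats for a converse
# relation. The templates and the triple are hypothetical illustrations.
triple = ("Alice", "parent_of", "Bob")          # (head, relation, tail)
converse = {"parent_of": "child_of"}            # converse-relation lookup

def re2text(head, rel, tail):
    """Relation given, choose the matching textual statement."""
    question = f"Which sentence expresses ({head}, {rel}, {tail})?"
    choices = [f"{head} is the parent of {tail}.",   # correct reading
               f"{head} is the child of {tail}."]    # converse distractor
    return question, choices, 0  # index 0 is the gold choice

def text2re(head, rel, tail):
    """Text given, choose the matching relation."""
    question = f"'{head} is the parent of {tail}.' expresses which relation?"
    choices = [rel, converse[rel]]
    return question, choices, 0

print(re2text(*triple))
print(text2re(*triple))
```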
arXiv Detail & Related papers (2023-10-08T13:45:05Z)
- MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation [102.20036684996248]
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
We conduct experiments on two data-to-text generation tasks, WebNLG and LogicNLG.
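To make the "modular multi-step" idea concrete, here is a hedged sketch of a pipeline of small single-purpose modules over a table row; the module names and logic are invented for illustration and do not reproduce MURMUR's actual modules.

```python
# Hedged sketch: a modular multi-step pipeline from semi-structured data to
# text. Module names and logic are invented for illustration.
row = {"player": "Jordan", "points": 32, "team": "Bulls"}

def select(record, keys):            # step 1: pick relevant fields
    return {k: record[k] for k in keys}

def compose(fields):                 # step 2: build an intermediate plan
    return [("subject", fields["player"]),
            ("predicate", "scored"),
            ("object", f"{fields['points']} points for the {fields['team']}")]

def realize(plan):                   # step 3: surface realization
    return " ".join(value for _, value in plan) + "."

print(realize(compose(select(row, ["player", "points", "team"]))))
# -> "Jordan scored 32 points for the Bulls."
```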
arXiv Detail & Related papers (2022-12-16T17:36:23Z)
- LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from Short to Long Contexts and for Implication-Based Retrieval [72.4859717204905]
LawngNLI is constructed from U.S. legal opinions, with automatic labels of high human-validated accuracy.
It benchmarks in-domain generalization from short to long contexts.
LawngNLI can train and test systems for implication-based case retrieval and argumentation.
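A minimal sketch of implication-based retrieval, under assumptions: rank candidate cases by how strongly each one entails a query proposition, using any pairwise entailment scorer. The scorer below is a trivial stub standing in for a real NLI model; LawngNLI's actual setup may differ.

```python
# Hedged sketch: implication-based retrieval. Candidates are ranked by an
# entailment score between each case text and the query proposition. The
# scorer is a word-overlap stub, not a real NLI model.
def entail_score(premise: str, hypothesis: str) -> float:
    """Stub scorer: word-overlap proxy for P(premise entails hypothesis)."""
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

def retrieve(query: str, cases: list[str], k: int = 2) -> list[str]:
    ranked = sorted(cases, key=lambda c: entail_score(c, query), reverse=True)
    return ranked[:k]

cases = ["The court held the contract void for lack of consideration.",
         "The appeal was dismissed on procedural grounds.",
         "A contract without consideration is unenforceable."]
print(retrieve("the contract is void without consideration", cases))
```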
arXiv Detail & Related papers (2022-12-06T18:42:39Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
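As a hedged illustration of what a two-tier taxonomy could look like in code, the enums below separate linguistic from non-linguistic elements; the specific members are assumptions, not the paper's actual categories.

```python
# Hedged sketch: a two-tier taxonomy of elements found in textual sources.
# The member names are illustrative assumptions, not the paper's categories.
from enum import Enum

class Linguistic(Enum):
    WORDS = "words"
    SENTENCES = "sentences"
    DISCOURSE_MARKERS = "discourse markers"

class NonLinguistic(Enum):
    EMOJI = "emoji"
    MARKUP = "markup tags"
    LAYOUT = "visual layout"

def tier(element) -> str:
    return "linguistic" if isinstance(element, Linguistic) else "non-linguistic"

print(tier(Linguistic.WORDS), tier(NonLinguistic.EMOJI))
```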
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- Full-Text Argumentation Mining on Scientific Publications [3.8754200816873787]
We introduce a sequential pipeline model combining argumentative discourse unit recognition (ADUR) and argumentative relation extraction (ARE) for full-text scientific argumentation mining (SAM).
We provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks.
Our detailed error analysis reveals that non-contiguous ADUs as well as the interpretation of discourse connectors pose major challenges.
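A hedged sketch of such a sequential pipeline: one stage marks argumentative discourse units, a second stage predicts relations between them. Both stages below are rule-based stubs standing in for the PLM-based components the paper evaluates.

```python
# Hedged sketch: a two-stage argumentation-mining pipeline. Both stages are
# rule-based stubs standing in for PLM-based components.
def adur(sentences):
    """Stage 1 (ADUR): mark sentences that look like argumentative units."""
    cues = ("because", "therefore", "we argue")
    return [(i, s) for i, s in enumerate(sentences)
            if any(cue in s.lower() for cue in cues)]

def are(units):
    """Stage 2 (ARE): link each unit to its predecessor as 'support'."""
    return [(units[i][0], units[i - 1][0], "support")
            for i in range(1, len(units))]

doc = ["We argue the method generalizes.",
       "Results improve because training data is diverse.",
       "The dataset is public."]
units = adur(doc)
print(units, are(units))
```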
arXiv Detail & Related papers (2022-10-24T10:05:30Z)
- SCROLLS: Standardized CompaRison Over Long Language Sequences [62.574959194373264]
We introduce SCROLLS, a suite of tasks that require reasoning over long texts.
SCROLLS contains summarization, question answering, and natural language inference tasks.
We make all datasets available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.
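The sketch below shows, under assumptions, what casting heterogeneous tasks into a unified text-to-text format can look like; the exact templates are illustrative, not SCROLLS' actual formatting.

```python
# Hedged sketch: casting different long-text tasks into one text-to-text
# format. Templates are illustrative, not SCROLLS' actual formatting.
def to_text2text(task, example):
    if task == "qa":
        src = f"question: {example['question']} context: {example['context']}"
        tgt = example["answer"]
    elif task == "nli":
        src = f"premise: {example['premise']} hypothesis: {example['hypothesis']}"
        tgt = example["label"]
    else:  # summarization
        src = f"summarize: {example['document']}"
        tgt = example["summary"]
    return src, tgt

print(to_text2text("nli", {"premise": "The meeting ran long.",
                           "hypothesis": "The meeting was short.",
                           "label": "contradiction"}))
```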
arXiv Detail & Related papers (2022-01-10T18:47:15Z)
- Toward the Understanding of Deep Text Matching Models for Information Retrieval [72.72380690535766]
This paper tests whether existing deep text matching methods satisfy fundamental constraints from information retrieval.
Specifically, four constraints are examined, i.e., the term frequency constraint, the term discrimination constraint, the length normalization constraints, and the TF-length constraint.
Experimental results on LETOR 4.0 and MS MARCO show that all the investigated deep text matching methods satisfy these constraints with high probability.
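As a hedged illustration of what checking such a constraint can look like, the snippet below tests the term frequency constraint on a toy scorer: adding more occurrences of a query term should not decrease the matching score. The scorer is a simple stand-in, not one of the paper's models.

```python
# Hedged sketch: empirically checking the term frequency (TF) constraint on
# a toy scoring function: more query-term occurrences should not lower the
# score. The scorer is a stand-in, not one of the paper's models.
import math

def score(query: str, doc: str) -> float:
    """Toy TF-style matcher: log-scaled count of query terms in the doc."""
    words = doc.lower().split()
    return sum(math.log1p(words.count(q)) for q in query.lower().split())

query = "neural retrieval"
doc_low = "retrieval with a neural model"
doc_high = "neural retrieval with a neural retrieval model"

assert score(query, doc_high) >= score(query, doc_low)  # TF constraint holds
print(score(query, doc_low), score(query, doc_high))
```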
arXiv Detail & Related papers (2021-08-16T13:33:15Z)
- DocNLI: A Large-scale Dataset for Document-level Natural Language Inference [55.868482696821815]
Natural language inference (NLI) is formulated as a unified framework for solving various NLP problems.
This work presents DocNLI -- a newly-constructed large-scale dataset for document-level NLI.
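A hedged sketch of the unified-framework idea: many problems can be recast as document-level premise-hypothesis pairs. The task conversions below are illustrative assumptions, not DocNLI's exact construction.

```python
# Hedged sketch: recasting downstream problems as document-level NLI pairs.
# The conversions are illustrative, not DocNLI's exact construction.
def summary_check_to_nli(document, summary):
    # factual-consistency checking: the document should entail its summary
    return {"premise": document, "hypothesis": summary}

def qa_to_nli(context, question, candidate_answer):
    # QA: the context should entail the question rewritten as a statement
    statement = f"The answer to '{question}' is {candidate_answer}."
    return {"premise": context, "hypothesis": statement}

print(summary_check_to_nli("The city council approved the budget on Monday "
                           "after a lengthy debate.",
                           "The budget was approved."))
```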
arXiv Detail & Related papers (2021-06-17T13:02:26Z)
- Looking Beyond Sentence-Level Natural Language Inference for Downstream Tasks [15.624486319943015]
In recent years, the Natural Language Inference (NLI) task has garnered significant attention.
We study this unfulfilled promise through the lens of two downstream tasks: question answering (QA) and text summarization.
We conjecture that a key difference between the NLI datasets and these downstream tasks concerns the length of the premise.
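A minimal sketch of how one might probe this conjecture, under assumptions: compare the lengths of NLI premises against the lengths of downstream inputs. The example texts are invented stand-ins for real dataset instances.

```python
# Hedged sketch: comparing premise lengths across settings, to probe the
# conjecture that downstream inputs are much longer than NLI premises.
# The example texts are invented stand-ins for real dataset instances.
nli_premises = ["A man is playing a guitar on stage.",
                "Two dogs run across a field."]
qa_contexts = ["The committee met for three hours and reviewed twelve "
               "proposals before voting to fund the two largest projects, "
               "citing their broader impact and lower overall risk."]

def avg_len(texts):
    return sum(len(t.split()) for t in texts) / len(texts)

print(f"NLI premises: {avg_len(nli_premises):.1f} words on average")
print(f"QA contexts:  {avg_len(qa_contexts):.1f} words on average")
```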
arXiv Detail & Related papers (2020-09-18T21:44:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.