HeadlineCause: A Dataset of News Headlines for Detecting Causalities
- URL: http://arxiv.org/abs/2108.12626v1
- Date: Sat, 28 Aug 2021 11:12:49 GMT
- Title: HeadlineCause: A Dataset of News Headlines for Detecting Causalities
- Authors: Ilya Gusev and Alexey Tikhonov
- Abstract summary: HeadlineCause is a dataset for detecting implicit causal relations between pairs of news headlines.
The dataset includes over 5000 headline pairs from English news and over 9000 headline pairs from Russian news labeled through crowdsourcing.
- Score: 0.20305676256390934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting implicit causal relations in texts is a task that requires both
common sense and world knowledge. Existing datasets are focused either on
commonsense causal reasoning or explicit causal relations. In this work, we
present HeadlineCause, a dataset for detecting implicit causal relations
between pairs of news headlines. The dataset includes over 5000 headline pairs
from English news and over 9000 headline pairs from Russian news labeled
through crowdsourcing. The pairs vary from totally unrelated or belonging to
the same general topic to the ones including causation and refutation
relations. We also present a set of models and experiments that demonstrate
the dataset's validity, including a multilingual XLM-RoBERTa based model for
causality detection and a GPT-2 based model for possible effects prediction.
Related papers
- MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction [78.61546292830081]
We construct a large-scale human-annotated ERE dataset MAVEN-ERE with improved annotation schemes.
It contains 103,193 event coreference chains, 1,216,217 temporal relations, 57,992 causal relations, and 15,841 subevent relations.
Experiments show that ERE on MAVEN-ERE is quite challenging, and that considering relation interactions with joint learning can improve performance.
arXiv Detail & Related papers (2022-11-14T13:34:49Z)
- Uncovering Main Causalities for Long-tailed Information Extraction [14.39860866665021]
Long-tailed distributions caused by the selection bias of a dataset may lead to incorrect correlations.
This motivates us to propose counterfactual IE (CFIE), a novel framework that aims to uncover the main causalities behind data.
arXiv Detail & Related papers (2021-09-11T08:08:24Z)
- Link Prediction on N-ary Relational Data Based on Relatedness Evaluation [61.61555159755858]
We propose a method called NaLP to conduct link prediction on n-ary relational data.
We represent each n-ary relational fact as a set of its role and role-value pairs.
Experimental results validate the effectiveness and merits of the proposed methods.
arXiv Detail & Related papers (2021-04-21T09:06:54Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
- Predicting Directionality in Causal Relations in Text [9.313899406300644]
SpanBERT performs better than BERT on causal samples with longer span length.
CREST is a framework for unifying a collection of scattered datasets of causal relations.
arXiv Detail & Related papers (2021-03-25T04:49:01Z)
- Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation [14.92157586545743]
This paper presents a number of techniques for making models more robust in the domain of causal reasoning.
We show a statistically significant improvement in performance on both datasets, even with only a small number of additionally generated data points.
arXiv Detail & Related papers (2021-01-13T09:55:29Z)
- Causal BERT : Language models for causality detection between events expressed in text [1.0756038762528868]
Causality understanding between events is helpful in many areas, including health care, business risk management and finance.
"Cause-Effect" relationships between natural language events remain a challenge because they are often expressed implicitly.
Our proposed methods achieve state-of-the-art performance in three different data distributions and can be leveraged for extraction of a causal diagram.
arXiv Detail & Related papers (2020-12-10T04:59:12Z)
- Domain Adaptative Causality Encoder [52.779274858332656]
We leverage the characteristics of dependency trees and adversarial learning to address the tasks of adaptive causality identification and localisation.
We present a new causality dataset, namely MedCaus, which integrates all types of causality in the text.
arXiv Detail & Related papers (2020-11-27T04:14:55Z)
- Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training [49.9995628166064]
We propose CTEG, a model equipped with two mechanisms to learn to decouple easily-confused relations.
On the one hand, an EGA mechanism is introduced to guide the attention to filter out information causing confusion.
On the other hand, a Confusion-Aware Training (CAT) method is proposed to explicitly learn to distinguish relations.
arXiv Detail & Related papers (2020-10-21T11:07:53Z)
- Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data [63.15776078733762]
We propose Amortized Causal Discovery, a novel framework to learn to infer causal relations from time-series data.
We demonstrate experimentally that this approach, implemented as a variational model, leads to significant improvements in causal discovery performance.
arXiv Detail & Related papers (2020-06-18T19:59:12Z)