UniCausal: Unified Benchmark and Repository for Causal Text Mining
- URL: http://arxiv.org/abs/2208.09163v2
- Date: Fri, 14 Apr 2023 09:02:50 GMT
- Title: UniCausal: Unified Benchmark and Repository for Causal Text Mining
- Authors: Fiona Anting Tan, Xinyu Zuo and See-Kiong Ng
- Abstract summary: We propose UniCausal, a unified benchmark for causal text mining across three tasks.
We consolidate and align annotations of six high quality, mainly human-annotated, corpora.
To create an initial benchmark, we fine-tuned BERT pre-trained language models to each task, achieving 70.10% Binary F1, 52.42% Macro F1, and 84.68% Binary F1 scores respectively.
- Score: 7.402967063220846
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current causal text mining datasets vary in objectives, data coverage, and
annotation schemes. These inconsistencies limit modeling capabilities and
prevent fair comparisons of model performance. Furthermore, few datasets include
cause-effect span annotations, which are needed for end-to-end causal relation
extraction. To address these issues, we propose UniCausal, a unified benchmark
for causal text mining across three tasks: (I) Causal Sequence Classification,
(II) Cause-Effect Span Detection and (III) Causal Pair Classification. We
consolidated and aligned annotations of six high quality, mainly
human-annotated, corpora, resulting in a total of 58,720, 12,144 and 69,165
examples for each task respectively. Since the definition of causality can be
subjective, our framework was designed to allow researchers to work on some or
all datasets and tasks. To create an initial benchmark, we fine-tuned BERT
pre-trained language models to each task, achieving 70.10% Binary F1, 52.42%
Macro F1, and 84.68% Binary F1 scores respectively.
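As an illustration of the benchmark's first task, the snippet below is a minimal sketch of fine-tuning a BERT sequence classifier for Causal Sequence Classification, in the spirit of the baseline described above. The data-loading step is hypothetical (a toy in-memory dataset stands in for the UniCausal corpora), and the model checkpoint and hyperparameters are assumptions, not the authors' exact configuration.

```python
# Illustrative sketch: fine-tuning BERT for Task (I) Causal Sequence Classification.
# The toy examples below are placeholders; UniCausal's actual data layout may differ.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

# Toy examples: label 1 = sentence expresses a causal relation, 0 = it does not.
train_data = {
    "text": [
        "The flood destroyed the bridge, so traffic was rerouted.",
        "She enjoys hiking on weekends.",
    ],
    "label": [1, 0],
}

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=2
)

def tokenize(batch):
    # Tokenize sentences to fixed-length inputs for BERT.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_ds = Dataset.from_dict(train_data).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="unicausal-seqclf",   # hypothetical output directory
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```

The other two tasks would follow the same pattern with a token-classification head (Cause-Effect Span Detection) or a pair-encoded input (Causal Pair Classification).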
Related papers
- Causal Micro-Narratives [62.47217054314046]
We present a novel approach to classify causal micro-narratives from text.
These narratives are sentence-level explanations of the cause(s) and/or effect(s) of a target subject.
arXiv Detail & Related papers (2024-10-07T17:55:10Z) - A Unified Causal View of Instruction Tuning [76.1000380429553]
We develop a meta Structural Causal Model (meta-SCM) to integrate different NLP tasks under a single causal structure of the data.
The key idea is to learn the task-required causal factors and use only those to make predictions for a given task.
arXiv Detail & Related papers (2024-02-09T07:12:56Z) - Towards Causal Foundation Model: on Duality between Causal Inference and Attention [18.046388712804042]
We take a first step towards building causally-aware foundation models for treatment effect estimations.
We propose a novel, theoretically justified method called Causal Inference with Attention (CInA).
arXiv Detail & Related papers (2023-10-01T22:28:34Z) - Inducing Causal Structure for Abstractive Text Summarization [76.1000380429553]
We introduce a Structural Causal Model (SCM) to induce the underlying causal structure of the summarization data.
We propose a Causality Inspired Sequence-to-Sequence model (CI-Seq2Seq) to learn the causal representations that can mimic the causal factors.
Experimental results on two widely used text summarization datasets demonstrate the advantages of our approach.
arXiv Detail & Related papers (2023-08-24T16:06:36Z) - A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction [12.558498579998862]
Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation.
We perform a thorough analysis of three sequence tagging models for causal knowledge extraction and compare them with a span-based approach to causality extraction.
Our experiments show that embeddings from pre-trained language models (e.g. BERT) provide a significant performance boost on this task.
arXiv Detail & Related papers (2023-08-07T19:50:59Z) - USB: A Unified Summarization Benchmark Across Tasks and Domains [68.82726887802856]
We introduce a Wikipedia-derived benchmark, complemented by a rich set of crowd-sourced annotations, that supports 8 interrelated tasks.
We compare various methods on this benchmark and discover that on multiple tasks, moderately-sized fine-tuned models consistently outperform much larger few-shot prompted language models.
arXiv Detail & Related papers (2023-05-23T17:39:54Z) - Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation [14.92157586545743]
This paper presents a number of techniques for making models more robust in the domain of causal reasoning.
We show a statistically significant improvement in performance on both datasets, even with only a small number of additionally generated data points.
arXiv Detail & Related papers (2021-01-13T09:55:29Z) - Domain Adaptative Causality Encoder [52.779274858332656]
We leverage the characteristics of dependency trees and adversarial learning to address the tasks of adaptive causality identification and localisation.
We present a new causality dataset, namely MedCaus, which integrates all types of causality in the text.
arXiv Detail & Related papers (2020-11-27T04:14:55Z) - Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets (Adult and Communities and Crime), and also on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four of these tasks, F-measure maximization yields improved micro-F1 scores, with absolute improvements of up to 8%, compared to models trained with the cross-entropy loss function.
arXiv Detail & Related papers (2020-08-08T03:02:27Z)
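For intuition, the sketch below shows a generic differentiable (soft) F1 objective of the kind summarized in the F-measure maximization entry above. It is an illustrative formulation, not necessarily the paper's exact loss, and the function name is hypothetical.

```python
# Illustrative soft-F1 loss: replaces hard counts with probability-weighted counts
# so the F-measure becomes differentiable and trainable with backpropagation.
import torch

def soft_f1_loss(probs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """probs: predicted probabilities in [0, 1]; targets: binary ground-truth labels."""
    tp = (probs * targets).sum()          # soft true positives
    fp = (probs * (1 - targets)).sum()    # soft false positives
    fn = ((1 - probs) * targets).sum()    # soft false negatives
    soft_f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1 - soft_f1                    # minimizing (1 - F1) maximizes F1

# Usage: probs = torch.sigmoid(logits); loss = soft_f1_loss(probs, labels.float()); loss.backward()
```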