KnowDis: Knowledge Enhanced Data Augmentation for Event Causality
Detection via Distant Supervision
- URL: http://arxiv.org/abs/2010.10833v1
- Date: Wed, 21 Oct 2020 08:44:54 GMT
- Title: KnowDis: Knowledge Enhanced Data Augmentation for Event Causality
Detection via Distant Supervision
- Authors: Xinyu Zuo, Yubo Chen, Kang Liu, Jun Zhao
- Abstract summary: We investigate a data augmentation framework for event causality detection (ECD), dubbed Knowledge Enhanced Distant Data Augmentation (KnowDis).
KnowDis augments the available training data via distant supervision, assisted by lexical and causal commonsense knowledge for ECD.
With the help of automatically labeled training data, our method outperforms previous methods by a large margin.
- Score: 23.533310981207446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern models of event causality detection (ECD) are mainly based on
supervised learning from small hand-labeled corpora. However, hand-labeled
training data is expensive to produce, covers few causal expressions, and is
limited in size, which makes it hard for supervised methods to detect causal
relations between events. To address this data scarcity, we investigate a data
augmentation framework for ECD, dubbed as Knowledge Enhanced Distant Data
Augmentation (KnowDis). Experimental results on two benchmark datasets, the
EventStoryLine corpus and Causal-TimeBank, show that 1) KnowDis can augment the
available training data via distant supervision, assisted by lexical and causal
commonsense knowledge for ECD, and 2) with the help of automatically labeled
training data, our method outperforms previous methods by a large margin.
Related papers
- Extracting Training Data from Unconditional Diffusion Models [76.85077961718875]
Diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI).
We aim to establish a theoretical understanding of memorization in DPMs with 1) a memorization metric for theoretical analysis, 2) an analysis of conditional memorization with informative and random labels, and 3) two better evaluation metrics for measuring memorization.
Based on the theoretical analysis, we propose a novel data extraction method called Surrogate condItional Data Extraction (SIDE) that leverages a classifier trained on generated data as a surrogate condition to extract training data directly from unconditional diffusion models.
arXiv Detail & Related papers (2024-06-18T16:20:12Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- Weakly Supervised Anomaly Detection via Knowledge-Data Alignment [24.125871437370357]
Anomaly detection plays a pivotal role in numerous web-based applications, including malware detection, anti-money laundering, device failure detection, and network fault analysis.
Weakly Supervised Anomaly Detection (WSAD) has been introduced with a limited number of labeled anomaly samples to enhance model performance.
We introduce a novel framework Knowledge-Data Alignment (KDAlign) to integrate rule knowledge, typically summarized by human experts, to supplement the limited labeled data.
arXiv Detail & Related papers (2024-02-06T07:57:13Z)
- Self-Supervised Learning for Data Scarcity in a Fatigue Damage Prognostic Problem [0.0]
Self-Supervised Learning is a sub-category of unsupervised learning approaches.
This paper investigates whether pre-training DL models in a self-supervised way on unlabelled sensor data can be useful for Remaining Useful Life (RUL) estimation.
Results show that the self-supervised pre-trained models are able to significantly outperform the non-pre-trained models in the downstream RUL prediction task.
arXiv Detail & Related papers (2023-01-20T06:45:32Z)
- Deep Anomaly Detection and Search via Reinforcement Learning [22.005663849044772]
We propose Deep Anomaly Detection and Search (DADS) to balance exploitation and exploration.
During the training process, DADS searches for possible anomalies with hierarchically-structured datasets.
Results show that DADS can efficiently and precisely search anomalies from unlabeled data and learn from them.
arXiv Detail & Related papers (2022-08-31T13:03:33Z)
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference with pristine-quality reference.
Unlabeled data can be easily collected from an image degradation or restoration process, making it encouraging to exploit unlabeled training data to boost FR-IQA performance.
In this paper, we propose incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z)
- Federated Causal Discovery [74.37739054932733]
This paper develops a gradient-based learning framework named DAG-Shared Federated Causal Discovery (DS-FCD).
It can learn the causal graph without directly touching local data and naturally handles data heterogeneity.
Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
arXiv Detail & Related papers (2021-12-07T08:04:12Z)
- Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement [17.77752074834281]
We propose CauSeRL, which leverages external causal statements for event causality identification.
First of all, we design a self-supervised framework to learn context-specific causal patterns from external causal statements.
We adopt a contrastive transfer strategy to incorporate the learned context-specific causal patterns into the target ECI model.
arXiv Detail & Related papers (2021-06-03T07:50:50Z)
- LearnDA: Learnable Knowledge-Guided Data Augmentation for Event Causality Identification [17.77752074834281]
We introduce a new approach to augment training data for event causality identification.
Our approach is knowledge-guided, which can leverage existing knowledge bases to generate well-formed new sentences.
On the other hand, our approach employs a dual mechanism, which is a learnable augmentation framework and can interactively adjust the generation process to generate task-related sentences.
arXiv Detail & Related papers (2021-06-03T07:42:20Z)
- Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.