Improve Event Extraction via Self-Training with Gradient Guidance
- URL: http://arxiv.org/abs/2205.12490v2
- Date: Wed, 2 Aug 2023 06:21:28 GMT
- Title: Improve Event Extraction via Self-Training with Gradient Guidance
- Authors: Zhiyang Xu, Jay-Yoon Lee, Lifu Huang
- Abstract summary: We propose a Self-Training with Feedback (STF) framework to overcome data scarcity, the main factor that hinders the progress of event extraction.
STF consists of (1) a base event extraction model trained on existing event annotations and then applied to large-scale unlabeled corpora to predict new event mentions as pseudo training samples, and (2) a novel scoring model that takes in each new predicted event trigger, an argument, its argument role, as well as their paths in the AMR graph to estimate a compatibility score.
Experimental results on three benchmark datasets, including ACE05-E, ACE05-E+, and ERE, demonstrate the effectiveness of the STF framework on event extraction.
- Score: 10.618929821822892
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data scarcity has been the main factor that hinders the progress of event
extraction. To overcome this issue, we propose a Self-Training with Feedback
(STF) framework that leverages the large-scale unlabeled data and acquires
feedback for each new event prediction from the unlabeled data by comparing it
to the Abstract Meaning Representation (AMR) graph of the same sentence.
Specifically, STF consists of (1) a base event extraction model trained on
existing event annotations and then applied to large-scale unlabeled corpora to
predict new event mentions as pseudo training samples, and (2) a novel scoring
model that takes in each new predicted event trigger, an argument, its argument
role, as well as their paths in the AMR graph to estimate a compatibility score
indicating the correctness of the pseudo label. The compatibility scores
further act as feedback to encourage or discourage the model learning on the
pseudo labels during self-training. Experimental results on three benchmark
datasets, including ACE05-E, ACE05-E+, and ERE, demonstrate the effectiveness
of the STF framework on event extraction, especially event argument extraction,
with significant performance gain over the base event extraction models and
strong baselines. Our experimental analysis further shows that STF is a generic
framework as it can be applied to improve most, if not all, event extraction
models by leveraging large-scale unlabeled data, even when high-quality AMR
graph annotations are not available.
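The feedback idea in the abstract, where a compatibility score encourages or discourages learning on a pseudo label, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the assumption that scores lie in [-1, 1], and the plain score-weighted cross-entropy are all hypothetical choices made for illustration. A positive score pushes the model toward the pseudo label, a negative score pushes it away, and a zero score leaves the model unchanged.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def feedback_weighted_loss(logits, pseudo_label, score):
    """Cross-entropy on the pseudo label, scaled by a compatibility score.

    Assumption: score is in [-1, 1]; its sign decides whether the
    pseudo label is encouraged (positive) or discouraged (negative).
    """
    probs = softmax(logits)
    return -score * math.log(probs[pseudo_label])

def feedback_gradient(logits, pseudo_label, score):
    """Gradient of the loss above w.r.t. the logits: score * (p - onehot)."""
    probs = softmax(logits)
    return [
        score * (p - (1.0 if i == pseudo_label else 0.0))
        for i, p in enumerate(probs)
    ]

def sgd_step(logits, pseudo_label, score, lr=0.5):
    """One gradient-descent step on the logits (a stand-in for model parameters)."""
    grad = feedback_gradient(logits, pseudo_label, score)
    return [x - lr * g for x, g in zip(logits, grad)]

if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0]   # hypothetical scores over three argument roles
    label = 1                  # pseudo label predicted by the base model
    for score in (1.0, 0.0, -1.0):
        updated = sgd_step(logits, label, score)
        print(score, round(softmax(updated)[label], 4))
```

Note that with a negative score the loss is unbounded below, so a practical system would likely need clipping or a bounded objective; the sketch omits this for brevity.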
Related papers
- Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems [17.10762463903638]
We train evaluation models to approximate human evaluation, achieving high agreement.
We propose a weak-to-strong supervision method that uses a fraction of the annotated data to train an evaluation model.
arXiv Detail & Related papers (2024-06-26T10:48:14Z)
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- TRIAGE: Characterizing and auditing training data for improved regression [80.11415390605215]
We introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors.
TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score.
We show that TRIAGE's characterization is consistent and highlight its utility to improve performance via data sculpting/filtering, in multiple regression settings.
arXiv Detail & Related papers (2023-10-29T10:31:59Z)
- Exploring the Limits of Historical Information for Temporal Knowledge Graph Extrapolation [59.417443739208146]
We propose CENET, a new event forecasting model based on a novel training framework of historical contrastive learning.
CENET learns both historical and non-historical dependencies to identify the most likely entities.
We evaluate our proposed model on five benchmark graphs.
arXiv Detail & Related papers (2023-08-29T03:26:38Z)
- Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources as labeling functions.
Existing statistical label models typically rely only on the outputs of the labeling functions (LFs), ignoring the instance features when modeling the underlying generative process.
arXiv Detail & Related papers (2022-10-06T07:28:53Z)
- CEP3: Community Event Prediction with Neural Point Process on Graph [59.434777403325604]
We propose a novel model combining Graph Neural Networks and Marked Temporal Point Process (MTPP).
Our experiments demonstrate the superior performance of our model in terms of both model accuracy and training efficiency.
arXiv Detail & Related papers (2022-05-21T15:30:25Z)
- ERGO: Event Relational Graph Transformer for Document-level Event Causality Identification [24.894074201193927]
Document-level Event Causality Identification (DECI) aims to identify causal relations between event pairs in a document.
We propose a novel Event Relational Graph TransfOrmer (ERGO) framework for DECI.
arXiv Detail & Related papers (2022-04-15T12:12:16Z)
- Behind the Scenes: An Exploration of Trigger Biases Problem in Few-Shot Event Classification [24.598938900747186]
Few-Shot Event Classification (FSEC) aims at developing a model for event prediction that can generalize to new event types with a limited amount of annotated data.
We find existing FSEC models suffer from trigger biases that signify the statistical homogeneity between some trigger words and target event types.
To cope with the context-bypassing problem in FSEC models, we introduce adversarial training and trigger reconstruction techniques.
arXiv Detail & Related papers (2021-08-29T13:46:42Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
- Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion [5.566928318239452]
Joint event and causality extraction is a challenging yet essential task in information retrieval and data mining.
We propose convolutional knowledge infusion for frequent n-grams with windows of different lengths within a joint extraction framework.
Our model significantly outperforms the strong BERT+CSNN baseline.
arXiv Detail & Related papers (2021-02-19T13:31:46Z)
- Few-Shot Event Detection with Prototypical Amortized Conditional Random Field [8.782210889586837]
Event Detection tends to struggle when it needs to recognize novel event types with a few samples.
We present a novel unified joint model which converts the task to a few-shot tagging problem with a double-part tagging scheme.
We conduct experiments on the benchmark dataset FewEvent, and the experimental results show that the tagging-based methods are better than existing pipeline and joint learning methods.
arXiv Detail & Related papers (2020-12-04T01:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.