A Generative Approach for Script Event Prediction via Contrastive
Fine-tuning
- URL: http://arxiv.org/abs/2212.03496v3
- Date: Fri, 9 Dec 2022 06:34:26 GMT
- Title: A Generative Approach for Script Event Prediction via Contrastive
Fine-tuning
- Authors: Fangqi Zhu, Jun Gao, Changlong Yu, Wei Wang, Chen Xu, Xin Mu, Min
Yang, Ruifeng Xu
- Abstract summary: Script event prediction aims to predict the subsequent event given the context.
Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge.
We propose a novel generative approach for this task, in which a pretrained language model is fine-tuned with an event-centric pretraining objective.
- Score: 35.87615178251874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Script event prediction aims to predict the subsequent event given the
context. This requires the capability to infer the correlations between events.
Recent works have attempted to improve event correlation reasoning by using
pretrained language models and incorporating external knowledge (e.g.,
discourse relations). Though promising results have been achieved, some
challenges still remain. First, the pretrained language models adopted by
current works ignore event-level knowledge, resulting in an inability to
capture the correlations between events well. Second, modeling correlations
between events with discourse relations is limited because it can only capture
explicit correlations between events with discourse markers, and cannot capture
many implicit correlations. To this end, we propose a novel generative approach
for this task, in which a pretrained language model is fine-tuned with an
event-centric pretraining objective and predicts the next event within a
generative paradigm. Specifically, we first introduce a novel event-level blank
infilling strategy as the learning objective to inject event-level knowledge
into the pretrained language model, and then design a likelihood-based
contrastive loss for fine-tuning the generative model. Instead of using an
additional prediction layer, we perform prediction by using sequence
likelihoods generated by the generative model. Our approach models correlations
between events in a soft way without any external knowledge. The
likelihood-based prediction eliminates the need to use additional networks to
make predictions and is somewhat interpretable since it scores each word in the
event. Experimental results on the multi-choice narrative cloze (MCNC) task
demonstrate that our approach achieves better results than other
state-of-the-art baselines. Our code will be available at
https://github.com/zhufq00/mcnc.
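The authors' released code at the URL above is authoritative. As a rough illustration only, a minimal sketch (assuming a BART-style generative model scored through HuggingFace `transformers`, which is an assumption and not taken from the paper's code) of ranking MCNC candidates by their sequence likelihood under the fine-tuned model might look like:

```python
# Minimal sketch: score each candidate next event by its length-normalized
# log-likelihood under a seq2seq generative PLM, then pick the best candidate.
# Assumptions: a BART-style model and the HuggingFace `transformers` library;
# the event context is the encoder input, the candidate event is the target.
import torch
import torch.nn.functional as F
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
model.eval()

def candidate_log_likelihood(context: str, candidate: str) -> float:
    """Length-normalized log-likelihood of `candidate` given `context`."""
    enc = tokenizer(context, return_tensors="pt")
    labels = tokenizer(candidate, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        # Passing `labels` makes the model decode the candidate tokens, so the
        # logits at step t are the distribution over the t-th candidate token.
        logits = model(input_ids=enc["input_ids"],
                       attention_mask=enc["attention_mask"],
                       labels=labels).logits  # (1, T, vocab)
    log_probs = F.log_softmax(logits, dim=-1)
    token_scores = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    return token_scores.mean().item()

def predict_next_event(context: str, candidates: list[str]) -> int:
    """Return the index of the highest-likelihood candidate event."""
    scores = [candidate_log_likelihood(context, c) for c in candidates]
    return max(range(len(candidates)), key=lambda i: scores[i])
```

During fine-tuning, the likelihood-based contrastive loss described in the abstract could, for instance, be realized as a margin loss that pushes the positive candidate's sequence score above those of the negative candidates; the exact formulation is the one defined in the paper.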
Related papers
- Semantic Pivoting Model for Effective Event Detection [19.205550116466604]
Event Detection aims to identify and classify mentions of event instances from unstructured articles.
Existing techniques for event detection only use homogeneous one-hot vectors to represent the event type classes, ignoring the fact that the semantic meaning of the types is important to the task.
We propose a Semantic Pivoting Model for Effective Event Detection (SPEED), which explicitly incorporates prior information during training and captures semantically meaningful correlations between input and events.
arXiv Detail & Related papers (2022-11-01T19:20:34Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, the next-event prediction models are trained with sequential data collected at one time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- Unifying Event Detection and Captioning as Sequence Generation via Pre-Training [53.613265415703815]
We propose a unified pre-training and fine-tuning framework to enhance the inter-task association between event detection and captioning.
Our model outperforms the state-of-the-art methods, and can be further boosted when pre-trained on extra large-scale video-text data.
arXiv Detail & Related papers (2022-07-18T14:18:13Z)
- A Graph Enhanced BERT Model for Event Prediction [35.02248467245135]
We consider automatically building an event graph using a BERT model.
We incorporate an additional structured variable into BERT to learn to predict the event connections in the training process.
Results on two event prediction tasks: script event prediction and story ending prediction, show that our approach can outperform state-of-the-art baseline methods.
arXiv Detail & Related papers (2022-05-22T13:37:38Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting the aspect term and category and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) in average performance by a large margin in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- ClarET: Pre-training a Correlation-Aware Context-To-Event Transformer for Event-Centric Generation and Classification [74.6318379374801]
We propose to pre-train a general Correlation-aware context-to-Event Transformer (ClarET) for event-centric reasoning.
The proposed ClarET is applicable to a wide range of event-centric reasoning scenarios.
arXiv Detail & Related papers (2022-03-04T10:11:15Z)
- An Explanation of In-context Learning as Implicit Bayesian Inference [117.19809377740188]
We study the role of the pretraining distribution on the emergence of in-context learning.
We prove that in-context learning occurs implicitly via Bayesian inference of the latent concept.
We empirically find that scaling model size improves in-context accuracy even when the pretraining loss is the same.
arXiv Detail & Related papers (2021-11-03T09:12:33Z)
- Modeling Preconditions in Text with a Crowd-sourced Dataset [17.828175478279654]
This paper introduces PeKo, a crowd-sourced annotation of preconditions between event pairs in newswire.
We also introduce two challenge tasks aimed at modeling preconditions.
Evaluation on both tasks shows that modeling preconditions is challenging even for today's large language models.
arXiv Detail & Related papers (2020-10-06T01:52:34Z)