Prompt2Gaussia: Uncertain Prompt-learning for Script Event Prediction
- URL: http://arxiv.org/abs/2308.02103v1
- Date: Fri, 4 Aug 2023 01:34:46 GMT
- Title: Prompt2Gaussia: Uncertain Prompt-learning for Script Event Prediction
- Authors: Shiyao Cui, Xin Cong, Jiawei Sheng, Xuebin Wang, Tingwen Liu, Jinqiao Shi
- Abstract summary: Script Event Prediction (SEP) aims to predict the subsequent event for a given event chain from a candidate list.
We consider public pre-trained language models as knowledge bases and automatically mine the script-related knowledge via prompt-learning.
Our method, which benefits from knowledge evoked from pre-trained language models, outperforms prior baselines by 1.46% and 1.05% on two benchmarks.
- Score: 11.54608099442562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Script Event Prediction (SEP) aims to predict the subsequent event for a
given event chain from a candidate list. Prior research has achieved great
success by integrating external knowledge to enhance the semantics, but it is
laborious to acquire the appropriate knowledge resources and retrieve the
script-related knowledge. In this paper, we regard public pre-trained language
models as knowledge bases and automatically mine the script-related knowledge
via prompt-learning. However, the scenario diversity and label ambiguity in
scripts make it uncertain how to construct the most effective prompt and label
tokens in prompt-learning, i.e., prompt-uncertainty and verbalizer-uncertainty.
Considering the innate ability of the Gaussian distribution to express uncertainty,
we model the prompt tokens and label tokens as random variables following
Gaussian distributions, where a prompt estimator and a verbalizer estimator are
proposed to estimate their probabilistic representations instead of
deterministic representations. We are the first to explore prompt-learning for
SEP and offer a fresh perspective for enriching script semantics. Our method
is evaluated on the most widely used benchmark and a newly proposed large-scale
one. Experiments show that our method, which benefits from knowledge evoked
from pre-trained language models, outperforms prior baselines by 1.46% and
1.05% on two benchmarks, respectively.
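
To make the abstract's mechanism concrete, below is a minimal, illustrative PyTorch sketch of treating prompt and verbalizer (label) tokens as diagonal-Gaussian random variables with reparameterized sampling; the module names, dimensions, and the KL-to-standard-normal regularizer are our assumptions, not the authors' released implementation.

```python
# Minimal sketch (our assumptions, not the paper's code): prompt tokens and
# verbalizer (label) tokens are diagonal-Gaussian random variables, sampled
# with the reparameterization trick so their embeddings carry uncertainty.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaussianPrompt(nn.Module):
    """Soft prompt whose token embeddings follow diagonal Gaussians."""

    def __init__(self, num_tokens: int, hidden_dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_tokens, hidden_dim) * 0.02)
        self.log_var = nn.Parameter(torch.zeros(num_tokens, hidden_dim))

    def forward(self, batch_size: int, sample: bool = True) -> torch.Tensor:
        tokens = self.mu
        if sample:  # reparameterization: mu + sigma * eps
            tokens = tokens + torch.exp(0.5 * self.log_var) * torch.randn_like(self.mu)
        return tokens.unsqueeze(0).expand(batch_size, -1, -1)

    def kl_to_standard_normal(self) -> torch.Tensor:
        # KL(N(mu, sigma^2) || N(0, I)); an assumed regularizer that keeps the
        # probabilistic prompt close to a standard-normal prior
        return 0.5 * (self.mu.pow(2) + self.log_var.exp() - 1.0 - self.log_var).sum()


class GaussianVerbalizer(nn.Module):
    """Label-token embeddings treated as Gaussian variables as well."""

    def __init__(self, num_labels: int, hidden_dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_labels, hidden_dim) * 0.02)
        self.log_var = nn.Parameter(torch.zeros(num_labels, hidden_dim))

    def forward(self, mask_hidden: torch.Tensor, sample: bool = True) -> torch.Tensor:
        label_emb = self.mu
        if sample:
            label_emb = label_emb + torch.exp(0.5 * self.log_var) * torch.randn_like(self.mu)
        # score each candidate event against the [MASK]-position representation
        return mask_hidden @ label_emb.t()  # (batch, num_labels)


# Toy usage: 8 prompt tokens, 5 candidate events, 768-d PLM hidden states.
prompt = GaussianPrompt(num_tokens=8, hidden_dim=768)
verbalizer = GaussianVerbalizer(num_labels=5, hidden_dim=768)
prompt_embeds = prompt(batch_size=4)      # would be prepended to the PLM input embeddings
logits = verbalizer(torch.randn(4, 768))  # stand-in for the [MASK] hidden states
loss = F.cross_entropy(logits, torch.tensor([0, 1, 2, 3])) + 1e-4 * prompt.kl_to_standard_normal()
```

In such a setup, sampling is typically used only during training so the embeddings carry uncertainty, while inference passes sample=False and uses the Gaussian means for deterministic prediction.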
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- Self-Evolution Learning for Discriminative Language Model Pretraining [103.57103957631067]
Self-Evolution learning (SE) is a simple and effective token masking and learning method.
SE focuses on learning the informative yet under-explored tokens and adaptively regularizes the training by introducing a novel Token-specific Label Smoothing approach.
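
As a rough, hedged illustration of the label-smoothing ingredient mentioned above (the exact Token-specific Label Smoothing rule is defined in the cited paper), a per-token smoothing factor can be folded into the usual smoothed cross-entropy as follows; the confidence-based schedule for eps is purely a placeholder.

```python
# Hypothetical sketch of per-token label smoothing (not the SE paper's exact rule):
# each masked position gets its own smoothing factor instead of a single global one.
import torch
import torch.nn.functional as F


def token_specific_label_smoothing_loss(logits, targets, eps_per_token):
    """logits: (N, V); targets: (N,); eps_per_token: (N,) smoothing factor per token."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # -log p(target)
    uniform = -log_probs.mean(dim=-1)                            # uniform-smoothing term
    return ((1.0 - eps_per_token) * nll + eps_per_token * uniform).mean()


# Placeholder schedule: smooth low-confidence (more informative) positions more strongly.
logits = torch.randn(6, 30522)                 # 6 masked positions, BERT-sized vocabulary
targets = torch.randint(0, 30522, (6,))
confidence = F.softmax(logits, dim=-1).max(dim=-1).values
loss = token_specific_label_smoothing_loss(logits, targets, 0.1 * (1.0 - confidence))
```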
arXiv Detail & Related papers (2023-05-24T16:00:54Z)
- LaMPP: Language Models as Probabilistic Priors for Perception and Action [38.07277869107474]
We show how to leverage language models for non-linguistic perception and control tasks.
Our approach casts labeling and decision-making as inference in probabilistic graphical models.
arXiv Detail & Related papers (2023-02-03T15:14:04Z)
- Don't Be So Sure! Boosting ASR Decoding via Confidence Relaxation [7.056222499095849]
Beam search seeks the transcript with the greatest likelihood computed from the predicted distribution.
We show that recently proposed Self-Supervised Learning (SSL)-based ASR models tend to yield exceptionally confident predictions.
We propose a decoding procedure that improves the performance of fine-tuned ASR models.
arXiv Detail & Related papers (2022-12-27T06:42:26Z)
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z)
- LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models [67.19124099815645]
We propose a novel Language-Aware Soft Prompting (LASP) learning method to alleviate base class overfitting.
LASP is inherently amenable to including, during training, virtual classes, i.e. class names for which no visual samples are available.
LASP matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for 8 out of 11 test datasets.
arXiv Detail & Related papers (2022-10-03T17:56:35Z)
- DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning [18.838291575019504]
Pre-trained language models (PLMs) are shown to be lacking in knowledge when dealing with knowledge-driven tasks.
We propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge.
We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE.
arXiv Detail & Related papers (2022-08-01T06:43:19Z)
- Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z)
- Probing Script Knowledge from Pre-Trained Models [24.80244106746926]
We design three probing tasks: inclusive sub-event selection, starting sub-event selection and temporal ordering.
The three probing tasks can be further used to automatically induce a script for each main event given all the possible sub-events.
Taking BERT as a case study, we conclude that the stereotypical temporal knowledge among the sub-events is well captured in BERT.
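
A tiny probe in this spirit (the template, scoring rule, and example below are ours, not the cited paper's protocol) ranks two temporal orderings of sub-events by BERT's pseudo-log-likelihood.

```python
# Illustrative temporal-ordering probe (our construction): prefer the ordering
# of sub-events to which BERT assigns the higher pseudo-log-likelihood.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()


@torch.no_grad()
def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log p(token | rest) with each token masked in turn."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, ids.size(0) - 1):            # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total


# Example main event "bake a cake" with two candidate sub-event orderings.
a = "To bake a cake, you mix the batter and then put it in the oven."
b = "To bake a cake, you put it in the oven and then mix the batter."
print("prefers A" if pseudo_log_likelihood(a) > pseudo_log_likelihood(b) else "prefers B")
```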
arXiv Detail & Related papers (2022-04-16T05:13:39Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).
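
To ground the retrieve-and-attend idea in code, here is a deliberately simplified sketch (our construction, not REALM's implementation): retrieve documents by dense inner product, then marginalize the answer distribution over the retrieved set, p(y|x) = sum_z p(y|x,z) p(z|x).

```python
# Simplified retrieve-then-marginalize sketch (not REALM's actual architecture).
import torch

torch.manual_seed(0)
d, num_docs, k = 128, 1000, 5
doc_embeds = torch.randn(num_docs, d)              # stand-in for a pre-computed document index

def retrieve(query_embed: torch.Tensor, top_k: int = k):
    scores = doc_embeds @ query_embed              # inner-product relevance
    top = scores.topk(top_k)
    p_z_given_x = torch.softmax(top.values, dim=-1)  # retrieval distribution p(z|x)
    return top.indices, p_z_given_x

def answer_distribution(query_embed: torch.Tensor, reader) -> torch.Tensor:
    doc_ids, p_z = retrieve(query_embed)
    # reader(query, doc) -> p(y | x, z); here a stand-in callable
    p_y_given_xz = torch.stack([reader(query_embed, doc_embeds[i]) for i in doc_ids])
    return (p_z.unsqueeze(-1) * p_y_given_xz).sum(dim=0)  # marginalize over z

# Toy reader over a 10-way answer space, just so the example runs end to end.
toy_reader = lambda q, doc: torch.softmax(torch.randn(10), dim=-1)
print(answer_distribution(torch.randn(d), toy_reader))
```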
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.