SPE: Symmetrical Prompt Enhancement for Fact Probing
- URL: http://arxiv.org/abs/2211.07078v1
- Date: Mon, 14 Nov 2022 03:05:41 GMT
- Title: SPE: Symmetrical Prompt Enhancement for Fact Probing
- Authors: Yiyuan Li, Tong Che, Yezhen Wang, Zhengbao Jiang, Caiming Xiong,
Snigdha Chaturvedi
- Abstract summary: We propose a continuous prompt-based method for factual probing in pretrained language models (PLMs).
Our results on a popular factual probing dataset, LAMA, show significant improvement of SPE over previous probing methods.
- Score: 81.82104239636574
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Pretrained language models (PLMs) have been shown to accumulate factual
knowledge during pretraining (Petroni et al., 2019). Recent works probe PLMs
for the extent of this knowledge through prompts either in discrete or
continuous forms. However, these methods do not consider symmetry of the task:
object prediction and subject prediction. In this work, we propose Symmetrical
Prompt Enhancement (SPE), a continuous prompt-based method for factual probing
in PLMs that leverages the symmetry of the task by constructing symmetrical
prompts for subject and object prediction. Our results on a popular factual
probing dataset, LAMA, show significant improvement of SPE over previous
probing methods.
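The core idea above, probing the same fact in both directions with learned continuous prompts, can be illustrated with a minimal sketch. Everything below (the BERT-style masked LM, prompt length, template layout, and single-token targets as in LAMA) is an illustrative assumption, not the authors' released implementation.

```python
# Minimal sketch of symmetrical continuous prompts for factual probing.
# Assumes a BERT-style masked LM and single-token targets (as in LAMA);
# prompt length, templates, and loss weighting are illustrative choices.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
for p in model.parameters():              # freeze the PLM; only the prompts are trained
    p.requires_grad_(False)

embed = model.get_input_embeddings()
hidden = embed.embedding_dim
prompt_len = 5

# One soft prompt per direction: subject -> object and object -> subject.
obj_prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
subj_prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)

def build_inputs(known_entity, soft_prompt):
    """Embed "[CLS] <entity> <soft prompt> [MASK] [SEP]"; return embeddings and the [MASK] index."""
    ent_ids = tokenizer(known_entity, add_special_tokens=False, return_tensors="pt").input_ids[0]
    special = lambda tid: embed(torch.tensor([tid]))
    pieces = [special(tokenizer.cls_token_id), embed(ent_ids), soft_prompt,
              special(tokenizer.mask_token_id), special(tokenizer.sep_token_id)]
    inputs_embeds = torch.cat(pieces, dim=0).unsqueeze(0)
    mask_pos = 1 + ent_ids.size(0) + prompt_len
    return inputs_embeds, mask_pos

def mask_loss(inputs_embeds, mask_pos, target_word):
    logits = model(inputs_embeds=inputs_embeds).logits[0, mask_pos]
    target_id = tokenizer(target_word, add_special_tokens=False).input_ids[0]
    return nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([target_id]))

# One (subject, object) pair from a relation such as "born-in".
subject, obj = "Dante", "Florence"
loss = mask_loss(*build_inputs(subject, obj_prompt), obj) \
     + mask_loss(*build_inputs(obj, subj_prompt), subject)   # the symmetrical direction
loss.backward()                            # gradients reach only the two soft prompts
```

In this toy setup, optimizing both directions jointly is what the symmetry buys: predicting the subject from the object provides an extra training signal for the same fact.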
Related papers
- Making Pre-trained Language Models both Task-solvers and Self-calibrators [52.98858650625623]
Pre-trained language models (PLMs) serve as backbones for various real-world systems, which require not only accurate predictions but also well-calibrated confidence estimates.
Previous work shows that introducing an extra calibration task can mitigate miscalibration.
We propose a training algorithm, LM-TOAST, to tackle the challenge of making PLMs both task-solvers and self-calibrators.
arXiv Detail & Related papers (2023-07-21T02:51:41Z)
- Improved Representation of Asymmetrical Distances with Interval Quasimetric Embeddings [45.69333765438636]
Asymmetrical distance structures (quasimetrics) are ubiquitous in our lives and are gaining more attention in machine learning applications.
We identify four desirable properties of such quasimetric models and show how prior works fail to satisfy them.
We propose Interval Quasimetric Embedding (IQE), which is designed to satisfy all four criteria.
arXiv Detail & Related papers (2022-11-28T08:22:26Z)
- ADEPT: A DEbiasing PrompT Framework [49.582497203415855]
Finetuning is one applicable approach for debiasing contextualized word embeddings.
Discrete prompts with semantic meanings have also been shown to be effective in debiasing tasks.
We propose ADEPT, a method to debias PLMs using prompt tuning while maintaining the delicate balance between removing biases and ensuring representation ability.
arXiv Detail & Related papers (2022-11-10T08:41:40Z)
- Pre-training Language Models with Deterministic Factual Knowledge [42.812774794720895]
We propose to let PLMs learn the deterministic relationship between the remaining context and the masked content.
Two pre-training tasks are introduced to motivate PLMs to rely on the deterministic relationship when filling masks.
Experiments indicate that the continuously pre-trained PLMs achieve better robustness in capturing factual knowledge.
arXiv Detail & Related papers (2022-10-20T11:04:09Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5%-1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
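As a rough illustration of what instance-dependent prompting can look like, the sketch below uses a small generator network to map each input's token embeddings to soft prompt vectors that are prepended to the sequence before a frozen backbone. The backbone choice, generator architecture, and pooling are assumptions made for the example, not the IPT authors' design.

```python
# Hypothetical instance-dependent prompting: a generator turns a cheap summary
# of the input into soft prompt vectors. Architecture choices are illustrative.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
backbone = AutoModel.from_pretrained("bert-base-uncased")
hidden = backbone.config.hidden_size
prompt_len = 4

for p in backbone.parameters():             # frozen backbone; only the generator trains
    p.requires_grad_(False)

prompt_generator = nn.Sequential(           # instance representation -> prompt vectors
    nn.Linear(hidden, hidden),
    nn.Tanh(),
    nn.Linear(hidden, prompt_len * hidden),
)

def encode_with_instance_prompt(text):
    enc = tokenizer(text, return_tensors="pt")
    token_embeds = backbone.get_input_embeddings()(enc.input_ids)   # (1, T, H)
    instance_repr = token_embeds.mean(dim=1)                        # cheap per-instance summary
    prompts = prompt_generator(instance_repr).view(1, prompt_len, hidden)
    inputs_embeds = torch.cat([prompts, token_embeds], dim=1)       # prepend soft prompts
    attention_mask = torch.cat(
        [torch.ones(1, prompt_len, dtype=enc.attention_mask.dtype), enc.attention_mask], dim=1)
    out = backbone(inputs_embeds=inputs_embeds, attention_mask=attention_mask)
    return out.last_hidden_state[:, 0]       # feed this into a downstream task head

print(encode_with_instance_prompt("Prompts can depend on the instance.").shape)
```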
- Prompt Tuning for Discriminative Pre-trained Language Models [96.04765512463415]
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned.
We present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem.
arXiv Detail & Related papers (2022-05-23T10:11:50Z)
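A hedged sketch of what prompting a discriminative (ELECTRA-style) PLM can look like: candidate label words are filled into a template and the replaced-token-detection head scores how "original" each filled-in candidate appears. The template, verbalizers, and scoring rule are assumptions for illustration rather than DPT's exact formulation.

```python
# Illustrative discriminative prompting with ELECTRA's replaced-token-detection
# head: pick the candidate the discriminator finds most "original".
import torch
from transformers import ElectraTokenizer, ElectraForPreTraining

tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")

def score_candidates(text, template, candidates):
    best, best_score = None, float("inf")
    for cand in candidates:
        prompt = template.format(text=text, label=cand)
        enc = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits[0]                   # per-token "replaced" logits
        cand_ids = tokenizer(cand, add_special_tokens=False).input_ids
        ids = enc.input_ids[0].tolist()
        for i in range(len(ids) - len(cand_ids) + 1):         # locate the candidate span
            if ids[i:i + len(cand_ids)] == cand_ids:
                replaced_score = logits[i:i + len(cand_ids)].mean().item()
                break
        else:
            continue
        if replaced_score < best_score:                       # lower = looks more original
            best, best_score = cand, replaced_score
    return best

print(score_candidates("The movie was a delight.", "{text} It was {label}.", ["great", "terrible"]))
```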
- Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z)