Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer
- URL: http://arxiv.org/abs/2201.05411v1
- Date: Fri, 14 Jan 2022 12:04:37 GMT
- Title: Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer
- Authors: Yinyi Wei, Tong Mo, Yongtao Jiang, Weiping Li, Wen Zhao
- Abstract summary: In this paper, we focus on eliciting knowledge from pretrained language models and propose a prototypical prompt verbalizer for prompt-tuning.
For zero-shot settings, knowledge is elicited from pretrained language models by a manually designed template to form initial prototypical embeddings.
For few-shot settings, models are tuned to learn meaningful and interpretable prototypical embeddings.
- Score: 12.596033546002321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in prompt-tuning cast few-shot classification tasks as a masked language modeling problem. By wrapping the input in a template and using a verbalizer that maps the label space to a label word space, prompt-tuning can achieve excellent results in zero-shot and few-shot scenarios. However, typical prompt-tuning needs a manually designed verbalizer, which requires domain expertise and human effort, and an insufficient label word space may introduce considerable bias into the results. In this paper, we focus on eliciting knowledge from pretrained language models and propose a prototypical prompt verbalizer for prompt-tuning. Labels are represented by prototypical embeddings in the feature space rather than by discrete words. The distances between the embedding at the masked position of the input and the prototypical embeddings are used as the classification criterion. For zero-shot settings, knowledge is elicited from pretrained language models by a manually designed template to form initial prototypical embeddings. For few-shot settings, models are tuned to learn meaningful and interpretable prototypical embeddings. Our method optimizes models by contrastive learning. Extensive experimental results on several many-class text classification datasets with low-resource settings demonstrate the effectiveness of our approach compared with other verbalizer construction methods. Our implementation is available at https://github.com/Ydongd/prototypical-prompt-verbalizer.
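To make the classification rule concrete, the following is a minimal sketch, not the authors' implementation (that is in the linked repository): the embedding at the [MASK] position is compared to one prototypical embedding per label by cosine similarity, a contrastive (InfoNCE-style) loss pulls it toward its class prototype during few-shot tuning, and one plausible zero-shot initialization elicits prototypes from the PLM with a manually designed template. The backbone, template, temperature, and helper names are assumptions.

```python
# A minimal sketch of the prototypical prompt verbalizer idea, NOT the authors'
# implementation (see https://github.com/Ydongd/prototypical-prompt-verbalizer).
# Backbone, template, temperature, and label descriptions are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"                      # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

def mask_embedding(text: str) -> torch.Tensor:
    """Wrap the input in a template and return the hidden state at [MASK]."""
    prompt = f"{text} It was about [MASK]."           # illustrative template
    enc = tokenizer(prompt, return_tensors="pt")
    out = model(**enc, output_hidden_states=True)
    mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    return out.hidden_states[-1][0, mask_pos]         # (hidden_size,)

num_classes = 4
# Few-shot settings: one learnable prototypical embedding per label.
prototypes = torch.nn.Parameter(torch.randn(num_classes, model.config.hidden_size))

def init_prototypes_zero_shot(label_descriptions: list[str]) -> torch.Tensor:
    """One plausible zero-shot initialization: run short label descriptions
    through the template and use the resulting [MASK] states as the initial
    prototypical embeddings."""
    with torch.no_grad():
        return torch.stack([mask_embedding(d) for d in label_descriptions])

def class_scores(text: str, temperature: float = 0.1) -> torch.Tensor:
    """Cosine similarity between the [MASK] embedding and every prototype;
    the nearest prototype gives the predicted label."""
    h = F.normalize(mask_embedding(text), dim=-1)
    p = F.normalize(prototypes, dim=-1)
    return (p @ h) / temperature                      # (num_classes,)

def contrastive_loss(text: str, label: int) -> torch.Tensor:
    """InfoNCE-style objective: pull the [MASK] embedding toward its class
    prototype and push it away from the other prototypes."""
    return F.cross_entropy(class_scores(text).unsqueeze(0), torch.tensor([label]))
```

A prediction is then `class_scores(text).argmax()`; during few-shot tuning both the backbone and the prototypes receive gradients from `contrastive_loss`.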
Related papers
- Manual Verbalizer Enrichment for Few-Shot Text Classification [1.860409237919611]
MAVE is an approach for verbalizer construction by enrichment of class labels.
Our model achieves state-of-the-art results while using significantly fewer resources.
arXiv Detail & Related papers (2024-10-08T16:16:47Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification [65.51149771074944]
MetricPrompt eases verbalizer design difficulty by reformulating the few-shot text classification task as a text pair relevance estimation task.
We conduct experiments on three widely used text classification datasets across four few-shot settings.
Results show that MetricPrompt outperforms manual verbalizers and other automatic verbalizer design methods across all few-shot settings.
arXiv Detail & Related papers (2023-06-15T06:51:35Z)
- CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification [57.62886091828512]
We propose a brand-new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix), for many-class classification.
Basically, an instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, is leveraged to complement the language verbalizers in many-class classification.
arXiv Detail & Related papers (2022-11-11T03:45:59Z)
- Don't Prompt, Search! Mining-based Zero-Shot Learning with Language Models [37.8952605358518]
Masked language models like BERT can perform text classification in a zero-shot fashion.
We propose an alternative mining-based approach for zero-shot learning.
arXiv Detail & Related papers (2022-10-26T15:52:30Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Prototypical Verbalizer for Prompt-based Few-shot Tuning [32.74024339482436]
We propose a verbalizer (ProtoVerb) that is built directly from training data.
ProtoVerb learns prototype vectors as prototypical verbalizers by contrastive learning.
We conduct experiments on both topic classification and entity typing tasks, and the results demonstrate that ProtoVerb significantly outperforms current automatic verbalizers.
arXiv Detail & Related papers (2022-03-18T07:07:56Z)
- Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification [68.3291372168167]
We focus on incorporating external knowledge into the verbalizer, forming knowledgeable prompt-tuning (KPT).
We expand the label word space of the verbalizer using external knowledge bases (KBs) and refine the expanded label word space with the PLM itself before predicting with it.
Experiments on zero- and few-shot text classification tasks demonstrate the effectiveness of knowledgeable prompt-tuning. (A minimal label-word aggregation sketch follows this list.)
arXiv Detail & Related papers (2021-08-04T13:00:16Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
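For the knowledgeable prompt-tuning entry above, the sketch below illustrates, under assumptions, how an expanded label word space can be used at prediction time: the masked-LM logits of several label words per class are averaged and the highest-scoring class wins. The label words, template, and backbone are illustrative and not taken from the cited paper; KPT's KB-based expansion and refinement steps are not reproduced here.

```python
# Illustrative multi-word verbalizer aggregation (an assumption-laden sketch,
# not KPT's actual pipeline): average masked-LM logits over several label
# words per class and pick the best-scoring class.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"                      # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical expanded label word space (KPT would derive this from KBs).
label_words = {
    "sports":   ["sports", "football", "basketball", "tennis"],
    "politics": ["politics", "government", "election", "policy"],
}

def classify(text: str) -> str:
    prompt = f"{text} This topic is about [MASK]."    # illustrative template
    enc = tokenizer(prompt, return_tensors="pt")
    mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**enc).logits[0, mask_pos]     # (vocab_size,)
    scores = {
        label: logits[[tokenizer.convert_tokens_to_ids(w) for w in words]].mean().item()
        for label, words in label_words.items()
    }
    return max(scores, key=scores.get)

print(classify("The team clinched the championship in overtime."))
```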
This list is automatically generated from the titles and abstracts of the papers in this site.