Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer
- URL: http://arxiv.org/abs/2201.05411v1
- Date: Fri, 14 Jan 2022 12:04:37 GMT
- Title: Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer
- Authors: Yinyi Wei, Tong Mo, Yongtao Jiang, Weiping Li, Wen Zhao
- Abstract summary: In this paper, we focus on eliciting knowledge from pretrained language models and propose a prototypical prompt verbalizer for prompt-tuning.
For zero-shot settings, knowledge is elicited from pretrained language models by a manually designed template to form initial prototypical embeddings.
For few-shot settings, models are tuned to learn meaningful and interpretable prototypical embeddings.
- Score: 12.596033546002321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in prompt-tuning cast few-shot classification tasks as a masked language modeling problem. By wrapping the input in a template and using a verbalizer that maps the label space to a label word space, prompt-tuning can achieve excellent results in zero-shot and few-shot scenarios. However, typical prompt-tuning needs a manually designed verbalizer, which requires domain expertise and human effort, and an insufficient label word space may introduce considerable bias into the results. In this paper, we focus on eliciting knowledge from pretrained language models and propose a prototypical prompt verbalizer for prompt-tuning. Labels are represented by prototypical embeddings in the feature space rather than by discrete words. The distances between the embedding at the masked position of the input and the prototypical embeddings are used as the classification criterion. For zero-shot settings, knowledge is elicited from pretrained language models by a manually designed template to form initial prototypical embeddings. For few-shot settings, models are tuned to learn meaningful and interpretable prototypical embeddings. Our method optimizes models by contrastive learning. Extensive experimental results on several many-class text classification datasets with low-resource settings demonstrate the effectiveness of our approach compared with other verbalizer construction methods. Our implementation is available at https://github.com/Ydongd/prototypical-prompt-verbalizer.
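To make the classification rule concrete, the following is a minimal sketch, not the authors' implementation (that is in the linked repository): the embedding at the [MASK] position is compared to one prototypical embedding per label by cosine similarity, a contrastive (InfoNCE-style) loss pulls it toward its class prototype during few-shot tuning, and one plausible zero-shot initialization elicits prototypes from the PLM with a manually designed template. The backbone, template, temperature, and helper names are assumptions.

```python
# A minimal sketch of the prototypical prompt verbalizer idea, NOT the authors'
# implementation (see https://github.com/Ydongd/prototypical-prompt-verbalizer).
# Backbone, template, temperature, and label descriptions are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"                      # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

def mask_embedding(text: str) -> torch.Tensor:
    """Wrap the input in a template and return the hidden state at [MASK]."""
    prompt = f"{text} It was about [MASK]."           # illustrative template
    enc = tokenizer(prompt, return_tensors="pt")
    out = model(**enc, output_hidden_states=True)
    mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    return out.hidden_states[-1][0, mask_pos]         # (hidden_size,)

num_classes = 4
# Few-shot settings: one learnable prototypical embedding per label.
prototypes = torch.nn.Parameter(torch.randn(num_classes, model.config.hidden_size))

def init_prototypes_zero_shot(label_descriptions: list[str]) -> torch.Tensor:
    """One plausible zero-shot initialization: run short label descriptions
    through the template and use the resulting [MASK] states as the initial
    prototypical embeddings."""
    with torch.no_grad():
        return torch.stack([mask_embedding(d) for d in label_descriptions])

def class_scores(text: str, temperature: float = 0.1) -> torch.Tensor:
    """Cosine similarity between the [MASK] embedding and every prototype;
    the nearest prototype gives the predicted label."""
    h = F.normalize(mask_embedding(text), dim=-1)
    p = F.normalize(prototypes, dim=-1)
    return (p @ h) / temperature                      # (num_classes,)

def contrastive_loss(text: str, label: int) -> torch.Tensor:
    """InfoNCE-style objective: pull the [MASK] embedding toward its class
    prototype and push it away from the other prototypes."""
    return F.cross_entropy(class_scores(text).unsqueeze(0), torch.tensor([label]))
```

A prediction is then `class_scores(text).argmax()`; during few-shot tuning both the backbone and the prototypes receive gradients from `contrastive_loss`.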
Related papers
- Manual Verbalizer Enrichment for Few-Shot Text Classification [1.860409237919611]
MAVE is an approach for verbalizer construction by enrichment of class labels.
Our model achieves state-of-the-art results while using significantly fewer resources.
arXiv Detail & Related papers (2024-10-08T16:16:47Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification [65.51149771074944]
MetricPrompt eases verbalizer design difficulty by reformulating the few-shot text classification task as a text pair relevance estimation task.
We conduct experiments on three widely used text classification datasets across four few-shot settings.
Results show that MetricPrompt outperforms manual verbalizers and other automatic verbalizer design methods across all few-shot settings.
arXiv Detail & Related papers (2023-06-15T06:51:35Z)
- CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification [57.62886091828512]
We propose a brand-new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix), for many-class classification.
Basically, an instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, is leveraged to complement the language verbalizers in many-class classification.
arXiv Detail & Related papers (2022-11-11T03:45:59Z)
- Don't Prompt, Search! Mining-based Zero-Shot Learning with Language Models [37.8952605358518]
Masked language models like BERT can perform text classification in a zero-shot fashion.
We propose an alternative mining-based approach for zero-shot learning.
arXiv Detail & Related papers (2022-10-26T15:52:30Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Prototypical Verbalizer for Prompt-based Few-shot Tuning [32.74024339482436]
We propose a verbalizer (ProtoVerb) that is built directly from training data.
ProtoVerb learns prototype vectors as prototypical verbalizers by contrastive learning.
We conduct experiments on both topic classification and entity typing tasks, and the results demonstrate that ProtoVerb significantly outperforms current automatic verbalizers.
arXiv Detail & Related papers (2022-03-18T07:07:56Z)
- Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification [68.3291372168167]
We focus on incorporating external knowledge into the verbalizer, forming knowledgeable prompt-tuning (KPT).
We expand the label word space of the verbalizer using external knowledge bases (KBs) and refine the expanded label word space with the PLM itself before predicting with it.
Experiments on zero- and few-shot text classification tasks demonstrate the effectiveness of knowledgeable prompt-tuning. (A minimal label-word aggregation sketch follows this list.)
arXiv Detail & Related papers (2021-08-04T13:00:16Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
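For the knowledgeable prompt-tuning entry above, the sketch below illustrates, under assumptions, how an expanded label word space can be used at prediction time: the masked-LM logits of several label words per class are averaged and the highest-scoring class wins. The label words, template, and backbone are illustrative and not taken from the cited paper; KPT's KB-based expansion and refinement steps are not reproduced here.

```python
# Illustrative multi-word verbalizer aggregation (an assumption-laden sketch,
# not KPT's actual pipeline): average masked-LM logits over several label
# words per class and pick the best-scoring class.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"                      # assumed backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical expanded label word space (KPT would derive this from KBs).
label_words = {
    "sports":   ["sports", "football", "basketball", "tennis"],
    "politics": ["politics", "government", "election", "policy"],
}

def classify(text: str) -> str:
    prompt = f"{text} This topic is about [MASK]."    # illustrative template
    enc = tokenizer(prompt, return_tensors="pt")
    mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**enc).logits[0, mask_pos]     # (vocab_size,)
    scores = {
        label: logits[[tokenizer.convert_tokens_to_ids(w) for w in words]].mean().item()
        for label, words in label_words.items()
    }
    return max(scores, key=scores.get)

print(classify("The team clinched the championship in overtime."))
```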
This list is automatically generated from the titles and abstracts of the papers in this site.