Template-free Prompt Tuning for Few-shot NER
- URL: http://arxiv.org/abs/2109.13532v1
- Date: Tue, 28 Sep 2021 07:19:24 GMT
- Title: Template-free Prompt Tuning for Few-shot NER
- Authors: Ruotian Ma, Xin Zhou, Tao Gui, Yiding Tan, Qi Zhang, Xuanjing Huang
- Abstract summary: We propose a more elegant method to reformulate NER tasks as LM problems without any templates.
Specifically, we discard the template construction process while maintaining the word prediction paradigm of pre-training models.
Experimental results demonstrate the effectiveness of the proposed method over bert-tagger and the template-based method under the few-shot setting.
- Score: 46.59447116255979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt-based methods have been successfully applied in sentence-level
few-shot learning tasks, mostly owing to the sophisticated design of templates
and label words. However, when applied to token-level labeling tasks such as
NER, it would be time-consuming to enumerate the template queries over all
potential entity spans. In this work, we propose a more elegant method to
reformulate NER tasks as LM problems without any templates. Specifically, we
discard the template construction process while maintaining the word prediction
paradigm of pre-training models to predict a class-related pivot word (or label
word) at the entity position. Meanwhile, we also explore principled ways to
automatically search for appropriate label words that the pre-trained models
can easily adapt to. While avoiding the complicated template-based process, the
proposed LM objective also reduces the gap between the objectives used in
pre-training and fine-tuning, and thus better benefits few-shot performance.
Experimental results demonstrate the effectiveness of the proposed method over
bert-tagger and the template-based method under the few-shot setting. Moreover,
the proposed method decodes up to 1930.12 times faster than the template-based
method.
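To make the reformulation concrete, below is a minimal decoding sketch of the idea described in the abstract: feed the sentence to a masked LM and, at each token position, compare the probability of the original word against the probabilities of a few class-related label words. The checkpoint name and the label-word map are placeholders (the paper searches for label words automatically), so this illustrates the word-prediction paradigm rather than the authors' implementation.

```python
# Illustrative sketch only: label-word decoding for template-free prompt NER.
# Assumes a masked LM fine-tuned so that entity positions predict a
# class-related label word; the checkpoint and label words are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-cased"  # placeholder; a fine-tuned checkpoint would go here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

# Hypothetical label words, e.g. produced by the automatic label-word search.
label_words = {"PER": "John", "LOC": "London", "ORG": "Google"}
label_ids = {tag: tokenizer.convert_tokens_to_ids(w) for tag, w in label_words.items()}

sentence = "Obama visited Paris"
enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    probs = model(**enc).logits[0].softmax(-1)  # (seq_len, vocab_size)

ids = enc["input_ids"][0].tolist()
tokens = tokenizer.convert_ids_to_tokens(ids)
for pos, (tok, tok_id) in enumerate(zip(tokens, ids)):
    if tok in tokenizer.all_special_tokens:
        continue
    # Keeping the original word corresponds to the "O" tag; a label word wins
    # only if the model prefers it at this position.
    scores = {"O": probs[pos, tok_id].item()}
    scores.update({tag: probs[pos, i].item() for tag, i in label_ids.items()})
    print(tok, max(scores, key=scores.get))
```

In the paper's setting, the masked LM is first fine-tuned on the few-shot data so that entity positions are trained to emit their class's label word; an untuned model run through the loop above will mostly just echo the input token.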
Related papers
- Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements [10.687101698324897]
Large language models demonstrate a remarkable capability for learning to solve new tasks from a few examples.
The prompt template, or the way the input examples are formatted to obtain the prompt, is an important yet often overlooked aspect of in-context learning.
We show that a poor choice of the template can reduce the performance of the strongest models and inference methods to a random guess level.
arXiv Detail & Related papers (2024-01-12T18:58:26Z)
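As a rough illustration of what "template" means in the entry above, the hypothetical snippet below renders the same two demonstrations and a query under two different formats; the paper's observation is that surface choices like these alone can swing accuracy from strong to near-random.

```python
# Illustrative only: two ways to format the same in-context demonstrations.
# Separators, field names, and label wording are all "template" choices.
demos = [("the movie was great", "positive"), ("a total waste of time", "negative")]
query = "an unforgettable performance"

def template_a(demos, query):
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    return "\n\n".join(lines) + f"\n\nReview: {query}\nSentiment:"

def template_b(demos, query):
    lines = [f"{x} => {y}" for x, y in demos]
    return "\n".join(lines) + f"\n{query} =>"

print(template_a(demos, query))
print("---")
print(template_b(demos, query))
```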
- A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation [67.98367574025797]
Existing syntactically-controlled paraphrase generation models perform promisingly with human-annotated or well-chosen syntactic templates.
The prohibitive cost makes it infeasible to manually design decent templates for every source sentence.
We propose a novel Quality-based Syntactic Template Retriever (QSTR) to retrieve templates based on the quality of the to-be-generated paraphrases.
arXiv Detail & Related papers (2023-10-20T03:55:39Z)
- ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose Prompt Tuning for Hierarchical Consistency (ProTeCt), a technique that calibrates the hierarchical consistency of model predictions across label set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z)
- STPrompt: Semantic-guided and Task-driven prompts for Effective Few-shot Classification [5.6205035780719275]
We propose the STPrompt (Semantic-guided and Task-driven Prompt) model.
The proposed model achieves state-of-the-art performance on five different few-shot text classification datasets.
arXiv Detail & Related papers (2022-10-29T04:42:30Z)
- Don't Prompt, Search! Mining-based Zero-Shot Learning with Language Models [37.8952605358518]
Masked language models like BERT can perform text classification in a zero-shot fashion.
We propose an alternative mining-based approach for zero-shot learning.
arXiv Detail & Related papers (2022-10-26T15:52:30Z)
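For context on the entry above: the cloze-prompt baseline it contrasts with (not the paper's mining-based method) can be sketched as follows; the prompt wording and verbalizer words here are made up for illustration.

```python
# Zero-shot text classification with a masked LM via a cloze prompt.
# This sketches the prompting baseline, not the mining-based approach;
# the prompt pattern and verbalizer are hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

text = "the plot was dull and the acting was worse"
prompt = f"{text}. It was {tokenizer.mask_token}."
verbalizer = {"positive": "great", "negative": "terrible"}

enc = tokenizer(prompt, return_tensors="pt")
mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
with torch.no_grad():
    probs = model(**enc).logits[0, mask_pos].softmax(-1)

scores = {label: probs[tokenizer.convert_tokens_to_ids(word)].item()
          for label, word in verbalizer.items()}
print(max(scores, key=scores.get), scores)
```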
- An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks [112.1942546460814]
We report the first exploration of the prompt tuning paradigm for speech processing tasks based on the Generative Spoken Language Model (GSLM).
Experimental results show that the prompt tuning technique achieves competitive performance in speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models.
arXiv Detail & Related papers (2022-03-31T03:26:55Z)
- An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels [55.06990011183662]
We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model.
Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task.
arXiv Detail & Related papers (2022-03-21T21:51:43Z)
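To make the selection criterion in the entry above concrete, here is a small numerical sketch using the common estimate I(X;Y) = H(Y) - H(Y|X) over a template's predicted label distributions; the estimator form and the toy numbers are assumptions for illustration, not taken from the paper.

```python
# Rank prompt templates by an estimated mutual information between inputs
# and model predictions: I(X;Y) = H(Y) - H(Y|X). The label distributions
# below are made up; in practice they would come from running the model on
# unlabeled examples formatted with each candidate template.
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def template_mi(label_probs):
    """label_probs: (n_examples, n_labels) array of per-example output distributions."""
    h_y = entropy(label_probs.mean(axis=0))                              # H(Y) of the marginal
    h_y_given_x = float(np.mean([entropy(row) for row in label_probs]))  # H(Y|X)
    return h_y - h_y_given_x

confident = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]])  # decisive, varied -> high MI
uncertain = np.array([[0.5, 0.5], [0.6, 0.4], [0.5, 0.5]])  # indecisive -> low MI
print(template_mi(confident), template_mi(uncertain))
```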
- Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer [12.596033546002321]
In this paper, we focus on eliciting knowledge from pretrained language models and propose a prototypical prompt verbalizer for prompt-tuning.
For zero-shot settings, knowledge is elicited from pretrained language models by a manually designed template to form initial prototypical embeddings.
For few-shot settings, models are tuned to learn meaningful and interpretable prototypical embeddings.
arXiv Detail & Related papers (2022-01-14T12:04:37Z)
- NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction [14.912579358678212]
Using prompts to perform various downstream tasks, also known as prompt-based learning or prompt-learning, has lately gained significant success in comparison to the pre-train and fine-tune paradigm.
In this paper, we attempt to accomplish several NLP tasks in a zero-shot scenario using an original BERT pre-training task abandoned by RoBERTa and other models: Next Sentence Prediction (NSP).
Unlike token-level techniques, our sentence-level prompt-based method NSP-BERT does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking.
arXiv Detail & Related papers (2021-09-08T11:57:08Z)
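In the spirit of the entry above, a minimal sketch of sentence-level zero-shot classification by scoring candidate label sentences with BERT's next-sentence-prediction head; the task and prompt sentences are invented, and the details of NSP-BERT's actual formulation differ.

```python
# Score hypothetical "next sentences" with BERT's NSP head and pick the
# label whose description fits the input best. Prompts here are made up.
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased").eval()

text = "The new phone's battery lasts two full days."
candidates = {"tech": "This sentence is about technology.",
              "sports": "This sentence is about sports."}

scores = {}
for label, hypothesis in candidates.items():
    enc = tokenizer(text, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits  # shape (1, 2); index 0 = "B follows A"
    scores[label] = logits.softmax(-1)[0, 0].item()
print(max(scores, key=scores.get), scores)
```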
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
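To illustrate the full-text plausibility ranking described in the last entry, the sketch below scores each candidate ending by a masked-LM pseudo-log-likelihood of the complete sentence and keeps the best one; this is a generic stand-in for the paper's exact scoring function, and the premise and candidates are invented.

```python
# Rank candidate endings by the pseudo-log-likelihood a masked LM assigns
# to the full text (mask one token at a time and sum the log-probabilities
# of the true tokens). A generic sketch, not the paper's exact method.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_log_likelihood(text):
    ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for pos in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits[0, pos]
        total += logits.log_softmax(-1)[ids[pos]].item()
    return total

premise = "He forgot his umbrella, so"
candidates = ["he got wet in the rain.", "he won the chess tournament."]
scores = {c: pseudo_log_likelihood(f"{premise} {c}") for c in candidates}
print(max(scores, key=scores.get))
```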
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.