Don't Prompt, Search! Mining-based Zero-Shot Learning with Language
Models
- URL: http://arxiv.org/abs/2210.14803v1
- Date: Wed, 26 Oct 2022 15:52:30 GMT
- Title: Don't Prompt, Search! Mining-based Zero-Shot Learning with Language
Models
- Authors: Mozes van de Kar, Mengzhou Xia, Danqi Chen, Mikel Artetxe
- Abstract summary: Masked language models like BERT can perform text classification in a zero-shot fashion.
We propose an alternative mining-based approach for zero-shot learning.
- Score: 37.8952605358518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Masked language models like BERT can perform text classification in a
zero-shot fashion by reformulating downstream tasks as text infilling. However,
this approach is highly sensitive to the template used to prompt the model, yet
practitioners are blind when designing them in strict zero-shot settings. In
this paper, we propose an alternative mining-based approach for zero-shot
learning. Instead of prompting language models, we use regular expressions to
mine labeled examples from unlabeled corpora, which can optionally be filtered
through prompting, and used to finetune a pretrained model. Our method is more
flexible and interpretable than prompting, and outperforms it on a wide range
of tasks when using comparable templates. Our results suggest that the success
of prompting can partly be explained by the model being exposed to similar
examples during pretraining, which can be directly retrieved through regular
expressions.
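As a rough sketch of the mining step (not the authors' released code; the corpus, label words, and regular expression below are invented for a toy sentiment task), pseudo-labeled examples can be extracted like this:

    import re

    # Toy unlabeled corpus; in practice this would be large-scale web text.
    corpus = [
        "The movie was great. I would happily watch it again.",
        "The service was terrible. We left after ten minutes.",
        "It was fine, nothing special.",
    ]

    # Map the label words matched by the pattern to class labels.
    label_words = {"great": "positive", "terrible": "negative"}

    # One possible regex template: "<phrase> was <label word>."
    pattern = re.compile(r"(?P<x>[A-Z][^.!?]*?) was (?P<y>great|terrible)\.")

    mined = []
    for doc in corpus:
        for match in pattern.finditer(doc):
            # Keep the matched phrase as a pseudo-labeled training example.
            mined.append((match.group("x"), label_words[match.group("y")]))

    print(mined)  # [('The movie', 'positive'), ('The service', 'negative')]

The mined pairs would then, optionally after prompt-based filtering, be used to fine-tune a pretrained classifier; the actual patterns and what is kept as the example vary by task in the paper.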
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Understanding prompt engineering may not require rethinking generalization [56.38207873589642]
We show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are remarkably tight by the standards of the literature.
This work provides a possible justification for the widespread practice of prompt engineering.
arXiv Detail & Related papers (2023-10-06T00:52:48Z)
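For orientation, one standard McAllester-style PAC-Bayes bound (not necessarily the exact statement in that paper; the logarithmic term varies across versions) reads as follows, for a prior P fixed before seeing the m training examples and any posterior Q over prompts, with probability at least 1 - delta:

    \mathbb{E}_{h \sim Q}\big[R(h)\big] \le \mathbb{E}_{h \sim Q}\big[\widehat{R}(h)\big] + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln(m/\delta)}{2(m-1)}}

If Q is a point mass on a single discrete prompt and P is the language model's distribution over prompt strings, KL(Q || P) reduces to -log P(prompt), so a prompt the language model finds likely gives a tight bound.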
- Controllable Speaking Styles Using a Large Language Model [13.642358232817342]
Text-to-Speech (TTS) models can generate multiple, prosodically-different renditions of the same target text.
Currently, controlling these models during inference typically requires finding an appropriate reference utterance.
Here, we give two demonstrations: control of speaking style; prosody appropriate for a given dialogue context.
arXiv Detail & Related papers (2023-05-17T16:01:50Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- What Makes Pre-trained Language Models Better Zero-shot Learners? [12.164678440185007]
Current methods for prompt learning in zero-shot scenarios rely on a development set with sufficient human-annotated data.
We propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection).
Experiments show that our method leads to improved prediction performance in a realistic zero-shot setting, eliminating the need for any labelled examples.
arXiv Detail & Related papers (2022-09-30T03:28:19Z)
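A minimal sketch of such perplexity-based template screening (assuming the Hugging Face transformers library; the probe sentence, templates, and gpt2 scoring model are illustrative, not the paper's setup):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    probe = "the pizza was cold and the staff ignored us"
    templates = [
        "{} All in all, it was terrible.",   # fluent template
        "{} terrible it, all all in was.",   # deliberately scrambled template
    ]

    def perplexity(text: str) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss  # mean token cross-entropy
        return torch.exp(loss).item()

    scores = {t: perplexity(t.format(probe)) for t in templates}
    print(min(scores, key=scores.get))  # keep the template the LM finds most natural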
- Few-shot Prompting Towards Controllable Response Generation [49.479958672988566]
We first explored the combination of prompting and reinforcement learning (RL) to steer models' generation without accessing any of the models' parameters.
We apply multi-task learning to make the model learn to generalize to new tasks better.
Experiment results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters.
arXiv Detail & Related papers (2022-06-08T14:48:06Z)
- Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
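A minimal sketch of a prompted model acting as one labeling function (the NLI checkpoint, candidate labels, and threshold are illustrative assumptions; the paper's prompts and label aggregation may differ):

    from transformers import pipeline

    ABSTAIN = -1
    classes = ["positive", "negative"]
    clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

    def lf_prompted_lm(text: str, threshold: float = 0.8) -> int:
        """Return a class index, or ABSTAIN if the model is not confident."""
        out = clf(text, candidate_labels=classes)
        top_label, top_score = out["labels"][0], out["scores"][0]
        return classes.index(top_label) if top_score >= threshold else ABSTAIN

    print(lf_prompted_lm("Best pasta I've had in years."))

Votes from this and other labeling functions (e.g. keyword or regex rules) would then be combined by a label model into probabilistic training labels.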
- Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer [12.596033546002321]
In this paper, we focus on eliciting knowledge from pretrained language models and propose a prototypical prompt verbalizer for prompt-tuning.
For zero-shot settings, knowledge is elicited from pretrained language models by a manually designed template to form initial prototypical embeddings.
For few-shot settings, models are tuned to learn meaningful and interpretable prototypical embeddings.
arXiv Detail & Related papers (2022-01-14T12:04:37Z)
- Template-free Prompt Tuning for Few-shot NER [46.59447116255979]
We propose a more elegant method to reformulate NER tasks as LM problems without any templates.
Specifically, we discard the template construction process while maintaining the word prediction paradigm of pre-training models.
Experimental results demonstrate the effectiveness of the proposed method over bert-tagger and the template-based method in the few-shot setting.
arXiv Detail & Related papers (2021-09-28T07:19:24Z)
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing [78.8500633981247]
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning".
Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly.
arXiv Detail & Related papers (2021-07-28T18:09:46Z)
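As a minimal sketch of the cloze-style prompting this survey describes (the checkpoint, template, and verbalizer words below are illustrative assumptions, not taken from the survey):

    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    x = "Best pizza in town, we will definitely come back."
    prompt = f"{x} All in all, it was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]

    # Verbalizer: label words mapped to classes; compare their scores at the mask.
    verbalizer = {"great": "positive", "terrible": "negative"}
    scores = {label: logits[tokenizer.convert_tokens_to_ids(word)].item()
              for word, label in verbalizer.items()}
    print(max(scores, key=scores.get))  # predicted class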