Semantic-Oriented Unlabeled Priming for Large-Scale Language Models
- URL: http://arxiv.org/abs/2202.06133v1
- Date: Sat, 12 Feb 2022 19:50:59 GMT
- Title: Semantic-Oriented Unlabeled Priming for Large-Scale Language Models
- Authors: Yanchen Liu, Timo Schick, Hinrich Schütze
- Abstract summary: We introduce Semantic-Oriented Unlabeled Priming (SOUP), a method that classifies examples by retrieving semantically similar unlabeled examples.
We also propose bag-of-contexts priming, a new priming strategy that is more suitable for our setting and enables the usage of more examples than fit into the context window.
- Score: 12.074766935042588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the high costs associated with finetuning large language models,
various recent works propose to adapt them to specific tasks without any
parameter updates through in-context learning. Unfortunately, for in-context
learning there is currently no way to leverage unlabeled data, which is often
much easier to obtain in large quantities than labeled examples. In this work,
we therefore investigate ways to make use of unlabeled examples to improve the
zero-shot performance of pretrained language models without any finetuning: We
introduce Semantic-Oriented Unlabeled Priming (SOUP), a method that classifies
examples by retrieving semantically similar unlabeled examples, assigning
labels to them in a zero-shot fashion, and then using them for in-context
learning. We also propose bag-of-contexts priming, a new priming strategy that
is more suitable for our setting and enables the usage of more examples than
fit into the context window.
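The pipeline described in the abstract (retrieve similar unlabeled examples, label them zero-shot, then use them for in-context learning) can be illustrated with a minimal sketch. Everything here is a toy assumption rather than the authors' implementation: `embed` is a bag-of-words stand-in for a real sentence encoder, `toy_lm` is a hypothetical keyword counter standing in for a language model, and the prompt template is invented. The abstract does not say how bag-of-contexts priming combines examples; this sketch assumes one prompt per demonstration with the resulting label distributions averaged, which is one way to use more examples than fit into a single context window.

```python
import math
from collections import Counter

LABELS = ("positive", "negative")
POSITIVE_WORDS = {"great", "good", "love"}
NEGATIVE_WORDS = {"bad", "awful", "boring"}


def embed(text):
    """Toy bag-of-words embedding (assumption: a real system would use a sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, pool, k):
    """Return the k unlabeled examples most similar to the query."""
    q = embed(query)
    return sorted(pool, key=lambda x: cosine(q, embed(x)), reverse=True)[:k]


def format_example(text, label=""):
    """Hypothetical prompt template; the label slot is left empty for the example to classify."""
    return f"Review: {text}\nSentiment: {label}".rstrip()


def toy_lm(prompt):
    """Hypothetical stand-in for a language model: a label distribution from keyword counts."""
    tokens = prompt.lower().replace(":", " ").split()
    pos = 1 + sum(t in POSITIVE_WORDS or t == "positive" for t in tokens)
    neg = 1 + sum(t in NEGATIVE_WORDS or t == "negative" for t in tokens)
    return {"positive": pos / (pos + neg), "negative": neg / (pos + neg)}


def soup_classify(query, pool, k=2):
    # Step 1: retrieve semantically similar unlabeled examples.
    neighbours = retrieve(query, pool, k)
    # Step 2: assign labels to them in a zero-shot fashion.
    demos = []
    for text in neighbours:
        dist = toy_lm(format_example(text))
        demos.append((text, max(dist, key=dist.get)))
    # Step 3 (assumed bag-of-contexts reading): one prompt per demonstration,
    # then average the predicted label distributions.
    dists = [toy_lm(format_example(t, y) + "\n\n" + format_example(query)) for t, y in demos]
    avg = {label: sum(d[label] for d in dists) / len(dists) for label in LABELS}
    return max(avg, key=avg.get), avg


pool = ["a great film i love", "awful and boring plot", "the film was good", "bad acting"]
label, avg = soup_classify("great film", pool, k=2)
print(label, avg)
```

Because each demonstration gets its own prompt in this sketch, the number of demonstrations is limited by compute rather than by the context window, at the cost of one model call per demonstration.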
Related papers
- Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
Date: 2024-10-22T17:45:47Z
- In-Context Learning for Text Classification with Many Labels [34.87532045406169]
In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window.
We use a pre-trained dense retrieval model to bypass this limitation.
We analyze the performance across number of in-context examples and different model scales.
Date: 2023-09-19T22:41:44Z
- EXnet: Efficient In-context Learning for Data-less Text classification [0.0]
We present EXnet, a model specifically designed to perform in-context learning without limitations on the number of examples.
We argue that in-context learning is an effective method to increase task accuracy, and providing examples facilitates cross-task generalization.
With extensive experiments, we show that even our smallest model (15M parameters) generalizes to several unseen classification tasks and domains.
Date: 2023-05-24T01:40:57Z
- CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification [57.62886091828512]
We propose a brand-new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix) for many-class classification.
Basically, an instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, is leveraged to complement the language verbalizers in many-class classification.
Date: 2022-11-11T03:45:59Z
- TabLLM: Few-shot Classification of Tabular Data with Large Language Models [66.03023402174138]
We study the application of large language models to zero-shot and few-shot classification of tabular data.
We evaluate several serialization methods including templates, table-to-text models, and large language models.
This approach is also competitive with strong traditional baselines like gradient-boosted trees.
Date: 2022-10-19T17:08:13Z
- Selective Annotation Makes Language Models Better Few-Shot Learners [97.07544941620367]
Large language models can perform in-context learning, where they learn a new task from a few task demonstrations.
This work examines the implications of in-context learning for the creation of datasets for new natural language tasks.
We propose an unsupervised, graph-based selective annotation method, vote-k, to select diverse, representative examples to annotate.
Date: 2022-09-05T14:01:15Z
- Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data [69.30452751012568]
We develop a learnable feature generator to diversify exemplars by adaptively generating diverse counterparts of exemplars.
We introduce semantic contrastive learning to enforce that the generated samples are semantically consistent with the exemplars.
Our method does not bring any extra inference cost and outperforms state-of-the-art methods on two benchmarks.
Date: 2022-04-19T15:15:18Z
- On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
Date: 2021-10-15T21:41:16Z
- Reordering Examples Helps during Priming-based Few-Shot Learning [6.579039107070663]
We show that PERO can learn to generalize efficiently using as few as 10 examples.
We demonstrate the effectiveness of the proposed method on the tasks of sentiment classification, natural language inference and fact retrieval.
Date: 2021-06-03T11:02:36Z
- Using Sentences as Semantic Representations in Large Scale Zero-Shot Learning [6.0158981171030685]
Zero-shot learning aims to recognize instances of unseen classes, for which no visual instance is available during training.
A good trade-off could be to employ short sentences in natural language as class descriptions.
We show that while simple methods cannot achieve very good results with sentences alone, a combination of the usual word embeddings and sentences can significantly outperform the current state of the art.
Date: 2020-10-06T18:22:21Z
- Contextualizing Enhances Gradient Based Meta Learning [7.009032627535598]
We show how to equip meta-learning methods with contextualizers and show that their use can significantly boost performance on a range of few-shot learning datasets.
Our approach is particularly apt for low-data environments where it is difficult to update parameters without overfitting.
Date: 2020-07-17T04:01:56Z
This list is automatically generated from the titles and abstracts of the papers on this site.