Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with
Language Models
- URL: http://arxiv.org/abs/2106.13353v1
- Date: Thu, 24 Jun 2021 23:38:10 GMT
- Title: Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with
Language Models
- Authors: Robert L. Logan IV, Ivana Balažević, Eric Wallace, Fabio
Petroni, Sameer Singh, Sebastian Riedel
- Abstract summary: Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning.
We show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering.
- Score: 48.0311578882384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompting language models (LMs) with training examples and task descriptions
has been seen as critical to recent successes in few-shot learning. In this
work, we show that finetuning LMs in the few-shot setting can considerably
reduce the need for prompt engineering. In fact, one can use null prompts,
prompts that contain neither task-specific templates nor training examples, and
achieve competitive accuracy to manually-tuned prompts across a wide range of
tasks. While finetuning LMs does introduce new parameters for each downstream
task, we show that this memory overhead can be substantially reduced:
finetuning only the bias terms can achieve comparable or better accuracy than
standard finetuning while only updating 0.1% of the parameters. All in all, we
recommend finetuning LMs for few-shot learning as it is more accurate, robust
to different prompts, and can be made nearly as efficient as using frozen LMs.
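The bias-only finetuning the abstract describes (updating roughly 0.1% of the parameters) can be sketched as follows. This is an illustrative PyTorch snippet, not the authors' code: the small `nn.Sequential` stands in for a pretrained LM, and the 0.08% trainable fraction it yields is a property of this toy model, not a reported result.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained LM; in practice this would be a
# transformer loaded from a checkpoint.
model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
)

# Freeze everything except the bias terms.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.4%}")

# The optimizer then only ever updates the bias terms.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

Pairing this with a null prompt means the few-shot examples are fed to the model with no task-specific template at all; only the bias updates adapt it to the task.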
Related papers
- Steering Large Language Models for Machine Translation with Finetuning
and In-Context Learning [19.290966101497844]
Large language models (LLMs) are a promising avenue for machine translation (MT).
Their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration.
We show that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50.
arXiv Detail & Related papers (2023-10-20T12:29:51Z)
- PALT: Parameter-Lite Transfer of Language Models for Knowledge Graph
Completion [108.8941541255567]
This paper presents a parameter-lite transfer learning approach of pretrained language models (LM) for knowledge graph (KG) completion.
Instead of finetuning, which modifies all LM parameters, we only tune a few new parameters while keeping the original LM parameters fixed.
We show that, by tuning far fewer parameters than finetuning, LMs transfer non-trivially to most tasks and reach competitiveness with prior state-of-the-art approaches.
arXiv Detail & Related papers (2022-10-25T02:22:29Z)
- Continued Pretraining for Better Zero- and Few-Shot Promptability [44.381944544918014]
We show that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings.
On the other hand, continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance.
arXiv Detail & Related papers (2022-10-19T02:41:51Z)
- STT: Soft Template Tuning for Few-Shot Adaptation [72.46535261444151]
We propose a new prompt-tuning framework, called Soft Template Tuning (STT).
STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task.
It can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
arXiv Detail & Related papers (2022-07-18T07:07:22Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt-learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than
In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language
Models [67.3725459417758]
PERFECT is a simple and efficient method for few-shot fine-tuning of PLMs without relying on handcrafted prompts and verbalizers.
We show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning.
Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods.
arXiv Detail & Related papers (2022-04-03T22:31:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.