Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with
Language Models
- URL: http://arxiv.org/abs/2106.13353v1
- Date: Thu, 24 Jun 2021 23:38:10 GMT
- Title: Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with
Language Models
- Authors: Robert L. Logan IV, Ivana Balažević, Eric Wallace, Fabio
Petroni, Sameer Singh, Sebastian Riedel
- Abstract summary: Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning.
We show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering.
- Score: 48.0311578882384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompting language models (LMs) with training examples and task descriptions
has been seen as critical to recent successes in few-shot learning. In this
work, we show that finetuning LMs in the few-shot setting can considerably
reduce the need for prompt engineering. In fact, one can use null prompts,
prompts that contain neither task-specific templates nor training examples, and
achieve competitive accuracy to manually-tuned prompts across a wide range of
tasks. While finetuning LMs does introduce new parameters for each downstream
task, we show that this memory overhead can be substantially reduced:
finetuning only the bias terms can achieve comparable or better accuracy than
standard finetuning while only updating 0.1% of the parameters. All in all, we
recommend finetuning LMs for few-shot learning as it is more accurate, robust
to different prompts, and can be made nearly as efficient as using frozen LMs.
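The bias-only finetuning the abstract describes (updating roughly 0.1% of the parameters) can be sketched as follows. This is an illustrative PyTorch snippet, not the authors' code: the small `nn.Sequential` stands in for a pretrained LM, and the 0.08% trainable fraction it yields is a property of this toy model, not a reported result.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained LM; in practice this would be a
# transformer loaded from a checkpoint.
model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
)

# Freeze everything except the bias terms.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.4%}")

# The optimizer then only ever updates the bias terms.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

Pairing this with a null prompt means the few-shot examples are fed to the model with no task-specific template at all; only the bias updates adapt it to the task.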
Related papers
- Steering Large Language Models for Machine Translation with Finetuning
and In-Context Learning [19.290966101497844]
Large language models (LLMs) are a promising avenue for machine translation (MT).
Their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration.
We show that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50.
arXiv Detail & Related papers (2023-10-20T12:29:51Z)
- PALT: Parameter-Lite Transfer of Language Models for Knowledge Graph
Completion [108.8941541255567]
This paper presents a parameter-lite transfer learning approach of pretrained language models (LM) for knowledge graph (KG) completion.
Instead of finetuning, which modifies all LM parameters, we only tune a few new parameters while keeping the original LM parameters fixed.
We show that, by tuning far fewer parameters than finetuning, LMs transfer non-trivially to most tasks and reach competitiveness with prior state-of-the-art approaches.
arXiv Detail & Related papers (2022-10-25T02:22:29Z)
- Continued Pretraining for Better Zero- and Few-Shot Promptability [44.381944544918014]
We show that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings.
On the other hand, continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance.
arXiv Detail & Related papers (2022-10-19T02:41:51Z)
- STT: Soft Template Tuning for Few-Shot Adaptation [72.46535261444151]
We propose a new prompt-tuning framework, called Soft Template Tuning (STT).
STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task.
It can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
arXiv Detail & Related papers (2022-07-18T07:07:22Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt-learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than
In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language
Models [67.3725459417758]
PERFECT is a simple and efficient method for few-shot fine-tuning of PLMs without relying on handcrafted prompts and verbalizers.
We show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning.
Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods.
arXiv Detail & Related papers (2022-04-03T22:31:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.