Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models
- URL: http://arxiv.org/abs/2203.03131v1
- Date: Mon, 7 Mar 2022 05:04:32 GMT
- Title: Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models
- Authors: Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng and Jian-Guang Lou
- Abstract summary: We argue that one of the factors hindering the development of prompt-tuning on NLG tasks is unfamiliar inputs.
This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations.
Our proposed input-tuning is conceptually simple and empirically powerful.
- Score: 82.75572875007755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently the prompt-tuning paradigm has attracted significant attention. By
only tuning continuous prompts with a frozen pre-trained language model (PLM),
prompt-tuning takes a step towards deploying a shared frozen PLM to serve
numerous downstream tasks. Although prompt-tuning shows good performance on
certain natural language understanding (NLU) tasks, its effectiveness on
natural language generation (NLG) tasks is still under-explored. In this paper,
we argue that one of the factors hindering the development of prompt-tuning on
NLG tasks is unfamiliar inputs (i.e., inputs that are linguistically different
from the pretraining corpus). For example, our preliminary exploration reveals
a large performance gap between prompt-tuning and fine-tuning when unfamiliar
inputs occur frequently in NLG tasks. This motivates us to propose
input-tuning, which fine-tunes both the continuous prompts and the input
representations, leading to a more effective way to adapt unfamiliar inputs to
frozen PLMs. Our proposed input-tuning is conceptually simple and empirically
powerful. Experimental results on seven NLG tasks demonstrate that input-tuning
is significantly and consistently better than prompt-tuning. Furthermore, on
three of these tasks, input-tuning can achieve a comparable or even better
performance than fine-tuning.
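
To make the idea concrete, here is a minimal sketch of which parameters are trained and which stay frozen. The abstract only says that input-tuning tunes both the continuous prompts and the input representations; it does not say how the input representations are tuned. The residual linear adapter below (`adapter_w`), the helper `build_input`, and all sizes are therefore hypothetical choices for illustration, not the authors' implementation.

```python
import random

def build_input(token_ids, embedding, prompt, adapter_w):
    """Build the vector sequence that would be fed to a frozen model.

    embedding : frozen token-embedding table (never updated)
    prompt    : trainable continuous prompt vectors, prepended to the input
    adapter_w : trainable square matrix applied to each input embedding
                as a residual update (an assumed form of "tuning the
                input representations")
    """
    adapted = []
    for tid in token_ids:
        e = embedding[tid]  # frozen lookup
        # Residual adapter: e + W e. With W = 0 the input is unchanged.
        delta = [sum(w * x for w, x in zip(row, e)) for row in adapter_w]
        adapted.append([x + d for x, d in zip(e, delta)])
    return [list(p) for p in prompt] + adapted

random.seed(0)
dim, vocab, n_prompt = 4, 10, 2  # toy sizes, chosen arbitrarily

# Frozen part: stands in for the pretrained embedding table.
embedding = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab)]

# Trainable parts: the prompt vectors and the input adapter.
prompt = [[0.0] * dim for _ in range(n_prompt)]
adapter_w = [[0.0] * dim for _ in range(dim)]  # zero init: starts as identity

seq = build_input([3, 7, 1], embedding, prompt, adapter_w)

trainable = n_prompt * dim + dim * dim  # prompt + adapter parameters
frozen = vocab * dim                    # embedding parameters (model body omitted)
print(len(seq), trainable, frozen)
```

In this sketch, plain prompt-tuning corresponds to training only `prompt` and leaving `adapter_w` at zero; the input-tuning variant would train both while the embedding table and the rest of the model stay fixed.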
Related papers
- Generative Input: Towards Next-Generation Input Methods Paradigm [49.98958865125018]
We propose a novel Generative Input paradigm named GeneInput.
It uses prompts to handle all input scenarios and other intelligent auxiliary input functions, optimizing the model with user feedback to deliver personalized results.
The results demonstrate that we have achieved state-of-the-art performance for the first time in the Full-mode Key-sequence to Characters (FK2C) task.
arXiv Detail & Related papers (2023-11-02T12:01:29Z)
- On the Role of Attention in Prompt-tuning [90.97555030446563]
We study prompt-tuning for one-layer attention architectures and study contextual mixture-models.
We show that softmax-prompt-attention is provably more expressive than softmax-self-attention and linear-prompt-attention.
We also provide experiments that verify our theoretical insights on real datasets and demonstrate how prompt-tuning enables the model to attend to context-relevant information.
arXiv Detail & Related papers (2023-06-06T06:23:38Z)
- Dynamic Prompting: A Unified Framework for Prompt Tuning [33.175097465669374]
We present a unified dynamic prompt (DP) tuning strategy that dynamically determines different factors of prompts based on specific tasks and instances.
Experimental results underscore the significant performance improvement achieved by dynamic prompt tuning across a wide range of tasks.
We establish the universal applicability of our approach under full-data, few-shot, and multitask scenarios.
arXiv Detail & Related papers (2023-03-06T06:04:46Z)
- XPrompt: Exploring the Extreme of Prompt Tuning [31.242680485717447]
We propose a novel Prompt tuning model with an eXtremely small scale (XPrompt) under the regime of lottery tickets hypothesis.
XPrompt eliminates the negative prompt tokens at different levels through a hierarchical structured pruning, yielding a more parameter-efficient prompt yet with a competitive performance.
arXiv Detail & Related papers (2022-10-10T06:57:19Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- IDPG: An Instance-Dependent Prompt Generation Method [58.45110542003139]
Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.
We propose a conditional prompt generation method to generate prompts for each input instance.
arXiv Detail & Related papers (2022-04-09T15:45:27Z)
- Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)
- AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, adaptively retrieving external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
arXiv Detail & Related papers (2022-02-10T04:04:57Z)
- GPT Understands, Too [42.701765107498346]
We propose a novel method P-Tuning that employs trainable continuous prompt embeddings in concatenation with discrete prompts.
P-Tuning is generally effective for both frozen and tuned language models, under both the fully-supervised and few-shot settings.
arXiv Detail & Related papers (2021-03-18T17:13:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.