Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning
- URL: http://arxiv.org/abs/2210.14469v1
- Date: Wed, 26 Oct 2022 04:39:42 GMT
- Title: Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning
- Authors: Yifan Chen, Devamanyu Hazarika, Mahdi Namazifar, Yang Liu, Di Jin,
Dilek Hakkani-Tur
- Abstract summary: We show that inducer-tuning can close the performance gap between prefix-tuning and fine-tuning.
We suggest a new variant of prefix-tuning -- inducer-tuning, which shares the exact mechanism with prefix-tuning while leveraging the residual form found in adapter-tuning.
- Score: 53.72897232951918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prefix-tuning, or more generally continuous prompt tuning, has become an
essential paradigm of parameter-efficient transfer learning. Using a large
pre-trained language model (PLM), prefix-tuning can obtain strong performance
by training only a small portion of parameters. In this paper, we propose to
understand and further develop prefix-tuning through the kernel lens.
Specifically, we make an analogy between \textit{prefixes} and \textit{inducing
variables} in kernel methods and hypothesize that \textit{prefixes} serving as
\textit{inducing variables} would improve their overall mechanism. From the
kernel estimator perspective, we suggest a new variant of prefix-tuning --
\textit{inducer-tuning}, which shares the exact mechanism with prefix-tuning
while leveraging the residual form found in adapter-tuning. This mitigates the
initialization issue in prefix-tuning. Through comprehensive empirical
experiments on natural language understanding and generation tasks, we
demonstrate that inducer-tuning can close the performance gap between
prefix-tuning and fine-tuning.
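Viewed through this kernel lens, softmax attention is a Nadaraya-Watson style estimator, and the extra key/value pairs that prefix-tuning prepends act like inducing variables. The snippet below is a minimal, hypothetical PyTorch sketch of that reading: the inducers are computed from the queries through a small zero-initialized bottleneck in residual (adapter-like) form, so they start out equal to the queries rather than at an arbitrary prefix initialization. The class name, bottleneck size, and zero-initialization are assumptions for illustration, not the paper's exact parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InducerAttention(nn.Module):
    """Hypothetical sketch: prefix-style attention whose extra key/value pairs
    (the "inducing variables") are a residual, adapter-like function of the
    queries instead of free prefix parameters."""

    def __init__(self, d_model: int, bottleneck: int = 16):
        super().__init__()
        # Only the bottleneck is trainable; W_q, W_k, W_v belong to the frozen PLM.
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        nn.init.zeros_(self.up.weight)  # inducers start equal to the queries,
        nn.init.zeros_(self.up.bias)    # keeping them close to the data at initialization

    def forward(self, q, k, v):
        # q, k, v: (batch, seq, d_model), already projected by the frozen model.
        z = q + self.up(F.relu(self.down(q)))       # inducing variables, residual form
        k_aug = torch.cat([z, k], dim=1)            # inducers join the keys ...
        v_aug = torch.cat([z, v], dim=1)            # ... and the values, as in prefix-tuning
        scores = q @ k_aug.transpose(-2, -1) / q.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ v_aug
```

Starting the inducers at the queries themselves is one plausible way the residual form could mitigate the initialization issue mentioned above.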
Related papers
- Towards Infinite-Long Prefix in Transformer [18.24137806007111]
We study the ability of Prompting and context-based fine-tuning methods to match the performance of full parameter fine-tuning.
We implement an algorithm that only needs to introduce and fine-tune a few extra trainable parameters instead of an infinite-long prefix.
Our method achieves superior or competitive performance compared to existing methods like full-parameter fine-tuning, P-Tuning V2, and LoRA.
arXiv Detail & Related papers (2024-06-20T06:56:35Z)
- Prompting a Pretrained Transformer Can Be a Universal Approximator [105.59562522323274]
We show that much smaller pretrained models than previously thought can be universal approximators when prefixed.
We also offer Jackson-type bounds on the length of the prefix needed to approximate a function to a desired precision.
arXiv Detail & Related papers (2024-02-22T18:12:48Z)
- Universality and Limitations of Prompt Tuning [65.8354898840308]
We take one of the first steps to understand the role of soft-prompt tuning for transformer-based architectures.
We analyze prompt tuning from the lens of universality and limitations with finite-depth pretrained transformers for continuous-valued functions.
Our result guarantees the existence of a strong transformer with a prompt to approximate any sequence-to-sequence function in the set of Lipschitz functions.
arXiv Detail & Related papers (2023-05-30T06:47:07Z)
- PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase Generation [61.05254852400895]
Parse-Instructed Prefix (PIP) is a novel adaptation of prefix-tuning to tune large pre-trained language models.
In contrast to traditional fine-tuning methods for this task, PIP is a compute-efficient alternative with 10 times fewer learnable parameters.
arXiv Detail & Related papers (2023-05-26T07:42:38Z)
- Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning [32.84435258519842]
We propose Adaptive Prefix Tuning (APT) to adjust the prefix in terms of both fine-grained token level and coarse-grained layer level with a gate mechanism.
Experiments on the SuperGLUE and NER datasets show the effectiveness of APT.
arXiv Detail & Related papers (2023-05-24T14:51:01Z)
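The APT summary above names a gate mechanism at the token and layer levels but does not specify it; the sketch below is a hypothetical rendering of that idea, with the class name, the sigmoid gates, and the way the gated prefix is mixed into the hidden states all assumed for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class GatedPrefix(nn.Module):
    """Hypothetical sketch: a trainable prefix scaled by a fine-grained,
    token-dependent gate and a coarse-grained, per-layer scalar gate."""

    def __init__(self, prefix_len: int, d_model: int):
        super().__init__()
        self.prefix = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.token_gate = nn.Linear(d_model, prefix_len)  # fine-grained: one gate vector per token
        self.layer_gate = nn.Parameter(torch.zeros(1))    # coarse-grained: one scalar per layer

    def forward(self, hidden):
        # hidden: (batch, seq, d_model) from the frozen layer below.
        g_tok = torch.sigmoid(self.token_gate(hidden))    # (batch, seq, prefix_len)
        g_layer = torch.sigmoid(self.layer_gate)          # scalar gate in (0, 1)
        # Each token receives its own gated mixture of the prefix vectors.
        return g_layer * (g_tok @ self.prefix)            # (batch, seq, d_model)
```

The returned tensor could be added to the layer's hidden states or folded into its prefix keys and values; which of these APT actually does is not stated in the summary.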
- Prefix Propagation: Parameter-Efficient Tuning for Long Sequences [35.15831629770172]
We propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states.
We empirically demonstrate that prefix-propagation outperforms prefix-tuning across long-document tasks.
To the best of our knowledge, this work is the first to focus on parameter-efficient learning for long-sequence language tasks.
arXiv Detail & Related papers (2023-05-20T04:07:06Z)
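Taking the prefix-propagation summary above at face value, one assumption-laden sketch of "conditioning prefixes on previous hidden states" is to insert the prefix once at the bottom of the stack, let it flow through the frozen blocks together with the tokens, and add a small trainable per-layer prefix to the propagated prefix states. Every name and detail below is illustrative; the paper's exact formulation may differ.

```python
import torch

def propagate_with_prefix(blocks, embeddings, init_prefix, layer_prefixes):
    """Hypothetical sketch of prefix propagation.

    blocks:         frozen transformer blocks, each mapping (B, L, D) -> (B, L, D)
    embeddings:     (B, S, D) token embeddings
    init_prefix:    (P, D) trainable prefix prepended once at the input
    layer_prefixes: trainable (P, D) prefixes, one per block
    """
    p = init_prefix.size(0)
    prefix = init_prefix.unsqueeze(0).expand(embeddings.size(0), -1, -1)
    hidden = torch.cat([prefix, embeddings], dim=1)   # prefix joins the sequence once
    for block, extra in zip(blocks, layer_prefixes):
        hidden = block(hidden)                        # prefix states are propagated, not replaced
        # condition the next layer's prefix on the previous layer's hidden states
        hidden = torch.cat([hidden[:, :p] + extra, hidden[:, p:]], dim=1)
    return hidden
```

Only init_prefix and layer_prefixes would be trained; the blocks stay frozen.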
- Empowering parameter-efficient transfer learning by recognizing the kernel structure in self-attention [53.72897232951918]
We propose adapters that utilize the kernel structure in self-attention to guide the assignment of tunable parameters.
Our results show that our proposed adapters can attain or improve the strong performance of existing baselines.
arXiv Detail & Related papers (2022-05-07T20:52:54Z)
- On Robust Prefix-Tuning for Text Classification [16.08753509741376]
We propose a robust prefix-tuning framework that preserves the efficiency and modularity of prefix-tuning.
Our framework substantially improves robustness over several strong baselines against five textual attacks of different types.
arXiv Detail & Related papers (2022-03-19T18:52:47Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
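Since the listed papers all build on it, the core mechanism of prefix-tuning itself is worth a minimal sketch: trainable key/value vectors are prepended to a frozen attention layer's keys and values, and only those vectors are updated, which is how the roughly 0.1% trainable-parameter figure above arises (the exact fraction depends on prefix length and model size). The single-head class below is an illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrefixedAttention(nn.Module):
    """Minimal single-head sketch of prefix-tuning: trainable key/value
    prefixes are prepended to the frozen model's keys and values."""

    def __init__(self, d_model: int, prefix_len: int = 10):
        super().__init__()
        self.prefix_k = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

    def forward(self, q, k, v):
        # q, k, v: (batch, seq, d_model) produced by the frozen PLM's projections.
        batch = q.size(0)
        pk = self.prefix_k.unsqueeze(0).expand(batch, -1, -1)
        pv = self.prefix_v.unsqueeze(0).expand(batch, -1, -1)
        k_aug = torch.cat([pk, k], dim=1)   # every query can attend to the prefix positions
        v_aug = torch.cat([pv, v], dim=1)
        scores = q @ k_aug.transpose(-2, -1) / q.size(-1) ** 0.5
        return F.softmax(scores, dim=-1) @ v_aug
```

Freezing every other parameter and training only prefix_k and prefix_v is what keeps the trainable fraction so small; inducer-tuning replaces these free prefixes with the query-conditioned, residual-form inducers sketched earlier.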