Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model
Fine-tuning
- URL: http://arxiv.org/abs/2305.15212v1
- Date: Wed, 24 May 2023 14:51:01 GMT
- Title: Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model
Fine-tuning
- Authors: Zhen-Ru Zhang, Chuanqi Tan, Haiyang Xu, Chengyu Wang, Jun Huang,
Songfang Huang
- Abstract summary: We propose Adaptive Prefix Tuning (APT) to adjust the prefix in terms of both fine-grained token level and coarse-grained layer level with a gate mechanism.
Experiments on the SuperGLUE and NER datasets show the effectiveness of APT.
- Score: 32.84435258519842
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning all parameters of large pre-trained language models on
various downstream tasks is prohibitively expensive. Parameter-efficient
fine-tuning has therefore attracted attention: it optimizes only a small number
of task-specific parameters while keeping the pre-trained model frozen. In this
work, we focus on prefix tuning, which optimizes only continuous prefix vectors
(i.e., pseudo tokens) inserted into the Transformer layers. Based on the
observation that the syntactic and semantic representations learned at
different layers vary considerably, we argue that a prefix adapted to each
layer makes fine-tuning more effective and efficient than a fixed one. We
therefore propose Adaptive Prefix Tuning (APT), which adjusts the prefix at
both the fine-grained token level and the coarse-grained layer level with a
gate mechanism. Experiments on the SuperGLUE and NER datasets show the
effectiveness of APT. In addition, using the gate as a probe, we validate the
efficiency and effectiveness of the variable prefix.
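The abstract names the gate mechanism but not its form, so the following minimal PyTorch sketch illustrates the idea of a gated, per-layer prefix: a fine-grained gate computed from each input token and a coarse-grained scalar gate for the whole layer rescale how much of the prefix is used. All names, shapes, and the exact gate parameterization below are assumptions for illustration, not the authors' released implementation.

```python
# Illustrative sketch only: the gate parameterization, shapes, and names are
# assumptions, not the APT authors' implementation.
import torch
import torch.nn as nn


class GatedLayerPrefix(nn.Module):
    """One layer's trainable prefix ("pseudo tokens") plus two gates:
    a fine-grained gate per input token and a coarse-grained scalar gate
    for the whole layer."""

    def __init__(self, hidden_size: int, prefix_len: int = 16):
        super().__init__()
        # Continuous prefix vectors inserted into this Transformer layer.
        self.prefix_key = nn.Parameter(torch.randn(prefix_len, hidden_size) * 0.02)
        self.prefix_value = nn.Parameter(torch.randn(prefix_len, hidden_size) * 0.02)
        # Fine-grained (token-level) gate: one value per input token.
        self.token_gate = nn.Linear(hidden_size, 1)
        # Coarse-grained (layer-level) gate: one scalar for the whole layer.
        self.layer_gate = nn.Parameter(torch.zeros(1))

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_size) from the frozen backbone.
        fine = torch.sigmoid(self.token_gate(hidden_states))  # (B, T, 1)
        coarse = torch.sigmoid(self.layer_gate)                # scalar in (0, 1)
        gate = fine * coarse                                    # (B, T, 1)
        batch = hidden_states.size(0)
        # Shared prefix key/value pairs, broadcast over the batch; the gate is
        # meant to rescale how strongly each token attends to these pseudo tokens.
        k = self.prefix_key.unsqueeze(0).expand(batch, -1, -1)
        v = self.prefix_value.unsqueeze(0).expand(batch, -1, -1)
        return k, v, gate


# Quick shape check with dummy hidden states.
layer_prefix = GatedLayerPrefix(hidden_size=768, prefix_len=16)
k, v, gate = layer_prefix(torch.randn(2, 10, 768))
print(k.shape, v.shape, gate.shape)  # (2, 16, 768), (2, 16, 768), (2, 10, 1)
```

Only the prefix and gate parameters above would be trained; the backbone stays frozen, which is what keeps the approach parameter-efficient.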
Related papers
- Towards Infinite-Long Prefix in Transformer [18.24137806007111]
We study the ability of prompting and context-based fine-tuning methods to match the performance of full-parameter fine-tuning.
We implement an algorithm that only needs to introduce and fine-tune a few extra trainable parameters instead of an infinite-long prefix.
Our method achieves superior or competitive performance compared to existing methods like full-parameter fine-tuning, P-Tuning V2, and LoRA.
arXiv Detail & Related papers (2024-06-20T06:56:35Z)
- Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models.
Our method adjusts both the singular values and the corresponding basis vectors of the pretrained weights.
We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z)
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while using only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Prefix Propagation: Parameter-Efficient Tuning for Long Sequences [35.15831629770172]
We propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states.
We empirically demonstrate that prefix-propagation outperforms prefix-tuning across long-document tasks.
To the best of our knowledge, this work is the first to focus on parameter-efficient learning for long-sequence language tasks.
arXiv Detail & Related papers (2023-05-20T04:07:06Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- Parameter-Efficient Tuning with Special Token Adaptation [25.37998979962568]
PASTA achieves comparable performance to fine-tuning in natural language understanding tasks.
Our work demonstrates the pivotal role of special tokens in pretrained language models.
arXiv Detail & Related papers (2022-10-10T01:02:51Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
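Several of the entries above build on vanilla prefix-tuning, so a compact sketch of that baseline may be useful: a small table of trainable prefix embeddings is mapped (through a reparameterization MLP, as commonly used) to per-layer key/value pairs that a frozen backbone treats as extra attention context. Dimensions, names, and the MLP below are illustrative assumptions, not the original implementation.

```python
# Minimal prefix-tuning sketch: only the prefix embeddings and the MLP are
# trained; the backbone is assumed frozen. Shapes and names are illustrative.
import torch
import torch.nn as nn


class PrefixEncoder(nn.Module):
    """Maps trainable prefix embeddings to per-layer (key, value) prefixes."""

    def __init__(self, num_layers: int, num_heads: int, head_dim: int,
                 prefix_len: int = 10, bottleneck: int = 512):
        super().__init__()
        hidden = num_heads * head_dim
        self.prefix_len = prefix_len
        self.num_layers, self.num_heads, self.head_dim = num_layers, num_heads, head_dim
        self.embedding = nn.Embedding(prefix_len, hidden)
        # Reparameterization MLP, often used to stabilize prefix training.
        self.mlp = nn.Sequential(
            nn.Linear(hidden, bottleneck),
            nn.Tanh(),
            nn.Linear(bottleneck, num_layers * 2 * hidden),
        )

    def forward(self, batch_size: int):
        idx = torch.arange(self.prefix_len)
        out = self.mlp(self.embedding(idx))                      # (P, L * 2 * hidden)
        out = out.view(self.prefix_len, self.num_layers, 2,
                       self.num_heads, self.head_dim)
        out = out.permute(1, 2, 3, 0, 4)                         # (L, 2, heads, P, dim)
        # One (key, value) prefix per layer, broadcast over the batch.
        return [(k.unsqueeze(0).expand(batch_size, -1, -1, -1),
                 v.unsqueeze(0).expand(batch_size, -1, -1, -1))
                for k, v in out]


encoder = PrefixEncoder(num_layers=12, num_heads=12, head_dim=64)
past_key_values = encoder(batch_size=2)
print(len(past_key_values), past_key_values[0][0].shape)  # 12, (2, 12, 10, 64)
```

APT, as described in the abstract above, keeps this basic setup but adds a gate that decides how much of the prefix each token and each layer actually uses.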
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.