KronA: Parameter Efficient Tuning with Kronecker Adapter
- URL: http://arxiv.org/abs/2212.10650v1
- Date: Tue, 20 Dec 2022 20:56:52 GMT
- Title: KronA: Parameter Efficient Tuning with Kronecker Adapter
- Authors: Ali Edalati, Marzieh Tahaei, Ivan Kobyzev, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh
- Abstract summary: We introduce KronA, a Kronecker product-based adapter module for efficient fine-tuning of Transformer-based PLMs.
We apply the proposed methods for fine-tuning T5 on the GLUE benchmark to show that incorporating the Kronecker-based modules can outperform state-of-the-art PET methods.
- Score: 17.175408603709712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning a Pre-trained Language Model (PLM) on a specific downstream task
has been a well-known paradigm in Natural Language Processing. However, with
the ever-growing size of PLMs, training the entire model on several downstream
tasks becomes very expensive and resource-hungry. Recently, different Parameter
Efficient Tuning (PET) techniques have been proposed to improve the efficiency of
fine-tuning PLMs. One popular category of PET methods is low-rank adaptation,
which inserts learnable truncated-SVD modules into the original model either
sequentially or in parallel. However, low-rank
decomposition suffers from limited representation power. In this work, we
address this problem using the Kronecker product instead of the low-rank
representation. We introduce KronA, a Kronecker product-based adapter module
for efficient fine-tuning of Transformer-based PLMs. We apply the proposed
methods for fine-tuning T5 on the GLUE benchmark to show that incorporating the
Kronecker-based modules can outperform state-of-the-art PET methods.
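To make the contrast concrete, the sketch below illustrates the two kinds of update in PyTorch: a LoRA-style low-rank update, delta_W = B A, and a Kronecker-product update, delta_W = A kron B, of the kind KronA builds on. This is a minimal illustration under assumed shapes and initialization, not the authors' implementation; the class names, the 768-dimensional projection, and the factor shapes are made up for the example.

```python
# Minimal sketch (illustrative, not the authors' released code): contrast a
# LoRA-style low-rank update with a Kronecker-product (KronA-style) update.
import torch
import torch.nn as nn


class LoRAStyleAdapter(nn.Module):
    """Low-rank update: delta_W = B @ A, so rank(delta_W) <= r."""

    def __init__(self, d_in: int, d_out: int, r: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # (r, d_in)
        self.B = nn.Parameter(torch.zeros(d_out, r))         # (d_out, r)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., d_in) -> (..., d_out)
        return x @ self.A.t() @ self.B.t()


class KroneckerAdapter(nn.Module):
    """Kronecker update: delta_W = A kron B, with A: (a1, a2), B: (b1, b2),
    a1*b1 = d_out and a2*b2 = d_in.  Since rank(A kron B) = rank(A) * rank(B),
    the update is not restricted to low rank for a comparable parameter count."""

    def __init__(self, a_shape=(64, 64), b_shape=(12, 12)):
        super().__init__()
        self.A = nn.Parameter(torch.randn(*a_shape) * 0.01)
        self.B = nn.Parameter(torch.zeros(*b_shape))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta_w = torch.kron(self.A, self.B)  # (d_out, d_in), materialized here
        return x @ delta_w.t()                # only for clarity of the sketch


# Usage: add the adapter output to a frozen pretrained projection (parallel insertion).
frozen = nn.Linear(768, 768, bias=False)
frozen.weight.requires_grad_(False)
adapter = KroneckerAdapter(a_shape=(64, 64), b_shape=(12, 12))  # 64 * 12 = 768
x = torch.randn(2, 768)
y = frozen(x) + adapter(x)  # h = W x + (A kron B) x
```

Note that in practice the product (A kron B) x can be computed without materializing the full matrix; the sketch materializes it only to keep the example short.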
Related papers
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates.
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
- Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models.
Our method adjusts both singular values and their basis vectors of pretrained weights.
We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z)
- SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models [1.2263658159556594]
Full fine-tuning is a popular approach to adapt Transformer-based pre-trained large language models to a specific downstream task.
We propose Stratified Progressive Adaptation Fine-tuning (SPAFIT) based on the localization of different types of linguistic knowledge.
Our experiments, conducted on nine tasks from the GLUE benchmark, show that our proposed SPAFIT method outperforms other PEFT methods.
arXiv Detail & Related papers (2024-04-30T21:07:32Z)
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation [67.13876021157887]
Dynamic Tuning (DyT) is a novel approach to improve both parameter and inference efficiency for ViT adaptation.
DyT achieves superior performance compared to existing PEFT methods while using only 71% of their FLOPs on the VTAB-1K benchmark.
arXiv Detail & Related papers (2024-03-18T14:05:52Z)
- Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models [79.34513906324727]
In this paper, we aim at parameter- and computation-efficient transfer learning (PCETL) for vision-language pre-trained models.
We propose a novel dynamic architecture skipping (DAS) approach towards effective PCETL.
arXiv Detail & Related papers (2023-09-04T09:34:33Z)
- Exploring the Impact of Model Scaling on Parameter-Efficient Tuning [100.61202305296275]
Parameter-Efficient Tuning (PET) methods can effectively drive extremely large pre-trained language models (PLMs) by training only minimal parameters.
In small PLMs, there are usually noticeable performance differences among PET methods.
We introduce a more flexible PET method, the Arbitrary PET (APET) method.
arXiv Detail & Related papers (2023-06-04T10:10:54Z)
- PVP: Pre-trained Visual Parameter-Efficient Tuning [29.05396521860764]
Large-scale pre-trained transformers have demonstrated remarkable success in various computer vision tasks.
It is still highly challenging to fully fine-tune these models for downstream tasks due to their high computational and storage costs.
We propose a Pre-trained Visual Parameter-efficient (PVP) Tuning framework, which pre-trains the parameter-efficient tuning modules first and then leverages the pre-trained modules.
arXiv Detail & Related papers (2023-04-26T15:55:29Z)
- CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models [62.60723685118747]
Self-supervised learning (SSL) is a powerful technique for learning representations from unlabeled data.
We propose an efficient tuning method specifically designed for SSL speech models, applying CNN adapters at the feature extractor.
We empirically find that adding the CNN adapters to the feature extractor helps adaptation on emotion and speaker tasks.
arXiv Detail & Related papers (2022-12-01T08:50:12Z)
- Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models [1.7185989606499712]
We first propose LN-tuning, which tunes the gain and bias terms of the Layer Normalization modules with only 0.03% of the parameters.
We study a unified framework that combines LN-tuning with previous methods and find that (1) combining prefix-tuning, the adapter-based method working on MHA, and LN-tuning achieves SOTA performance (a brief illustrative sketch of LN-tuning follows this list).
arXiv Detail & Related papers (2022-11-16T05:31:49Z)
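As a concrete illustration of the LN-tuning idea summarized in the last entry above, the hypothetical sketch below freezes an entire pretrained Transformer and re-enables gradients only for the LayerNorm gain and bias parameters. The checkpoint name and the Hugging Face API usage are assumptions for the example, not the paper's code; in practice the task-specific head would typically also remain trainable.

```python
# Minimal sketch (assumption, not the paper's code): LN-tuning trains only the
# gain (weight) and bias of every LayerNorm module in a frozen pretrained model.
import torch.nn as nn
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Freeze every parameter of the pretrained model.
for p in model.parameters():
    p.requires_grad = False

# Re-enable gradients only for LayerNorm gain and bias terms.
trainable = 0
for module in model.modules():
    if isinstance(module, nn.LayerNorm):
        for p in module.parameters():
            p.requires_grad = True
            trainable += p.numel()

total = sum(p.numel() for p in model.parameters())
print(f"LN-tuned parameters: {trainable} / {total} ({100 * trainable / total:.3f}%)")
```

For a BERT-base-sized encoder this leaves only a few tens of thousands of trainable parameters, on the order of the 0.03% figure quoted in the entry.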