AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
- URL: http://arxiv.org/abs/2303.10512v2
- Date: Wed, 20 Dec 2023 20:56:14 GMT
- Title: AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
- Authors: Qingru Zhang, Minshuo Chen, Alexander Bukharin, Nikos Karampatziakis,
Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao
- Abstract summary: Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP.
We propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score.
We conduct extensive experiments with several pre-trained models on natural language processing, question answering, and natural language generation to validate the effectiveness of AdaLoRA.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning large pre-trained language models on downstream tasks has become
an important paradigm in NLP. However, common practice fine-tunes all of the
parameters in a pre-trained model, which becomes prohibitive when a large
number of downstream tasks are present. Therefore, many fine-tuning methods are
proposed to learn incremental updates of pre-trained weights in a parameter
efficient way, e.g., low-rank increments. These methods often evenly distribute
the budget of incremental updates across all pre-trained weight matrices, and
overlook the varying importance of different weight parameters. As a
consequence, the fine-tuning performance is suboptimal. To bridge this gap, we
propose AdaLoRA, which adaptively allocates the parameter budget among weight
matrices according to their importance score. In particular, AdaLoRA
parameterizes the incremental updates in the form of singular value
decomposition. Such a novel approach allows us to effectively prune the
singular values of unimportant updates, which is essentially to reduce their
parameter budget but circumvent intensive exact SVD computations. We conduct
extensive experiments with several pre-trained models on natural language
processing, question answering, and natural language generation to validate the
effectiveness of AdaLoRA. Results demonstrate that AdaLoRA manifests notable
improvement over baselines, especially in the low budget settings. Our code is
publicly available at https://github.com/QingruZhang/AdaLoRA .
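The two ideas in the abstract, writing each incremental update in SVD form and pruning the singular values of unimportant updates under a shared budget, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`svd_form_update`, `prune_to_budget`), the toy shapes, and the use of the absolute singular value as the importance score are all assumptions made for illustration; the abstract does not say how the importance score is computed.

```python
import numpy as np


def svd_form_update(P, lam, Q):
    """Compose an incremental update Delta = P @ diag(lam) @ Q.

    P: (d_out, r), lam: (r,), Q: (r, d_in). Hypothetical helper.
    """
    # Scaling the columns of P by lam is equivalent to P @ diag(lam).
    return (P * lam) @ Q


def prune_to_budget(lams, scores, budget):
    """Keep the `budget` highest-scoring singular values across ALL matrices.

    lams and scores are lists of 1-D arrays, one per weight matrix.
    Entries whose score falls below the global threshold are set to zero,
    which lowers the effective rank of that matrix's update.
    Ties at the threshold are all kept, so slightly more than `budget`
    entries may survive when scores repeat.
    """
    if budget <= 0:
        return [np.zeros_like(l) for l in lams]
    flat = np.concatenate([s.ravel() for s in scores])
    if budget >= flat.size:
        return [l.copy() for l in lams]
    # Global threshold: the budget-th largest score.
    threshold = np.sort(flat)[::-1][budget - 1]
    return [np.where(s >= threshold, l, 0.0) for l, s in zip(lams, scores)]


# Toy setup: two weight matrices, each starting with rank-3 updates.
rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 3
P = rng.normal(size=(d_out, r))
Q = rng.normal(size=(r, d_in))

lams = [np.array([0.9, 0.05, 0.4]), np.array([0.01, 0.7, 0.02])]
# ASSUMPTION: importance is approximated by |singular value|.
scores = [np.abs(l) for l in lams]

pruned = prune_to_budget(lams, scores, budget=3)
ranks = [int(np.count_nonzero(l)) for l in pruned]
delta = svd_form_update(P, pruned[0], Q)
print(ranks, delta.shape)
```

With a total budget of three singular values, the first matrix keeps two and the second keeps one, so the budget is spread unevenly rather than split equally. Because pruning only zeroes entries of `lam`, no exact SVD of the update is ever computed.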
Related papers
- IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models [68.55148272295916]
We propose IntLoRA, to push the efficiency limits by using integer type (INT) low-rank parameters to adapt the quantized diffusion models.
IntLoRA offers three key advantages: (i) for fine-tuning, the pre-trained weights are quantized, reducing memory usage; (ii) for storage, both pre-trained and low-rank weights are in INT which consumes less disk space; (iii) for inference, IntLoRA weights can be naturally merged into quantized pre-trained weights through efficient integer multiplication or bit-shifting.
arXiv Detail & Related papers (2024-10-29T05:50:17Z) - LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates.
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z) - NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models [26.808251361020066]
Fine-tuning pre-trained models is resource-intensive and laborious.
One widely adopted PEFT technique, Low-Rank Adaptation (LoRA), freezes the pre-trained model weights.
NEAT introduces a lightweight neural network that takes pre-trained weights as input and learns a nonlinear transformation to approximate cumulative weight updates.
arXiv Detail & Related papers (2024-10-02T17:29:23Z) - SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models.
We propose a novel model fine-tuning method to make full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z) - SARA: Singular-Value Based Adaptive Low-Rank Adaption [4.135688713311511]
LoRA, a parameter-efficient fine-tuning (PEFT) method, is widely used because it adds no inference overhead.
In this work, we first analyze the relationship between the performance of different layers and their ranks using SVD.
Based on this, we design the Singular-Value Based Adaptive Low-Rank Adaption (SARA).
arXiv Detail & Related papers (2024-08-06T16:39:42Z) - RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning [36.32145845869823]
Pre-trained language models demonstrate strong generalizability across various NLP tasks.
Fine-tuning these models for specific tasks typically involves updating all parameters, which is resource-intensive.
We propose a novel PEFT method, which conducts row and column-wise sparse low-rank adaptation (RoseLoRA).
RoseLoRA identifies and updates only the most important parameters for a specific task, maintaining efficiency.
arXiv Detail & Related papers (2024-06-16T02:08:49Z) - MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning [71.50432879573614]
Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional.
We present MELoRA, a mini-ensemble of low-rank adapters that uses fewer trainable parameters while maintaining a higher rank.
Our experimental results show that, compared to LoRA, MELoRA achieves better performance with 8 times fewer trainable parameters on natural language understanding tasks and 36 times fewer trainable parameters on instruction following tasks.
arXiv Detail & Related papers (2024-02-27T07:14:12Z) - Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters.
Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z) - IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning [15.964205804768163]
IncreLoRA is an incremental parameter allocation method that adaptively adds trainable parameters during training.
We conduct extensive experiments on GLUE to demonstrate the effectiveness of IncreLoRA.
arXiv Detail & Related papers (2023-08-23T10:08:10Z)