Related papers: PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation

URL: http://arxiv.org/abs/2401.11316v1
Date: Sat, 20 Jan 2024 20:25:17 GMT
Title: PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation
Authors: Nadav Benedek, Lior Wolf
Abstract summary: We introduce PRILoRA, which linearly allocates a different rank for each layer, in an increasing manner, and performs pruning throughout the training process. We validate the effectiveness of PRILoRA through extensive experiments on eight GLUE benchmarks, setting a new state of the art.
Score: 65.268245109828
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the proliferation of large pre-trained language models (PLMs), fine-tuning all model parameters becomes increasingly inefficient, particularly when dealing with numerous downstream tasks that entail substantial training and storage costs. Several approaches aimed at achieving parameter-efficient fine-tuning (PEFT) have been proposed. Among them, Low-Rank Adaptation (LoRA) stands out as an archetypal method, incorporating trainable rank decomposition matrices into each target module. Nevertheless, LoRA does not consider the varying importance of each layer. To address these challenges, we introduce PRILoRA, which linearly allocates a different rank for each layer, in an increasing manner, and performs pruning throughout the training process, considering both the temporary magnitude of weights and the accumulated statistics of the input to any given layer. We validate the effectiveness of PRILoRA through extensive experiments on eight GLUE benchmarks, setting a new state of the art.

Related papers

ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning [6.657174308208715]
ElaLoRA is an adaptive low-rank adaptation framework that dynamically prunes and expands ranks based on gradient-derived importance scores. ElaLoRA consistently outperforms existing PEFT methods across different parameter budgets. By introducing a principled and adaptive rank allocation mechanism, ElaLoRA offers a scalable and efficient fine-tuning solution.
arXiv Detail & Related papers (2025-03-31T21:58:25Z)
BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods. We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z)
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning [73.93639228235622]
Continual Learning with foundation models has emerged as a promising paradigm to exploit abundant knowledge acquired during pre-training for tackling sequential tasks. Existing prompt-based and Low-Rank Adaptation-based (LoRA-based) methods often require expanding a prompt/LoRA pool or retaining samples of previous tasks. We propose Scalable Decoupled LoRA (SD-LoRA) for class incremental learning, which continually separates the learning of the magnitude and direction of LoRA components without rehearsal.
arXiv Detail & Related papers (2025-01-22T20:00:41Z)
Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks. Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and its theoretical optimum. We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning [105.11844150736536]
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models. We propose a new method called MoRA, which employs a square matrix to achieve high-rank updating while maintaining the same number of trainable parameters. Our method outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks.
arXiv Detail & Related papers (2024-05-20T15:48:32Z)
ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models [8.251547772610301]
We extend the methodology of low-rank adaptation (LoRA) to an innovative approach we call allocating low-rank adaptation (ALoRA) First, we propose a novel method, AB-LoRA, that can effectively estimate the importance score of each LoRA rank. Second, guided by AB-LoRA, we gradually prune abundant and negatively impacting LoRA ranks and allocate the pruned LoRA budgets to important Transformer modules needing higher ranks.
arXiv Detail & Related papers (2024-03-24T15:09:55Z)
BiLoRA: A Bi-level Optimization Framework for Overfitting-Resilient Low-Rank Adaptation of Large Pre-trained Models [34.1111413429869]
BiLoRA is an overfitting-alleviating fine-tuning approach based on bi-level optimization (BLO) tested on ten datasets covering natural language understanding and generation tasks.
arXiv Detail & Related papers (2024-03-19T14:11:20Z)
AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning [31.975038164401404]
Low-rank adaptation (LoRA) finetunes low-rank incremental update matrices on top of frozen pretrained weights. We introduce AutoLoRA, a framework for automatically identifying the optimal rank of each LoRA layer. Our experiments on natural language understanding, generation, and sequence labeling demonstrate the effectiveness of AutoLoRA.
arXiv Detail & Related papers (2024-03-14T05:29:35Z)
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning [71.50432879573614]
Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional. We present MELoRA, a mini-ensemble low-rank adapters that uses fewer trainable parameters while maintaining a higher rank. Our experimental results show that, compared to LoRA, MELoRA achieves better performance with 8 times fewer trainable parameters on natural language understanding tasks and 36 times fewer trainable parameters on instruction following tasks.
arXiv Detail & Related papers (2024-02-27T07:14:12Z)
Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process. Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters. Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z)
IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning [15.964205804768163]
IncreLoRA is an incremental parameter allocation method that adaptively adds trainable parameters during training. We conduct extensive experiments on GLUE to demonstrate the effectiveness of IncreLoRA.
arXiv Detail & Related papers (2023-08-23T10:08:10Z)
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning [143.23123791557245]
Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP. We propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score. We conduct extensive experiments with several pre-trained models on natural language processing, question answering, and natural language generation to validate the effectiveness of AdaLoRA.
arXiv Detail & Related papers (2023-03-18T22:36:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.