PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
- URL: http://arxiv.org/abs/2310.10700v2
- Date: Fri, 17 Nov 2023 06:41:01 GMT
- Title: PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
- Authors: Yangyang Guo and Guangzhi Wang and Mohan Kankanhalli
- Abstract summary: We propose a novel method for increasing the parameter efficiency of pre-trained models by introducing an intermediate pre-training stage.
This allows for direct and efficient utilization of the low-rank model for downstream fine-tuning tasks.
- Score: 16.9278983497498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Applying a pre-trained large model to downstream tasks is prohibitive under
resource-constrained conditions. Recent dominant approaches for addressing
efficiency issues involve adding a few learnable parameters to the fixed
backbone model. This strategy, however, leads to more challenges in loading
large models for downstream fine-tuning with limited resources. In this paper,
we propose a novel method for increasing the parameter efficiency of
pre-trained models by introducing an intermediate pre-training stage. To this
end, we first employ low-rank approximation to compress the original large
model and then devise a feature distillation module and a weight perturbation
regularization module. These modules are specifically designed to enhance the
low-rank model. In particular, we update only the low-rank model while freezing
the backbone parameters during pre-training. This allows for direct and
efficient utilization of the low-rank model for downstream fine-tuning tasks.
The proposed method is efficient in terms of both required parameters and
computation time, while maintaining comparable results and requiring only
minimal modifications to the backbone architecture. Specifically, when applied
to three vision-only and one vision-language Transformer models, our approach
typically shows only a $\sim$0.6 point drop in performance while reducing the
original parameter size by 1/3 to 2/3.
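To make the compression step concrete, the sketch below factorizes a pre-trained linear layer with a truncated SVD so that only the two low-rank factors are trainable, and pairs it with a simple feature-distillation loss against the frozen backbone. This is a minimal PyTorch sketch under assumed names (`LowRankLinear` and `feature_distillation_loss` are illustrative), not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LowRankLinear(nn.Module):
    """Replace a dense nn.Linear so that W is approximated by U @ V with rank r."""

    def __init__(self, linear: nn.Linear, rank: int):
        super().__init__()
        W = linear.weight.data                        # shape: (out_features, in_features)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        sqrt_S = torch.diag(S[:rank].sqrt())          # split the singular values across both factors
        self.U = nn.Parameter(U[:, :rank] @ sqrt_S)   # (out_features, rank)
        self.V = nn.Parameter(sqrt_S @ Vh[:rank, :])  # (rank, in_features)
        self.bias = nn.Parameter(linear.bias.data.clone()) if linear.bias is not None else None

    def forward(self, x):
        return F.linear(x, self.U @ self.V, self.bias)


def feature_distillation_loss(student_feats, teacher_feats):
    """Match intermediate features of the low-rank model to those of the frozen backbone."""
    return sum(F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats))
```

Replacing the dense layers of a Transformer block this way is what shrinks the parameter count by roughly 1/3 to 2/3, with the exact saving determined by the chosen rank.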
Related papers
- Meta-Learning Adaptable Foundation Models [37.458141335750696]
We introduce a meta-learning framework infused with PEFT in an intermediate retraining stage to learn a model that can be easily adapted to unseen tasks.
In this setting, we demonstrate the suboptimality of standard retraining for finding an adaptable set of parameters.
We then apply these theoretical insights to retraining the RoBERTa model to predict the continuation of conversations within the ConvAI2 dataset.
arXiv Detail & Related papers (2024-10-29T17:24:18Z) - LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low-Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks (a minimal sketch of the low-rank-update idea appears after this list).
We propose a novel approach that employs a low rank tensor parametrization for model updates.
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z) - SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models.
We propose a novel model fine-tuning method to make full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z) - MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models achieve superior performance on most NLP tasks thanks to their large parameter capacity, but that capacity also incurs a huge computation cost.
We explore accelerating large-model inference through conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z) - Dynamic Model Pruning with Feedback [64.019079257231]
We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
arXiv Detail & Related papers (2020-06-12T15:07:08Z)
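For context on the low-rank-update idea referenced in the LoRTA entry above, here is a minimal sketch of the standard LoRA parametrization, in which a frozen base weight is augmented with a trainable rank-r update. LoRTA's tensor parametrization generalizes this; the class below (`LoRALinear`) is an illustrative PyTorch sketch, not code from either paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():              # the backbone weights stay frozen
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at step 0
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * F.linear(x, self.B @ self.A)
```

Only rank * (in_features + out_features) parameters per adapted layer are trained, which is where the parameter reductions reported by these methods come from.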