Straightforward Layer-wise Pruning for More Efficient Visual Adaptation
- URL: http://arxiv.org/abs/2407.14330v1
- Date: Fri, 19 Jul 2024 14:10:35 GMT
- Title: Straightforward Layer-wise Pruning for More Efficient Visual Adaptation
- Authors: Ruizi Han, Jinglei Tang,
- Abstract summary: We propose a Straightforward layer-wise pruning method, called SLS, for pruning PETL-transferred models.
Our study reveals that layer-wise pruning, with a focus on storing pruning indices, addresses storage volume concerns.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-efficient transfer learning (PETL) aims to adapt large pre-trained models using limited parameters. While most PETL approaches update the added parameters and freeze pre-trained weights during training, the minimal impact of task-specific deep layers on cross-domain data poses a challenge as PETL cannot modify them, resulting in redundant model structures. Structural pruning effectively reduces model redundancy; however, common pruning methods often lead to an excessive increase in stored parameters due to varying pruning structures based on pruning rates and data. Recognizing the storage parameter volume issue, we propose a Straightforward layer-wise pruning method, called SLS, for pruning PETL-transferred models. By evaluating parameters from a feature perspective of each layer and utilizing clustering metrics to assess current parameters based on clustering phenomena in low-dimensional space obtained through t-SNE, SLS facilitates informed pruning decisions. Our study reveals that layer-wise pruning, with a focus on storing pruning indices, addresses storage volume concerns. Notably, mainstream Layer-wise pruning methods may not be suitable for assessing layer importance in PETL-transferred models, where the majority of parameters are pre-trained and have limited relevance to downstream datasets. Comparative analysis against state-of-the-art PETL methods demonstrates that the pruned model achieved a notable balance between model throughput and accuracy. Moreover, SLS effectively reduces storage overhead arising from varying pruned structures while enhancing the accuracy and speed of pruned models compared to conventional pruning methods.
Related papers
- LESA: Learnable LLM Layer Scaling-Up [57.0510934286449]
Training Large Language Models (LLMs) from scratch requires immense computational resources, making it prohibitively expensive.
Model scaling-up offers a promising solution by leveraging the parameters of smaller models to create larger ones.
We propose textbfLESA, a novel learnable method for depth scaling-up.
arXiv Detail & Related papers (2025-02-19T14:58:48Z) - DReSS: Data-driven Regularized Structured Streamlining for Large Language Models [30.47317140878219]
Large language models (LLMs) have achieved significant progress across various domains, but their increasing scale results in high computational and memory costs.
We propose a novel paradigm that first applies regularization, then prunes, and finally finetunes.
By leveraging a small amount of data to regularize the components to be pruned, DReSS explicitly transfers the important information to the remaining parts of the model in advance.
arXiv Detail & Related papers (2025-01-29T14:28:11Z) - LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Efficient Fine Tuning (PEFT) method.
We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation.
Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z) - SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models.
We propose a novel model fine-tuning method to make full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Rethinking Efficient Tuning Methods from a Unified Perspective [34.67645496324432]
We revisit the design paradigm of PETL and derive a unified framework U-Tuning for parameter-efficient transfer learning.
The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning.
arXiv Detail & Related papers (2023-03-01T17:38:03Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.