Partial Fine-Tuning: A Successor to Full Fine-Tuning for Vision
Transformers
- URL: http://arxiv.org/abs/2312.15681v1
- Date: Mon, 25 Dec 2023 10:11:34 GMT
- Title: Partial Fine-Tuning: A Successor to Full Fine-Tuning for Vision
Transformers
- Authors: Peng Ye, Yongqi Huang, Chongjun Tu, Minglei Li, Tao Chen, Tong He,
Wanli Ouyang
- Abstract summary: We show that Partial Fine-Tuning can be an innovative and promising direction capable of concurrently enhancing both efficiency and accuracy.
We propose a novel fine-tuned angle metric to guide the selection of appropriate layers for partial fine-tuning.
Comprehensive experiments on a wide range of datasets and models validate the great potential of partial fine-tuning.
- Score: 50.23439411530435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning pre-trained foundation models has gained significant popularity
in various research fields. Existing methods for fine-tuning can be roughly
divided into two categories, namely Parameter-Efficient Fine-Tuning and
High-Performance Fine-Tuning. The former aims at improving efficiency, while
the latter focuses on enhancing performance. Beyond these methods, we
demonstrate that Partial Fine-Tuning can be an innovative and promising
direction capable of concurrently enhancing both efficiency and accuracy. We
first validate eight manually-defined partial fine-tuning strategies across a
variety of datasets and vision transformer architectures, and find that some
partial fine-tuning strategies (e.g., ffn only or attention only) can achieve
better performance with fewer tuned parameters than full fine-tuning, and
selecting appropriate layers is critical to partial fine-tuning. Thus, we
propose a novel fine-tuned angle metric to guide the selection of appropriate
layers for partial fine-tuning, making it flexible enough to adapt to various
scenarios for more practical partial fine-tuning. Additionally, we show that
partial fine-tuning can serve as a new dimension for Model Soups, improving
both the model performance and generalization with fewer tuned parameters.
Comprehensive experiments on a wide range of datasets and models validate the
great potential of partial fine-tuning.
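Concretely, a partial fine-tuning strategy freezes most of the network and updates only a chosen subset of sub-layers. Below is a minimal sketch of the "ffn only" strategy in PyTorch, assuming a timm-style VisionTransformer whose blocks expose `.attn` and `.mlp` submodules; the model name and learning rate are illustrative, not the paper's settings.

```python
import timm
import torch

model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=10)

# Freeze everything first.
for p in model.parameters():
    p.requires_grad = False

# Unfreeze only the FFN (MLP) sub-layers; swap .mlp for .attn to get the
# "attention only" strategy instead.
for block in model.blocks:
    for p in block.mlp.parameters():
        p.requires_grad = True

# The classification head is typically also tuned for a new task.
for p in model.head.parameters():
    p.requires_grad = True

# Only the unfrozen parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```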
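The abstract does not spell out how the fine-tuned angle metric is computed. A plausible reading, sketched below, measures the angle between each layer's pre-trained and fine-tuned weights and ranks layers by how far they rotated; both the per-layer angle definition and the "largest angle first" selection rule are assumptions for illustration, not the paper's exact procedure.

```python
import torch

def finetuned_angle(w_pre: torch.Tensor, w_ft: torch.Tensor) -> float:
    """Angle in degrees between flattened pre- and post-fine-tuning weights."""
    cos = torch.nn.functional.cosine_similarity(
        w_pre.flatten(), w_ft.flatten(), dim=0
    )
    return torch.rad2deg(torch.arccos(cos.clamp(-1.0, 1.0))).item()

def rank_layers_by_angle(pretrained_sd: dict, finetuned_sd: dict) -> list:
    """Rank weight tensors by how far they rotated during a trial fine-tuning run."""
    angles = {
        name: finetuned_angle(w, finetuned_sd[name])
        for name, w in pretrained_sd.items()
        if w.ndim >= 2  # skip biases and norm parameters
    }
    # Assumed selection rule for illustration: tune the layers that moved most.
    return sorted(angles, key=angles.get, reverse=True)
```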
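The abstract also positions partial fine-tuning as a new dimension for Model Soups, which average the weights of several fine-tuned models. A minimal uniform-soup sketch, assuming all candidate models were (partially) fine-tuned from the same pre-trained checkpoint so their state dicts are compatible:

```python
import torch

def uniform_soup(state_dicts: list) -> dict:
    """Element-wise average of compatible state dicts (a 'uniform soup')."""
    soup = {}
    for key in state_dicts[0]:
        soup[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return soup

# Usage: load several partially fine-tuned checkpoints and average them.
# soup_sd = uniform_soup([torch.load(p) for p in checkpoint_paths])
# model.load_state_dict(soup_sd)
```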
Related papers
- Derivative-Free Optimization for Low-Rank Adaptation in Large Language Models [4.926283917321645]
We propose a derivative-free optimization method that avoids gradient computation and shows improved robustness in few-shot settings.
Our method achieves substantial improvements and clear advantages in memory usage and convergence speed over existing gradient-based parameter-efficient tuning and derivative-free optimization methods in few-shot settings.
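The summary does not name the specific derivative-free optimizer used. The sketch below substitutes a simple greedy random search over LoRA factors as a stand-in to illustrate gradient-free adaptation; the rank, step count, and noise scale are arbitrary.

```python
import torch

@torch.no_grad()
def random_search_lora(loss_fn, d_out: int, d_in: int,
                       rank: int = 4, steps: int = 200, sigma: float = 0.01):
    """Greedy random search over LoRA factors A, B (delta W = B @ A); no gradients."""
    A = torch.zeros(rank, d_in)
    B = torch.zeros(d_out, rank)
    best = loss_fn(B @ A)
    for _ in range(steps):
        cand_A = A + sigma * torch.randn_like(A)
        cand_B = B + sigma * torch.randn_like(B)
        loss = loss_fn(cand_B @ cand_A)
        if loss < best:  # keep a perturbation only if it lowers the loss
            A, B, best = cand_A, cand_B, loss
    return A, B
```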
arXiv Detail & Related papers (2024-03-04T06:20:31Z)
- E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E^2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z)
- Visual Tuning [143.43997336384126]
Fine-tuning visual models has been widely shown to deliver promising performance on many downstream visual tasks.
Recent advances can achieve performance superior to fully tuning all of the pre-trained parameters.
This survey characterizes a large and thoughtful selection of recent works, providing a systematic and comprehensive overview of existing work and models.
arXiv Detail & Related papers (2023-05-10T11:26:36Z)
- Rethinking Efficient Tuning Methods from a Unified Perspective [34.67645496324432]
We revisit the design paradigm of parameter-efficient transfer learning (PETL) and derive a unified framework, U-Tuning.
The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning.
arXiv Detail & Related papers (2023-03-01T17:38:03Z)
- Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes [47.880781811936345]
We propose a novel framework for fine-tuning pretrained language models (LMs).
Our prototypical fine-tuning approach can automatically adjust the model capacity according to the number of data points and the model's inherent attributes.
arXiv Detail & Related papers (2022-11-24T14:38:08Z)
- Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning [126.84770886628833]
Existing finetuning methods either tune all parameters of the pretrained model (full finetuning) or only tune the last linear layer (linear probing).
We propose a new parameter-efficient finetuning method termed SSF, in which one only needs to Scale and Shift the deep Features extracted by a pre-trained model to match the performance of full finetuning.
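As the summary describes it, SSF's core operation is a learnable per-channel scale and shift applied to intermediate features while the backbone stays frozen. A minimal sketch of such a module follows; its placement within the network and its initialization are assumptions, not SSF's exact recipe.

```python
import torch
import torch.nn as nn

class ScaleShift(nn.Module):
    """Learnable per-channel affine map: y = gamma * x + beta."""
    def __init__(self, dim: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))   # identity scale at init
        self.beta = nn.Parameter(torch.zeros(dim))   # zero shift at init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., dim); broadcasting applies the same affine map per channel.
        return x * self.gamma + self.beta
```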
arXiv Detail & Related papers (2022-10-17T08:14:49Z)
- Visual Prompt Tuning [74.5309408185523]
This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision.
VPT introduces only a small number of trainable parameters (less than 1% of the model's parameters) in the input space while keeping the model backbone frozen.
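Mechanically, this means learning a few extra tokens that are concatenated with the patch embeddings while the transformer weights stay frozen. A minimal shallow-prompt sketch is below; the prompt count and embedding dimension are illustrative.

```python
import torch
import torch.nn as nn

class ShallowVisualPrompt(nn.Module):
    """Prepend learnable prompt tokens to the patch-embedding sequence."""
    def __init__(self, num_prompts: int = 10, dim: int = 768):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, dim) -> (batch, num_prompts + seq_len, dim)
        return torch.cat(
            [self.prompts.expand(tokens.size(0), -1, -1), tokens], dim=1
        )
```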
arXiv Detail & Related papers (2022-03-23T01:17:16Z)
- Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models [90.24999406296867]
In contrast with standard fine-tuning, delta tuning only fine-tunes a small portion of the model parameters while keeping the rest untouched.
Recent studies have demonstrated that a series of delta tuning methods with distinct tuned-parameter selection can achieve performance on a par with full-parameter fine-tuning.
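One of the simplest instances of this family, sketched below, updates only the bias terms in the spirit of BitFit; this is an illustration of what delta tuning covers, not a method the survey itself proposes.

```python
import torch

def enable_bias_only_tuning(model: torch.nn.Module) -> None:
    """Freeze all weights and leave only bias terms trainable (BitFit-style)."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")
```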
arXiv Detail & Related papers (2022-03-14T07:56:32Z)