Sparse Structure Search for Parameter-Efficient Tuning
- URL: http://arxiv.org/abs/2206.07382v1
- Date: Wed, 15 Jun 2022 08:45:21 GMT
- Title: Sparse Structure Search for Parameter-Efficient Tuning
- Authors: Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan
Liu, Maosong Sun
- Abstract summary: We show that S$^3$PET surpasses manual and random structures with fewer trainable parameters.
The searched structures preserve more than 99% fine-tuning performance with 0.01% trainable parameters.
- Score: 85.49094523664428
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapting large pre-trained models (PTMs) through fine-tuning imposes
prohibitive computational and storage burdens. Recent studies of
parameter-efficient tuning (PET) find that only optimizing a small portion of
parameters conditioned on PTMs could yield on-par performance compared to
conventional fine-tuning. Generally, PET methods exquisitely design
parameter-efficient modules (PET modules) which could be applied to arbitrary
fine-grained positions inside PTMs. However, the effectiveness of these
fine-grained positions largely relies on sophisticated manual designation,
thereby usually producing sub-optimal results. In contrast to the manual
designation, we explore constructing PET modules in an automatic manner. We
automatically \textbf{S}earch for the \textbf{S}parse \textbf{S}tructure of
\textbf{P}arameter-\textbf{E}fficient \textbf{T}uning (S$^3$PET). Based on a
unified framework of various PET methods, S$^3$PET conducts a differentiable
PET structure search through bi-level optimization and proposes a shifted
global sigmoid method to explicitly control the number of trainable parameters.
Extensive experiments show that S$^3$PET surpasses manual and random structures
with fewer trainable parameters. The searched structures preserve more than 99\%
of fine-tuning performance with 0.01\% trainable parameters. Moreover, the
advantage of S$^3$PET is amplified under extremely low trainable parameter
budgets (0.0009\%$\sim$0.01\%). The searched structures are transferable and
explainable, providing suggestions and guidance for the future design of PET
methods.
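The abstract names two mechanisms, a differentiable bi-level structure search and a shifted global sigmoid that explicitly caps the number of trainable parameters, without spelling out either. Below is a minimal PyTorch sketch of one plausible reading: the gate shift is found by bisection so that the gates sum to a target budget, and structure logits and module weights are updated on separate data splits, DARTS-style. The `shifted_global_sigmoid` routine, the `GatedAdapter` class, the bisection procedure, and the budget value are illustrative assumptions, not the paper's actual formulation.

```python
import torch


def shifted_global_sigmoid(logits: torch.Tensor, budget: float,
                           iters: int = 50) -> torch.Tensor:
    """Gates sigmoid(logits - shift); the shared shift is chosen by bisection
    so that the gates sum to roughly `budget` active modules (an assumption,
    not necessarily the paper's exact scheme)."""
    lo, hi = -20.0, 20.0
    with torch.no_grad():                        # the shift acts as a constant
        for _ in range(iters):
            mid = (lo + hi) / 2
            if torch.sigmoid(logits - mid).sum().item() > budget:
                lo = mid                         # gates too open: raise the shift
            else:
                hi = mid
    return torch.sigmoid(logits - (lo + hi) / 2)


class GatedAdapter(torch.nn.Module):
    """One candidate PET module: a zero-initialized low-rank bottleneck whose
    contribution is scaled by its gate (a stand-in for any unified PET module)."""
    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        self.down = torch.nn.Linear(dim, rank, bias=False)
        self.up = torch.nn.Linear(rank, dim, bias=False)
        torch.nn.init.zeros_(self.up.weight)     # starts as an identity mapping

    def forward(self, h: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
        return h + gate * self.up(torch.relu(self.down(h)))


# Toy bi-level search over n candidate insertion positions, keeping ~2 of them.
torch.manual_seed(0)
dim, n, budget = 16, 8, 2.0
adapters = torch.nn.ModuleList([GatedAdapter(dim) for _ in range(n)])
arch_logits = torch.zeros(n, requires_grad=True)             # structure parameters
w_opt = torch.optim.Adam(adapters.parameters(), lr=1e-2)     # inner level: weights
a_opt = torch.optim.Adam([arch_logits], lr=1e-1)             # outer level: structure

x_train, y_train = torch.randn(32, dim), torch.randn(32, dim)
x_val, y_val = torch.randn(32, dim), torch.randn(32, dim)

def loss_on(x, y):
    gates = shifted_global_sigmoid(arch_logits, budget)
    for gate, adapter in zip(gates, adapters):
        x = adapter(x, gate)
    return torch.nn.functional.mse_loss(x, y)

for step in range(200):
    w_opt.zero_grad(); loss_on(x_train, y_train).backward(); w_opt.step()  # inner step
    a_opt.zero_grad(); loss_on(x_val, y_val).backward(); a_opt.step()      # outer step

# Gates sum to ~budget; the larger ones mark positions the search would keep.
print(shifted_global_sigmoid(arch_logits, budget).detach())
```

The point of the sketch is only the division of labor: module weights follow the training loss at the inner level, while the structure logits, squeezed through a globally shifted sigmoid, follow the validation loss at the outer level under a hard parameter budget.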
Related papers
- ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections [59.839926875976225]
We propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections.
In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters.
arXiv Detail & Related papers (2024-05-30T17:26:02Z) - ConPET: Continual Parameter-Efficient Tuning for Large Language Models [65.48107393731861]
Continual learning requires continual adaptation of models to newly emerging tasks.
We propose Continual Parameter-Efficient Tuning (ConPET), a generalizable paradigm for continual task adaptation of large language models.
arXiv Detail & Related papers (2023-09-26T08:52:04Z) - Exploring the Impact of Model Scaling on Parameter-Efficient Tuning [100.61202305296275]
Parameter-efficient tuning (PET) methods can effectively drive extremely large pre-trained language models (PLMs).
In small PLMs, there are usually noticeable performance differences among PET methods.
We introduce a more flexible PET method called Arbitrary PET (APET).
arXiv Detail & Related papers (2023-06-04T10:10:54Z) - Stochastic Bridges as Effective Regularizers for Parameter-Efficient
Tuning [98.27893964124829]
We propose regularizing PETs by using stochastic bridges as the regularizers (running costs) for their intermediate states.
In view of the great potential and capacity, we believe more sophisticated regularizers can be designed for PETs.
arXiv Detail & Related papers (2023-05-28T09:22:44Z) - Parameter-Efficient Fine-Tuning without Introducing New Latency [7.631596468553607]
We introduce a novel adapter technique that directly applies the adapter to pre-trained parameters instead of the hidden representation.
Our proposed method attains a new state-of-the-art outcome in terms of both performance and storage efficiency, storing only 0.03% of the parameters of full fine-tuning.
arXiv Detail & Related papers (2023-05-26T08:44:42Z) - Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.