Stochastic Bridges as Effective Regularizers for Parameter-Efficient
Tuning
- URL: http://arxiv.org/abs/2305.17670v1
- Date: Sun, 28 May 2023 09:22:44 GMT
- Title: Stochastic Bridges as Effective Regularizers for Parameter-Efficient
Tuning
- Authors: Weize Chen, Xu Han, Yankai Lin, Zhiyuan Liu, Maosong Sun, Jie Zhou
- Abstract summary: We propose regularized PETs that use stochastic bridges as regularizers (running costs) for the intermediate states.
In view of the great potential and capacity, we believe more sophisticated regularizers can be designed for PETs.
- Score: 98.27893964124829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-efficient tuning methods (PETs) have achieved promising results in
tuning large pre-trained language models (PLMs). By formalizing frozen PLMs and
additional tunable parameters as systems and controls respectively, PETs can be
theoretically grounded to optimal control and further viewed as optimizing the
terminal cost and running cost in the optimal control literature. Despite the
elegance of this theoretical grounding, in practice, existing PETs often ignore
the running cost and only optimize the terminal cost, i.e., focus on optimizing
the loss function of the output state, regardless of the running cost that
depends on the intermediate states. Since it is non-trivial to directly model
the intermediate states and design a running cost function, we propose to use
latent stochastic bridges to regularize the intermediate states and use the
regularization as the running cost of PETs. As the first work to propose
regularized PETs that use stochastic bridges as the regularizers (running
costs) for the intermediate states, we show the effectiveness and generality of
this regularization across different tasks, PLMs and PETs. In view of the great
potential and capacity, we believe more sophisticated regularizers can be
designed for PETs and better performance can be achieved in the future. The
code is released at
https://github.com/thunlp/stochastic-bridge-pet/tree/main.
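In the optimal-control view above, tuning minimizes a terminal cost on the output state plus a running cost over intermediate states; the paper supplies the running cost via a latent stochastic bridge. As a rough, hypothetical illustration only (not the paper's exact formulation), the PyTorch-style sketch below penalizes projected intermediate hidden states for deviating from a Brownian-bridge interpolation pinned at the first and last layers; the class name, latent dimension, clamping constant, and regularization weight are all assumptions.

import torch
import torch.nn as nn

class BridgeRunningCost(nn.Module):
    # Hypothetical sketch: score intermediate states against a Brownian-bridge
    # prior pinned at the first- and last-layer states in a small latent space.
    def __init__(self, hidden_dim, latent_dim=32):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, latent_dim, bias=False)

    def forward(self, hidden_states):
        # hidden_states: tensor of shape [batch, num_layers, hidden_dim]
        z = self.proj(hidden_states)                       # [B, L, d]
        L = z.size(1)
        t = torch.linspace(0.0, 1.0, L, device=z.device).view(1, L, 1)
        mean = (1 - t) * z[:, :1] + t * z[:, -1:]          # bridge mean between endpoints
        var = (t * (1 - t)).clamp(min=1e-4)                # bridge variance, clamped at the pinned endpoints
        return ((z - mean) ** 2 / var).mean()              # squared deviation as the running cost

# Usage sketch: combine with the usual task loss (the terminal cost).
# reg = BridgeRunningCost(hidden_dim=768)
# loss = task_loss + 0.1 * reg(stacked_intermediate_states)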
Related papers
- ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections [59.839926875976225]
We propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections.
In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters.
arXiv Detail & Related papers (2024-05-30T17:26:02Z) - ConPET: Continual Parameter-Efficient Tuning for Large Language Models [65.48107393731861]
Continual learning requires continual adaptation of models to newly emerging tasks.
We propose Continual Parameter-Efficient Tuning (ConPET), a generalizable paradigm for continual task adaptation of large language models.
arXiv Detail & Related papers (2023-09-26T08:52:04Z) - Exploring the Impact of Model Scaling on Parameter-Efficient Tuning [100.61202305296275]
Parameter-efficient tuning (PET) methods can effectively drive extremely large pre-trained language models (PLMs) by training only minimal parameters.
In small PLMs, there are usually noticeable performance differences among PET methods.
We introduce a more flexible PET method called Arbitrary PET (APET).
arXiv Detail & Related papers (2023-06-04T10:10:54Z) - Sparse Structure Search for Parameter-Efficient Tuning [85.49094523664428]
We show that S$^3$PET surpasses manual and random structures with fewer trainable parameters.
The searched structures preserve more than 99% of fine-tuning performance with 0.01% of trainable parameters.
arXiv Detail & Related papers (2022-06-15T08:45:21Z) - Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than
In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm where a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
arXiv Detail & Related papers (2022-05-11T17:10:41Z) - Revisiting Parameter-Efficient Tuning: Are We Really There Yet? [33.13293845589329]
PETuning methods claim to have achieved performance on par with or better than fine-tuning.
We take a step back and re-examine these claims by conducting the first comprehensive investigation into the training and evaluation of PETuning methods.
arXiv Detail & Related papers (2022-02-16T10:11:19Z) - A Nonmyopic Approach to Cost-Constrained Bayesian Optimization [10.078368988372247]
We formulate cost-constrained BO as a constrained Markov decision process (CMDP)
We develop an efficient rollout approximation to the optimal CMDP policy that takes both the cost and future iterations into account.
arXiv Detail & Related papers (2021-06-10T22:44:37Z) - Pareto-efficient Acquisition Functions for Cost-Aware Bayesian
Optimization [5.459427541271035]
We show how to perform cost-aware Bayesian optimization of black-box functions.
On 144 real-world black-box function optimization problems, our solution brings up to 50% speed-ups.
We also revisit the common choice of Gaussian process cost models, showing that simple, low-variance cost models predict training times effectively.
arXiv Detail & Related papers (2020-11-23T15:06:07Z)