Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH
Mask based Efficient Fine-tuning
- URL: http://arxiv.org/abs/2403.08484v1
- Date: Wed, 13 Mar 2024 12:50:23 GMT
- Title: Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH
Mask based Efficient Fine-tuning
- Authors: Ming Dong, Kang Xue, Bolong Zheng, Tingting He
- Abstract summary: We propose an IRD algorithm to search for the best sample-parameter pair setting for FISH Mask.
We demonstrate the effectiveness and rationality of the proposed strategy by conducting experiments on the GLUE benchmark.
- Score: 9.423534576254712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given the huge number of parameters in large language models (LLMs), tuning all parameters is very costly, so fine-tuning only specific parameters is more sensible. Most parameter-efficient fine-tuning (PEFT) methods concentrate on parameter selection strategies, such as additive, selective, and reparameterization-based methods. However, few methods consider the impact of data samples on parameter selection; the FISH Mask based method is one such approach. FISH Mask randomly chooses a subset of data samples and treats them equally during parameter selection, which makes it unable to dynamically select optimal parameters for changing data distributions. In this work, we adopt a data-oriented perspective and propose an IRD ($\mathrm{\underline I}$terative sample-parameter $\mathrm{\underline R}$ange $\mathrm{\underline D}$ecreasing) algorithm to search for the best sample-parameter pair setting for FISH Mask. In each iteration, by searching for the set of samples and parameters with larger Fisher information, IRD can find a better sample-parameter pair at most scales. We demonstrate the effectiveness and rationality of the proposed strategy through experiments on the GLUE benchmark. Experimental results show that our strategy optimizes the parameter selection and achieves preferable performance.
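The abstract describes two ingredients but gives no pseudocode: scoring parameters by Fisher information, as in FISH Mask, and an IRD-style loop that iteratively decreases the sample and parameter search range, keeping the part with larger Fisher information. The sketch below is a minimal, hedged illustration of that idea in PyTorch; every name (fisher_scores, fish_mask, ird_search), the halving schedule, and the keep ratio are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): FISH-Mask-style parameter scoring by
# empirical Fisher information, plus an IRD-style iterative range-decreasing search.
import torch


def fisher_scores(model, loss_fn, samples):
    """Approximate the diagonal empirical Fisher information: mean squared gradient per parameter."""
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    for x, y in samples:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.detach() ** 2
    return {n: s / max(len(samples), 1) for n, s in scores.items()}


def fish_mask(scores, keep_ratio=0.005):
    """Keep the top keep_ratio fraction of parameters by Fisher score (the FISH Mask idea)."""
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {n: s >= threshold for n, s in scores.items()}


def ird_search(model, loss_fn, samples, keep_ratio=0.005, min_samples=8):
    """IRD-style loop (illustrative): iteratively halve the candidate sample set,
    keeping samples whose gradients carry larger Fisher information inside the
    current mask, then re-select parameters on the surviving samples."""
    current = list(samples)
    mask = fish_mask(fisher_scores(model, loss_fn, current), keep_ratio)
    while len(current) > min_samples:
        contrib = []
        for x, y in current:
            model.zero_grad()
            loss_fn(model(x), y).backward()
            # Fisher information this sample contributes within the current mask.
            total = sum(
                float((p.grad.detach() ** 2)[mask[n]].sum())
                for n, p in model.named_parameters()
                if p.grad is not None
            )
            contrib.append(total)
        order = sorted(range(len(current)), key=lambda i: contrib[i], reverse=True)
        current = [current[i] for i in order[: len(current) // 2]]  # decrease the sample range
        mask = fish_mask(fisher_scores(model, loss_fn, current), keep_ratio)  # re-select parameters
    return current, mask
```

In practice, the halving schedule, the mask sparsity, and whether samples or parameters are shrunk first are exactly the design choices the paper's GLUE experiments are meant to evaluate; the sketch fixes them arbitrarily.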
Related papers
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of the studied optimizers, parameterizations, learning rates, and model sizes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z) - Efficient and Robust Bayesian Selection of Hyperparameters in Dimension
Reduction for Visualization [0.0]
We introduce an efficient and robust auto-tuning framework for hyperparameter selection in dimension reduction (DR) algorithms.
Our approach enables efficient hyperparameter selection with multi-objective trade-offs and allows us to perform data-driven analysis.
We evaluate our results on various synthetic and real-world datasets using multiple quality metrics.
arXiv Detail & Related papers (2023-06-01T05:36:22Z) - Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z) - On the Effectiveness of Parameter-Efficient Fine-Tuning [79.6302606855302]
Currently, many research works propose to only fine-tune a small portion of the parameters while keeping most of the parameters shared across different tasks.
We show that all of the methods are actually sparse fine-tuned models and conduct a novel theoretical analysis of them.
Despite the effectiveness of sparsity grounded by our theory, how to choose the tunable parameters remains an open problem.
arXiv Detail & Related papers (2022-11-28T17:41:48Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Hyperparameter Selection for Subsampling Bootstraps [0.0]
A subsampling method like BLB serves as a powerful tool for assessing the quality of estimators for massive data.
The performance of subsampling methods is highly influenced by the selection of tuning parameters.
We develop a hyperparameter selection methodology, which can be used to select tuning parameters for subsampling methods.
Both simulation studies and real data analysis demonstrate the advantage of our method.
arXiv Detail & Related papers (2020-06-02T17:10:45Z) - PHS: A Toolbox for Parallel Hyperparameter Search [2.0305676256390934]
We introduce an open-source Python framework named PHS (Parallel Hyperparameter Search).
It enables hyperparameter optimization on numerous compute instances of any arbitrary Python function.
arXiv Detail & Related papers (2020-02-26T12:17:54Z) - Online Parameter Estimation for Safety-Critical Systems with Gaussian
Processes [6.122161391301866]
We present a Bayesian optimization framework based on Gaussian processes (GPs) for online parameter estimation.
It uses an efficient search strategy over a response surface in the parameter space for finding the global optimum with minimal function evaluations (a minimal illustrative sketch follows this list).
We demonstrate our technique on an actuated planar pendulum and safety-critical quadrotor in simulation with changing parameters.
arXiv Detail & Related papers (2020-02-18T20:38:00Z)
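The last entry above describes a Gaussian-process-based Bayesian optimization loop that searches a response surface with few function evaluations. As a hedged illustration only, not that paper's implementation, the sketch below fits a scikit-learn GaussianProcessRegressor to observed evaluations and picks the next parameter by expected improvement; the toy objective simulate_response and all other names are hypothetical.

```python
# Illustrative sketch of GP-based Bayesian optimization for parameter estimation:
# fit a Gaussian process to observed responses and pick the next parameter value
# by expected improvement (maximization; minimization is the mirror case).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel


def simulate_response(theta):
    """Hypothetical stand-in for the measured response of the system at parameter theta."""
    return -(theta - 1.3) ** 2 + 0.05 * np.random.randn()


def expected_improvement(gp, candidates, y_best, xi=0.01):
    """Expected improvement of each candidate over the best observed response."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - y_best - xi) / sigma
    return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)


def gp_parameter_search(bounds=(-2.0, 4.0), n_init=4, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(*bounds, size=(n_init, 1))           # initial random evaluations
    y = np.array([simulate_response(x[0]) for x in X])
    gp = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(length_scale=1.0),
                                  normalize_y=True)
    candidates = np.linspace(*bounds, 400).reshape(-1, 1)
    for _ in range(n_iter):
        gp.fit(X, y)
        ei = expected_improvement(gp, candidates, y.max())
        x_next = candidates[np.argmax(ei)]                # next query point
        X = np.vstack([X, x_next])
        y = np.append(y, simulate_response(x_next[0]))
    return X[np.argmax(y), 0], y.max()


if __name__ == "__main__":
    theta_hat, best = gp_parameter_search()
    print(f"estimated parameter: {theta_hat:.3f}, best response: {best:.3f}")
```

Expected improvement is one common acquisition function; the paper itself may use a different acquisition or an online update scheme.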