Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation
- URL: http://arxiv.org/abs/2106.09179v1
- Date: Thu, 17 Jun 2021 00:01:18 GMT
- Title: Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation
- Authors: Yuxin Xiao, Eric P. Xing, Willie Neiswanger
- Abstract summary: We propose amortized auto-tuning (AT2), an instantiation of the multi-task multi-fidelity Bayesian optimization framework, to speed up the tuning of machine learning models.
We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to the best instantiation: amortized auto-tuning (AT2).
- Score: 83.85021205445662
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the surge in the number of hyperparameters and training times of modern
machine learning models, hyperparameter tuning is becoming increasingly
expensive. Although methods have been proposed to speed up tuning via knowledge
transfer, they typically require the final performance of hyperparameters and
do not focus on low-fidelity information. Nevertheless, this common practice is
suboptimal and can incur an unnecessary use of resources. It is more
cost-efficient to instead leverage the low-fidelity tuning observations to
measure inter-task similarity and transfer knowledge from existing to new tasks
accordingly. However, performing multi-fidelity tuning comes with its own
challenges in the transfer setting: the noise in the additional observations
and the need for performance forecasting. Therefore, we conduct a thorough
analysis of the multi-task multi-fidelity Bayesian optimization framework,
which leads to the best instantiation--amortized auto-tuning (AT2). We further
present an offline-computed 27-task hyperparameter recommendation (HyperRec)
database to serve the community. Extensive experiments on HyperRec and other
real-world databases illustrate the effectiveness of our AT2 method.
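The core idea in the abstract, using cheap low-fidelity observations to judge how similar a new task is to existing ones before transferring knowledge, can be illustrated with a toy sketch. This is not the AT2 method, which is built on multi-task multi-fidelity Bayesian optimization; here a plain Spearman rank correlation stands in for the similarity measure and a similarity-weighted average stands in for the surrogate model. All names (`ranks`, `spearman`, `recommend`) and all scores are hypothetical and invented for illustration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation of two equal-length score lists."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant list carries no ranking information
    return cov / (var_a * var_b) ** 0.5


def recommend(new_low, old_low, old_final):
    """Pick a configuration index for a new task.

    new_low:   low-fidelity scores of the new task on shared configurations
    old_low:   task name -> low-fidelity scores on the same configurations
    old_final: task name -> final scores on the same configurations
    """
    # Similarity from low-fidelity data only; negative correlations are ignored.
    weights = {t: max(0.0, spearman(new_low, s)) for t, s in old_low.items()}
    total = sum(weights.values())
    n = len(new_low)
    if total == 0:
        predicted = list(new_low)  # no similar task: fall back to own scores
    else:
        predicted = [
            sum(weights[t] * old_final[t][i] for t in weights) / total
            for i in range(n)
        ]
    best = max(range(n), key=lambda i: predicted[i])
    return best, weights


# Made-up scores for four shared hyperparameter configurations.
new_low = [0.50, 0.62, 0.58, 0.40]
old_low = {
    "task_a": [0.48, 0.60, 0.55, 0.41],
    "task_b": [0.70, 0.30, 0.45, 0.65],
}
old_final = {
    "task_a": [0.71, 0.88, 0.80, 0.60],
    "task_b": [0.90, 0.50, 0.62, 0.85],
}
best, weights = recommend(new_low, old_low, old_final)
print(best, weights)
```

In this toy data, `task_a` ranks the configurations the same way as the new task while `task_b` ranks them almost in reverse, so only `task_a` contributes to the recommendation. A real system would also need to handle observation noise and forecast final performance from partial training runs, the two challenges the abstract names.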
Related papers
- Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce utility, which is a function predefined by each user and describes the trade-off between cost and performance of BO.
We validate our algorithm on various learning curve (LC) datasets and find that it outperforms all the previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z)
- Trajectory-Based Multi-Objective Hyperparameter Optimization for Model Retraining [8.598456741786801]
We present a novel trajectory-based multi-objective Bayesian optimization algorithm.
Our algorithm outperforms state-of-the-art multi-objective optimizers in both locating better trade-offs and tuning efficiency.
arXiv Detail & Related papers (2024-05-24T07:43:45Z)
- Federated Learning of Large Language Models with Parameter-Efficient
Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally requires updating a large number of parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
- An Empirical Analysis of Parameter-Efficient Methods for Debiasing
Pre-Trained Language Models [55.14405248920852]
We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance.
We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective.
We also find that prompt tuning is more suitable for GPT-2 than BERT, and that the methods are less effective when it comes to racial and religious bias.
arXiv Detail & Related papers (2023-06-06T23:56:18Z)
- A Framework for History-Aware Hyperparameter Optimisation in
Reinforcement Learning [8.659973888018781]
A Reinforcement Learning (RL) system depends on a set of initial conditions that affect the system's performance.
We propose a framework based on integrating complex event processing and temporal models, to alleviate these trade-offs.
We tested the proposed approach in a 5G mobile communications case study that uses DQN, a variant of RL, for its decision-making.
arXiv Detail & Related papers (2023-03-09T11:30:40Z)
- Hyper-Parameter Auto-Tuning for Sparse Bayesian Learning [72.83293818245978]
We design and learn a neural network (NN)-based auto-tuner for hyper-parameter tuning in sparse Bayesian learning.
We show that considerable improvement in convergence rate and recovery performance can be achieved.
arXiv Detail & Related papers (2022-11-09T12:34:59Z)
- Improving Multi-fidelity Optimization with a Recurring Learning Rate for
Hyperparameter Tuning [7.591442522626255]
We propose Multi-fidelity Optimization with a Recurring Learning rate (MORL).
MORL incorporates CNNs' optimization process into multi-fidelity optimization.
It alleviates the slow-starter problem and achieves a more precise low-fidelity approximation.
arXiv Detail & Related papers (2022-09-26T08:16:31Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Weighting Is Worth the Wait: Bayesian Optimization with Importance
Sampling [34.67740033646052]
By learning a parameterization of importance sampling (IS) that trades off evaluation complexity and quality, we improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures.
arXiv Detail & Related papers (2020-02-23T15:52:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.