Related papers: Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for Hyperparameter Recommendation

URL: http://arxiv.org/abs/2106.09179v1
Date: Thu, 17 Jun 2021 00:01:18 GMT
Title: Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for Hyperparameter Recommendation
Authors: Yuxin Xiao, Eric P. Xing, Willie Neiswanger
Abstract summary: We propose amortized auto-tuning (AT2) to speed up the tuning of machine learning models. AT2 is the best instantiation to emerge from a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework.
Score: 83.85021205445662
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the surge in the number of hyperparameters and training times of modern machine learning models, hyperparameter tuning is becoming increasingly expensive. Although methods have been proposed to speed up tuning via knowledge transfer, they typically require the final performance of hyperparameters and do not focus on low-fidelity information. Nevertheless, this common practice is suboptimal and can incur an unnecessary use of resources. It is more cost-efficient to instead leverage the low-fidelity tuning observations to measure inter-task similarity and transfer knowledge from existing to new tasks accordingly. However, performing multi-fidelity tuning comes with its own challenges in the transfer setting: the noise in the additional observations and the need for performance forecasting. Therefore, we conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to the best instantiation--amortized auto-tuning (AT2). We further present an offline-computed 27-task hyperparameter recommendation (HyperRec) database to serve the community. Extensive experiments on HyperRec and other real-world databases illustrate the effectiveness of our AT2 method.

Related papers

High-Rank Structured Modulation for Parameter-Efficient Fine-Tuning [57.85676271833619]
Low-rank Adaptation (LoRA) uses a low-rank update method to simulate full parameter fine-tuning. We present SMoA, a high-rank Structured MOdulation Adapter that uses fewer trainable parameters while maintaining a higher rank.
arXiv Detail & Related papers (2026-01-12T13:06:17Z)
Relation-Aware Bayesian Optimization of DBMS Configurations Guided by Affinity Scores [2.474203056060563]
Database Management Systems (DBMSs) are fundamental for managing large-scale and heterogeneous data, and their performance is critically influenced by configuration parameters. Recent research has focused on automated configuration optimization using machine learning; however, existing approaches still exhibit several key limitations. We propose RelTune, a novel framework that represents parameter dependencies as a graph and learns GNN-based latent embeddings that encode performance-relevant semantics.
arXiv Detail & Related papers (2025-10-31T03:46:42Z)
Tune My Adam, Please! [42.01711296068661]
We propose Adam-PFN, a new surrogate model for Freeze-thaw BO of Adam's hyperparameters, pre-trained on learning curves from TaskSet. Our approach improves both learning curve augmentation and hyperparameter optimization on TaskSet evaluation tasks, with strong performance on out-of-distribution tasks.
arXiv Detail & Related papers (2025-08-27T09:57:45Z)
Interim Report on Human-Guided Adaptive Hyperparameter Optimization with Multi-Fidelity Sprints [0.0]
This case study applies a phased hyperparameter optimization process to compare multitask natural language model variants. We employ short Bayesian optimization sessions that leverage multi-fidelity, hyperparameter space pruning, progressive halving, and a degree of human guidance. We demonstrate our method on a collection of variants of the 2021 Joint Entity and Relation Extraction model proposed by Eberts and Ulges.
arXiv Detail & Related papers (2025-05-14T20:38:44Z)
Towards hyperparameter-free optimization with differential privacy [9.193537596304669]
Differential privacy (DP) is a privacy-preserving paradigm that protects the training data when training deep learning models. In this work, we adapt the automatic learning rate schedule to DP optimization for any model and achieve state-of-the-art DP performance on various language and vision tasks.
arXiv Detail & Related papers (2025-03-02T02:59:52Z)
Dynamic Noise Preference Optimization for LLM Self-Improvement via Synthetic Data [51.62162460809116]
We introduce Dynamic Noise Preference Optimization (DNPO) to ensure consistent improvements across iterations. In experiments with Zephyr-7B, DNPO consistently outperforms existing methods, showing an average performance boost of 2.6%. DNPO shows a significant improvement in model-generated data quality, with a 29.4% win-loss rate gap compared to the baseline in GPT-4 evaluations.
arXiv Detail & Related papers (2025-02-08T01:20:09Z)
Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce utility, a function predefined by each user that describes the trade-off between cost and performance of BO. We validate our algorithm on various LC datasets and find that it outperforms all the previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z)
Trajectory-Based Multi-Objective Hyperparameter Optimization for Model Retraining [8.598456741786801]
We present a novel trajectory-based multi-objective Bayesian optimization algorithm. Our algorithm outperforms state-of-the-art multi-objective optimizers in both locating better trade-offs and tuning efficiency.
arXiv Detail & Related papers (2024-05-24T07:43:45Z)
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters. This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models [55.14405248920852]
We conduct experiments with prefix tuning, prompt tuning, and adapter tuning on different language models and bias types to evaluate their debiasing performance. We find that the parameter-efficient methods are effective in mitigating gender bias, where adapter tuning is consistently the most effective. We also find that prompt tuning is more suitable for GPT-2 than BERT, and that the methods are less effective when it comes to racial and religious bias.
arXiv Detail & Related papers (2023-06-06T23:56:18Z)
A Framework for History-Aware Hyperparameter Optimisation in Reinforcement Learning [8.659973888018781]
A Reinforcement Learning (RL) system depends on a set of initial conditions that affect the system's performance. We propose a framework based on integrating complex event processing and temporal models, to alleviate these trade-offs. We tested the proposed approach in a 5G mobile communications case study that uses DQN, a variant of RL, for its decision-making.
arXiv Detail & Related papers (2023-03-09T11:30:40Z)
Hyper-Parameter Auto-Tuning for Sparse Bayesian Learning [72.83293818245978]
We design and learn a neural network (NN)-based auto-tuner for hyperparameter tuning in sparse Bayesian learning. We show that considerable improvement in convergence rate and recovery performance can be achieved.
arXiv Detail & Related papers (2022-11-09T12:34:59Z)
Improving Multi-fidelity Optimization with a Recurring Learning Rate for Hyperparameter Tuning [7.591442522626255]
We propose Multi-fidelity Optimization with a Recurring Learning rate (MORL). MORL incorporates CNNs' optimization process into multi-fidelity optimization. It alleviates the slow-starter problem and achieves a more precise low-fidelity approximation.
arXiv Detail & Related papers (2022-09-26T08:16:31Z)
AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning. We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling [34.67740033646052]
By learning a parameterization of importance sampling (IS) that trades off evaluation complexity and quality, we improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures.
arXiv Detail & Related papers (2020-02-23T15:52:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.