Related papers: Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization

Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization

URL: http://arxiv.org/abs/2007.09953v1
Date: Mon, 20 Jul 2020 09:03:02 GMT
Title: Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization
Authors: Yang Yang, Ke Deng, Michael Zhu
Abstract summary: In this paper, we develop an effective approach to reducing the total amount of required training time for Hyperparameter Optimization. We propose a truncated additive Gaussian process model to calibrate approximate performance measurements generated by light training. Based on the model, a sequential model-based algorithm is developed to generate the performance profile of the configuration space as well as find optimal ones.
Score: 12.92634461859467
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hyperparameters play a critical role in the performances of many machine learning methods. Determining their best settings or Hyperparameter Optimization (HPO) faces difficulties presented by the large number of hyperparameters as well as the excessive training time. In this paper, we develop an effective approach to reducing the total amount of required training time for HPO. In the initialization, the nested Latin hypercube design is used to select hyperparameter configurations for two types of training, which are, respectively, heavy training and light training. We propose a truncated additive Gaussian process model to calibrate approximate performance measurements generated by light training, using accurate performance measurements generated by heavy training. Based on the model, a sequential model-based algorithm is developed to generate the performance profile of the configuration space as well as find optimal ones. Our proposed approach demonstrates competitive performance when applied to optimize synthetic examples, support vector machines, fully connected networks and convolutional neural networks.

Related papers

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning [50.53705050673944]
We propose ULTHO, an ultra-lightweight yet powerful framework for fast HPO in deep RL within single runs. Specifically, we formulate the HPO process as a multi-armed bandit with clustered arms (MABC) and link it directly to long-term return optimization. We test ULTHO on benchmarks including ALE, Procgen, MiniGrid, and PyBullet.
arXiv Detail & Related papers (2025-03-08T07:03:43Z)
Optimization Hyper-parameter Laws for Large Language Models [56.322914260197734]
We present Opt-Laws, a framework that captures the relationship between hyper- parameters and training outcomes. Our validation across diverse model sizes and data scales demonstrates Opt-Laws' ability to accurately predict training loss. This approach significantly reduces computational costs while enhancing overall model performance.
arXiv Detail & Related papers (2024-09-07T09:37:19Z)
Model Performance Prediction for Hyperparameter Optimization of Deep Learning Models Using High Performance Computing and Quantum Annealing [0.0]
We show that integrating model performance prediction with early stopping methods holds great potential to speed up the HPO process of deep learning models. We propose a novel algorithm called Swift-Hyperband that can use either classical or quantum support vector regression for performance prediction.
arXiv Detail & Related papers (2023-11-29T10:32:40Z)
Deep Ranking Ensembles for Hyperparameter Optimization [9.453554184019108]
We present a novel method that meta-learns neural network surrogates optimized for ranking the configurations' performances while modeling their uncertainty via ensembling. In a large-scale experimental protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks, we demonstrate that our method achieves new state-of-the-art results in HPO.
arXiv Detail & Related papers (2023-03-27T13:52:40Z)
Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training [52.39882976848064]
We design a Generalized Krasnoselskii-Mann (GKM) scheme based on fixed-point iterations as our fundamental ODL module. Under the GKM scheme, a Bilevel Meta Optimization (BMO) algorithmic framework is constructed to solve the optimal training and hyper-training variables together.
arXiv Detail & Related papers (2022-06-16T01:50:25Z)
Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction. Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper- parameter tuning. We show that using gradient-based data subsets for hyper- parameter tuning achieves significantly faster turnaround times and speedups of 3$times$-30$times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO method and explore how the low-rank factorization of intermediate layers of a convolutional network can be used to define an analytical response surface. We quantify how this surface behaves as a surrogate to model performance and can be solved using a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z)
Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in neural networks (RNNs) It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously. This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z)
Bayesian Optimization for Selecting Efficient Machine Learning Models [53.202224677485525]
We present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency. Experiments on model selection for recommendation tasks indicate models selected this way significantly improves model training efficiency.
arXiv Detail & Related papers (2020-08-02T02:56:30Z)
Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling [34.67740033646052]
We improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures. By learning a parameterization of IS that trades-off evaluation complexity and quality, we improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures.
arXiv Detail & Related papers (2020-02-23T15:52:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.