Weighting Is Worth the Wait: Bayesian Optimization with Importance
Sampling
- URL: http://arxiv.org/abs/2002.09927v1
- Date: Sun, 23 Feb 2020 15:52:08 GMT
- Title: Weighting Is Worth the Wait: Bayesian Optimization with Importance
Sampling
- Authors: Setareh Ariafar, Zelda Mariet, Ehsan Elhamifar, Dana Brooks, Jennifer
Dy and Jasper Snoek
- Abstract summary: By learning a parameterization of IS that trades off evaluation complexity and quality, we improve upon state-of-the-art Bayesian optimization in both runtime and final validation error across a variety of datasets and complex neural architectures.
- Score: 34.67740033646052
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Many contemporary machine learning models require extensive tuning of
hyperparameters to perform well. A variety of methods, such as Bayesian
optimization, have been developed to automate and expedite this process.
However, tuning remains extremely costly as it typically requires repeatedly
fully training models. We propose to accelerate the Bayesian optimization
approach to hyperparameter tuning for neural networks by taking into account
the relative amount of information contributed by each training example. To do
so, we leverage importance sampling (IS); this significantly increases the
quality of the black-box function evaluations, but also their runtime, and so
must be done carefully. Casting hyperparameter search as a multi-task Bayesian
optimization problem over both hyperparameters and importance sampling design
achieves the best of both worlds: by learning a parameterization of IS that
trades off evaluation complexity and quality, we improve upon state-of-the-art
Bayesian optimization in both runtime and final validation error across a
variety of datasets and complex neural architectures.
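To make the joint search concrete, below is a minimal sketch (not the authors' implementation) of evaluating a hyperparameter configuration with importance-sampled training. The outer multi-task Bayesian optimization loop is replaced by random search for brevity, and the single knob `is_fraction`, the residual-based sampling scores, and the toy linear model are illustrative assumptions.

```python
# Minimal sketch of the joint search space: a model hyperparameter
# (learning_rate) plus an importance-sampling design knob (is_fraction).
# The outer loop only stands in for multi-task Bayesian optimization.
import numpy as np

rng = np.random.default_rng(0)

def importance_weighted_eval(learning_rate, is_fraction, X, y, steps=50):
    """Train a toy linear model on importance-sampled subsets and return
    a validation-style error. Examples are drawn with probability
    proportional to a cheap informativeness proxy (residual magnitude)
    and reweighted so the gradient estimate approximates the full-data one."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residual = X @ w - y
        scores = np.abs(residual) + 1e-8
        probs = scores / scores.sum()
        k = max(1, int(is_fraction * len(y)))      # evaluation cost ~ is_fraction
        idx = rng.choice(len(y), size=k, replace=False, p=probs)
        iw = 1.0 / (probs[idx] * len(y))           # importance weights
        grad = X[idx].T @ (iw * residual[idx]) / k
        w -= learning_rate * grad
    return float(np.mean((X @ w - y) ** 2))

# Toy data; candidates sample both the hyperparameter and the IS knob.
X = rng.normal(size=(500, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=500)
candidates = [(10 ** rng.uniform(-3, -1), rng.uniform(0.1, 1.0)) for _ in range(20)]
best = min(candidates, key=lambda c: importance_weighted_eval(c[0], c[1], X, y))
print("best (learning_rate, is_fraction):", best)
```

In the paper itself, the IS trade-off parameterization is learned jointly with the hyperparameters by the multi-task Bayesian optimizer; the random search above only stands in for that outer loop.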
Related papers
- Improving Multi-fidelity Optimization with a Recurring Learning Rate for
Hyperparameter Tuning [7.591442522626255]
We propose Multi-fidelity Optimization with a Recurring Learning rate (MORL).
MORL incorporates CNNs' optimization process into multi-fidelity optimization.
It alleviates the slow-starter problem and achieves a more precise low-fidelity approximation.
arXiv Detail & Related papers (2022-09-26T08:16:31Z) - Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times, with speedups of 3x to 30x (a generic subset-selection sketch appears after this list).
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Hyper-parameter optimization based on soft actor critic and hierarchical
mixture regularization [5.063728016437489]
We model the hyper-parameter optimization process as a Markov decision process and tackle it with reinforcement learning.
A novel hyper-parameter optimization method based on soft actor critic and hierarchical mixture regularization is proposed.
arXiv Detail & Related papers (2021-12-08T02:34:43Z) - Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation [83.85021205445662]
We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to its best instantiation: amortized auto-tuning (AT2).
We propose AT2 to speed up the tuning of machine learning models.
arXiv Detail & Related papers (2021-06-17T00:01:18Z) - Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of the wallclock time (a simplified hypergradient sketch appears after this list).
arXiv Detail & Related papers (2021-02-15T19:36:18Z) - Bayesian Optimization for Selecting Efficient Machine Learning Models [53.202224677485525]
We present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency.
Experiments on model selection for recommendation tasks indicate that models selected this way significantly improve model training efficiency.
arXiv Detail & Related papers (2020-08-02T02:56:30Z) - Multi-level Training and Bayesian Optimization for Economical
Hyperparameter Optimization [12.92634461859467]
In this paper, we develop an effective approach to reducing the total training time required for hyperparameter optimization.
We propose a truncated additive Gaussian process model to calibrate approximate performance measurements generated by light training.
Based on the model, a sequential model-based algorithm is developed to generate the performance profile of the configuration space as well as to find optimal configurations.
arXiv Detail & Related papers (2020-07-20T09:03:02Z) - Automatically Learning Compact Quality-aware Surrogates for Optimization
Problems [55.94450542785096]
Solving optimization problems with unknown parameters often requires learning a predictive model for those parameters and then solving the problem using the predicted values.
Recent work has shown that including the optimization problem as a layer in the model training pipeline results in predictions that lead to higher decision quality.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space in order to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z)
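For the gradient-based subset selection entry above (AUTOMATA), the following generic sketch illustrates greedy gradient matching: pick a small subset whose mean gradient tracks the full-data mean gradient, then run the hyperparameter search on that subset. It is an illustrative assumption, not the paper's algorithm; the per-example gradient matrix, budget, and greedy criterion are hypothetical.

```python
# Generic gradient-matching subset selection sketch (not the AUTOMATA algorithm):
# greedily add examples whose inclusion keeps the subset's mean gradient
# closest to the full-data mean gradient.
import numpy as np

def select_subset(per_example_grads: np.ndarray, budget: int) -> list[int]:
    """per_example_grads has shape (n_examples, n_params)."""
    full_mean = per_example_grads.mean(axis=0)
    chosen: list[int] = []
    running_sum = np.zeros_like(full_mean)
    for _ in range(budget):
        best_i, best_err = -1, np.inf
        for i in range(len(per_example_grads)):
            if i in chosen:
                continue
            cand_mean = (running_sum + per_example_grads[i]) / (len(chosen) + 1)
            err = float(np.linalg.norm(cand_mean - full_mean))
            if err < best_err:
                best_i, best_err = i, err
        chosen.append(best_i)
        running_sum += per_example_grads[best_i]
    return chosen

# Hyperparameter candidates would then be evaluated by training only on the
# selected subset, cutting per-trial cost roughly by n_examples / budget.
grads = np.random.default_rng(1).normal(size=(200, 8))
print(select_subset(grads, budget=10))
```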
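For the online hyperparameter optimization entry above, the sketch below shows the simpler, widely used hypergradient idea of adapting a hyperparameter (the learning rate) while the model itself trains. It is a stand-in for the general idea only, not the RNN-based procedure from that paper; the toy regression problem and the `meta_lr` step size are assumptions.

```python
# Simplified online hypergradient sketch (not the cited RNN-based method):
# the learning rate is nudged by the alignment of consecutive gradients
# while the model parameters keep training.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(256, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=256)

w = np.zeros(5)
lr = 1e-3        # hyperparameter tuned online
meta_lr = 1e-4   # step size for the hyperparameter update
prev_grad = np.zeros(5)

for _ in range(500):
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    # dLoss/d(lr) for the previous SGD step is -grad . prev_grad, so gradient
    # descent on the learning rate itself increases lr while updates agree.
    lr = max(1e-6, lr + meta_lr * float(grad @ prev_grad))
    w -= lr * grad
    prev_grad = grad

print(f"adapted lr: {lr:.4f}, final loss: {float(np.mean((X @ w - y) ** 2)):.4f}")
```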