Using Sequential Statistical Tests to Improve the Performance of Random
Search in Hyperparameter Tuning
- URL: http://arxiv.org/abs/2112.12438v1
- Date: Thu, 23 Dec 2021 10:02:04 GMT
- Title: Using Sequential Statistical Tests to Improve the Performance of Random
Search in Hyperparameter Tuning
- Authors: Philip Buczak and Daniel Horn
- Abstract summary: Hyperparameter tuning is one of the most time-consuming parts of machine learning.
We propose a sequential testing procedure to minimize the number of resampling iterations needed to detect inferior parameter settings.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperparameter tuning is one of the most time-consuming parts of machine
learning: the performance of a large number of different hyperparameter
settings has to be evaluated to find the best one. Although modern optimization
algorithms exist that minimize the number of evaluations needed, the evaluation
of a single setting is still expensive: using a resampling technique, the
machine learning method has to be fitted a fixed number of $K$ times on
different training data sets. The mean value over the $K$ fits is then used as
an estimator for the performance of the setting. Many hyperparameter settings
could be discarded after fewer than $K$ resampling iterations because they are
already clearly inferior to high-performing settings. In practice, however, the
resampling is often performed until the very end, wasting a lot of
computational effort.
We propose a sequential testing procedure to minimize the number of
resampling iterations needed to detect inferior parameter settings. To do so,
we first analyze the distribution of resampling errors and find that a
log-normal distribution is a promising choice. We then build a sequential
testing procedure assuming this distribution and use it within a random
search algorithm.
We compare a standard random search with our enhanced sequential random
search in several realistic data situations. The sequential random search
finds comparably good hyperparameter settings, while the computational time
needed to find them is roughly halved.
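To illustrate the idea, the following is a minimal Python sketch of a random search in which each configuration's resampling loop can be stopped early by a sequential test against the incumbent. The concrete test (a one-sided Welch t-test on log-transformed errors, motivated by the log-normal assumption), the significance level, and the minimum number of folds before testing are illustrative assumptions, not the exact procedure from the paper; `sample_config` and `eval_fold` are hypothetical user-supplied functions.

```python
# Minimal sketch: random search where each configuration's resampling loop can
# be stopped early by a sequential test against the incumbent (best) setting.
# The concrete test, alpha and min_folds below are illustrative assumptions.
import math
from scipy import stats

def sequential_random_search(sample_config, eval_fold, K=10, n_configs=50,
                             alpha=0.01, min_folds=3):
    best_config, best_errors = None, None
    for _ in range(n_configs):
        config = sample_config()                  # draw a random hyperparameter setting
        errors = []                               # resampling errors, assumed positive
        for k in range(K):
            errors.append(eval_fold(config, k))   # one resampling iteration (e.g., CV fold)
            if best_errors is not None and len(errors) >= min_folds:
                # One-sided Welch t-test on log errors (log-normal motivation):
                # reject and stop resampling early if clearly inferior.
                _, p = stats.ttest_ind(
                    [math.log(e) for e in errors],
                    [math.log(e) for e in best_errors],
                    equal_var=False, alternative="greater")
                if p < alpha:
                    break                         # clearly worse: discard after < K folds
        else:
            # All K folds were evaluated: candidate for the new incumbent.
            if best_errors is None or sum(errors) / K < sum(best_errors) / K:
                best_config, best_errors = config, errors
    return best_config
```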
Related papers
- Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits [55.03293214439741]
In contextual bandits, an agent sequentially takes actions from a time-dependent action set based on past experience.
We propose the first online continuous hyperparameter tuning framework for contextual bandits.
We show that it achieves sublinear regret in theory and performs consistently better than all existing methods on both synthetic and real datasets.
arXiv Detail & Related papers (2023-02-18T23:31:20Z) - Practical Differentially Private Hyperparameter Tuning with Subsampling [8.022555128083026]
We propose a new class of differentially private (DP) machine learning (ML) algorithms, where the number of random search samples is randomized itself.
We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data.
We provide a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to a better privacy-utility trade-off.
arXiv Detail & Related papers (2023-01-27T21:01:58Z) - Random Search Hyper-Parameter Tuning: Expected Improvement Estimation
and the Corresponding Lower Bound [8.486025595883117]
We establish an empirical estimate for the expected accuracy improvement from an additional iteration of hyperparameter search.
We demonstrate that the optimal estimate for the expected accuracy will still have an error of $\frac{1}{k}$.
arXiv Detail & Related papers (2022-08-17T09:18:43Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization
with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - How much progress have we made in neural network training? A New
Evaluation Protocol for Benchmarking Optimizers [86.36020260204302]
We propose a new benchmarking protocol to evaluate both end-to-end efficiency and data-addition training efficiency.
A human study is conducted to show that our evaluation protocol matches human tuning behavior better than the random search.
We then apply the proposed benchmarking framework to 7 optimizers and various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining.
arXiv Detail & Related papers (2020-10-19T21:46:39Z) - How to tune the RBF SVM hyperparameters?: An empirical evaluation of 18
search algorithms [4.394728504061753]
We evaluate 18 search algorithms on 115 real-life binary data sets.
We find that trees of Parzen estimators search better, with only a slight increase in time with respect to grid search.
We also find no significant differences among the best hyperparameter sets found by the different procedures when the search algorithms find more than one.
arXiv Detail & Related papers (2020-08-26T16:28:48Z) - Best-First Beam Search [78.71330480725668]
We show that the standard implementation of beam search can be made up to 10x faster in practice.
We propose a memory-reduced variant of Best-First Beam Search, which has a similar beneficial search bias in terms of downstream performance.
arXiv Detail & Related papers (2020-07-08T05:56:01Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Weighted Random Search for Hyperparameter Optimization [0.0]
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms.
We generate new values for each hyperparameter with a probability of change, unlike the standard RS (a rough sketch of this mechanism appears after the list).
Within the same computational budget, our method yields better results than the standard RS.
arXiv Detail & Related papers (2020-04-03T15:41:22Z) - Weighted Random Search for CNN Hyperparameter Optimization [0.0]
We introduce the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic.
The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values.
According to our experiments, the WRS algorithm outperforms the other methods.
arXiv Detail & Related papers (2020-03-30T09:40:14Z)
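As a side note on the Weighted Random Search entries above: a rough Python sketch of the described mechanism, re-sampling each hyperparameter only with some probability of change instead of drawing an entirely new setting, might look as follows. The interfaces (`param_space`, `change_prob`, `objective`) and the per-parameter probabilities are illustrative assumptions, not the exact WRS algorithm.

```python
# Rough sketch of a weighted-random-search step: starting from the best
# configuration found so far, each hyperparameter is re-sampled only with a
# per-parameter probability of change; otherwise its current value is kept.
# Interfaces and probabilities are illustrative assumptions, not the exact
# scheme from the WRS papers.
import random

def wrs_step(best_config, param_space, change_prob):
    """param_space maps names to sampling functions; change_prob maps names to probabilities."""
    candidate = dict(best_config)
    for name, sample in param_space.items():
        if random.random() < change_prob[name]:
            candidate[name] = sample()            # re-sample this hyperparameter only
    return candidate

def weighted_random_search(objective, param_space, change_prob, n_iter=100):
    best = {name: sample() for name, sample in param_space.items()}
    best_score = objective(best)
    for _ in range(n_iter):
        candidate = wrs_step(best, param_space, change_prob)
        score = objective(candidate)
        if score > best_score:                    # assume higher is better (e.g., accuracy)
            best, best_score = candidate, score
    return best, best_score
```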