Random Search Hyper-Parameter Tuning: Expected Improvement Estimation and the Corresponding Lower Bound
- URL: http://arxiv.org/abs/2208.08170v1
- Date: Wed, 17 Aug 2022 09:18:43 GMT
- Title: Random Search Hyper-Parameter Tuning: Expected Improvement Estimation and the Corresponding Lower Bound
- Authors: Dan Navon, Alex M. Bronstein
- Abstract summary: We establish an empirical estimate for the expected accuracy improvement from an additional iteration of hyperparameter search.
We demonstrate that the optimal estimate for the expected accuracy will still have an error of $\frac{1}{k}$.
- Score: 8.486025595883117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperparameter tuning is a common technique for improving the performance of
neural networks. Most techniques for hyperparameter search involve an iterated
process where the model is retrained at every iteration. However, the expected
accuracy improvement from every additional search iteration is still unknown.
Calculating the expected improvement can help create stopping rules for
hyperparameter tuning and allow for a wiser allocation of a project's
computational budget. In this paper, we establish an empirical estimate for the
expected accuracy improvement from an additional iteration of hyperparameter
search. Our results hold for any hyperparameter tuning method which is based on
random search \cite{bergstra2012random} and samples hyperparameters from a
fixed distribution. We bound our estimate with an error of
$O\left(\sqrt{\frac{\log k}{k}}\right)$ w.h.p. where $k$ is the current number
of iterations. To the best of our knowledge this is the first bound on the
expected gain from an additional iteration of hyperparameter search. Finally,
we demonstrate that the optimal estimate for the expected accuracy will still
have an error of $\frac{1}{k}$.
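To make the estimated quantity concrete, the following is a minimal Python sketch, not the authors' code: a plug-in estimate of the expected accuracy gain from one additional random-search trial, built from the empirical CDF of the $k$ accuracies observed so far. The function name, the order-statistics formula as the chosen plug-in, and the Beta-distributed toy data are all assumptions made for illustration.

```python
import numpy as np

def expected_improvement_estimate(accuracies):
    """Plug-in estimate of the expected gain of one more random-search trial.

    For i.i.d. accuracy draws on [0, 1] with CDF F, the expected maximum of
    n draws is E[M_n] = integral_0^1 (1 - F(x)^n) dx, so the expected gain of
    a (k+1)-th draw over k draws is
        G_k = E[M_{k+1}] - E[M_k] = integral_0^1 F(x)^k (1 - F(x)) dx.
    Substituting the empirical CDF of the k observed accuracies turns the
    integral into a sum over the gaps between consecutive order statistics.
    """
    a = np.sort(np.asarray(accuracies, dtype=float))
    k = len(a)
    f = np.arange(1, k) / k   # empirical CDF value on [a_(i), a_(i+1))
    gaps = np.diff(a)         # interval widths a_(i+1) - a_(i)
    return float(np.sum(gaps * f**k * (1.0 - f)))

# Toy usage: 50 simulated trial accuracies from a fixed sampling distribution.
rng = np.random.default_rng(0)
observed = rng.beta(8, 2, size=50)
print(f"estimated gain of trial 51: {expected_improvement_estimate(observed):.5f}")
```

A stopping rule built on such an estimate would halt the search once the estimated gain falls below the cost of one more retraining run; the paper's $O\left(\sqrt{\frac{\log k}{k}}\right)$ guarantee and $\frac{1}{k}$ lower bound apply to its own estimator, of which the sketch above is only an illustrative analogue.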
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z)
- A Multi-objective Newton Optimization Algorithm for Hyper-Parameter Search [0.0]
The algorithm is applied to search the optimal probability threshold (a vector of eight parameters) for a multiclass object detection problem of a convolutional neural network.
The algorithm produces an overall higher true positive (TP) and lower false positive (FP) rates, as compared to using the default value of 0.5.
arXiv Detail & Related papers (2024-01-07T21:12:34Z)
- Using Sequential Statistical Tests to Improve the Performance of Random Search in Hyperparameter Tuning [0.0]
Hyperparameter tuning is one of the most time-consuming parts of machine learning.
We propose a sequential testing procedure to minimize the number of resampling iterations needed to detect inferior parameter settings.
arXiv Detail & Related papers (2021-12-23T10:02:04Z)
- Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications [71.23286211775084]
We introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters.
Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error.
Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.
arXiv Detail & Related papers (2021-09-06T17:10:01Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional, previously unexplored benefit: it can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}\left(\frac{\log N}{\sqrt{N}}\right)$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterations caused by inaccurate solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z)
- How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers [86.36020260204302]
We propose a new benchmarking protocol to evaluate both end-to-end efficiency and data-addition training efficiency.
A human study is conducted to show that our evaluation protocol matches human tuning behavior better than random search.
We then apply the proposed benchmarking framework to 7 optimizers and various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining.
arXiv Detail & Related papers (2020-10-19T21:46:39Z)
- Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyperparameters and/or to reduce the hyperparameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z)
- Weighted Random Search for Hyperparameter Optimization [0.0]
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms.
Unlike standard RS, we generate new values for each hyperparameter with a probability of change.
Within the same computational budget, our method yields better results than the standard RS (see the sketch after this list).
arXiv Detail & Related papers (2020-04-03T15:41:22Z)
- Weighted Random Search for CNN Hyperparameter Optimization [0.0]
We introduce the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic.
The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values.
According to our experiments, the WRS algorithm outperforms the other methods.
arXiv Detail & Related papers (2020-03-30T09:40:14Z)
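As an illustration of the weighted-random-search idea summarized in the two WRS entries above, here is a minimal Python sketch, assuming the mechanism is as the summaries describe: each hyperparameter is re-sampled with some probability of change and otherwise keeps its value from the best configuration found so far. The names `weighted_random_search`, `sample_space`, `score`, and `p_change` are hypothetical, not the authors' API.

```python
import random

def weighted_random_search(sample_space, score, n_trials, p_change=0.5, seed=0):
    """WRS-style search: unlike plain random search, each hyperparameter is
    re-sampled only with probability p_change; otherwise it keeps its value
    from the best configuration so far. sample_space maps each hyperparameter
    name to a function that draws one value given a random.Random instance."""
    rng = random.Random(seed)
    best = {name: draw(rng) for name, draw in sample_space.items()}
    best_score = score(best)
    for _ in range(n_trials - 1):
        trial = {name: draw(rng) if rng.random() < p_change else best[name]
                 for name, draw in sample_space.items()}
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best, best_score

# Toy usage: two "hyperparameters" and a cheap stand-in for validation accuracy.
space = {
    "lr": lambda r: 10 ** r.uniform(-4, -1),
    "width": lambda r: r.choice([64, 128, 256, 512]),
}
objective = lambda h: h["width"] / 1024 - abs(h["lr"] - 0.01)
print(weighted_random_search(space, objective, n_trials=100))
```

Setting `p_change=1.0` recovers standard random search, which also makes plain why WRS stays within the same computational budget: it changes only how each new configuration is drawn, not how many configurations are trained.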