Random Search Hyper-Parameter Tuning: Expected Improvement Estimation and the Corresponding Lower Bound
- URL: http://arxiv.org/abs/2208.08170v1
- Date: Wed, 17 Aug 2022 09:18:43 GMT
- Title: Random Search Hyper-Parameter Tuning: Expected Improvement Estimation and the Corresponding Lower Bound
- Authors: Dan Navon, Alex M. Bronstein
- Abstract summary: We establish an empirical estimate for the expected accuracy improvement from an additional iteration of hyperparameter search.
We demonstrate that the optimal estimate for the expected accuracy will still have an error of $\frac{1}{k}$.
- Score: 8.486025595883117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperparameter tuning is a common technique for improving the performance of
neural networks. Most techniques for hyperparameter search involve an iterated
process where the model is retrained at every iteration. However, the expected
accuracy improvement from every additional search iteration is still unknown.
Calculating the expected improvement can help create stopping rules for
hyperparameter tuning and allow for a wiser allocation of a project's
computational budget. In this paper, we establish an empirical estimate for the
expected accuracy improvement from an additional iteration of hyperparameter
search. Our results hold for any hyperparameter tuning method which is based on
random search \cite{bergstra2012random} and samples hyperparameters from a
fixed distribution. We bound our estimate with an error of
$O\left(\sqrt{\frac{\log k}{k}}\right)$ w.h.p. where $k$ is the current number
of iterations. To the best of our knowledge this is the first bound on the
expected gain from an additional iteration of hyperparameter search. Finally,
we demonstrate that the optimal estimate for the expected accuracy will still
have an error of $\frac{1}{k}$.
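To make the estimated quantity concrete, the following is a minimal Python sketch, not the authors' code: a plug-in estimate of the expected accuracy gain from one additional random-search trial, built from the empirical CDF of the $k$ accuracies observed so far. The function name, the order-statistics formula as the chosen plug-in, and the Beta-distributed toy data are all assumptions made for illustration.

```python
import numpy as np

def expected_improvement_estimate(accuracies):
    """Plug-in estimate of the expected gain of one more random-search trial.

    For i.i.d. accuracy draws on [0, 1] with CDF F, the expected maximum of
    n draws is E[M_n] = integral_0^1 (1 - F(x)^n) dx, so the expected gain of
    a (k+1)-th draw over k draws is
        G_k = E[M_{k+1}] - E[M_k] = integral_0^1 F(x)^k (1 - F(x)) dx.
    Substituting the empirical CDF of the k observed accuracies turns the
    integral into a sum over the gaps between consecutive order statistics.
    """
    a = np.sort(np.asarray(accuracies, dtype=float))
    k = len(a)
    f = np.arange(1, k) / k   # empirical CDF value on [a_(i), a_(i+1))
    gaps = np.diff(a)         # interval widths a_(i+1) - a_(i)
    return float(np.sum(gaps * f**k * (1.0 - f)))

# Toy usage: 50 simulated trial accuracies from a fixed sampling distribution.
rng = np.random.default_rng(0)
observed = rng.beta(8, 2, size=50)
print(f"estimated gain of trial 51: {expected_improvement_estimate(observed):.5f}")
```

A stopping rule built on such an estimate would halt the search once the estimated gain falls below the cost of one more retraining run; the paper's $O\left(\sqrt{\frac{\log k}{k}}\right)$ guarantee and $\frac{1}{k}$ lower bound apply to its own estimator, of which the sketch above is only an illustrative analogue.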
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z)
- A Multi-objective Newton Optimization Algorithm for Hyper-Parameter Search [0.0]
The algorithm is applied to search the optimal probability threshold (a vector of eight parameters) for a multiclass object detection problem of a convolutional neural network.
The algorithm produces an overall higher true positive (TP) and lower false positive (FP) rates, as compared to using the default value of 0.5.
arXiv Detail & Related papers (2024-01-07T21:12:34Z)
- Using Sequential Statistical Tests to Improve the Performance of Random Search in Hyperparameter Tuning [0.0]
Hyperparameter tuning is one of the most time-consuming parts of machine learning.
We propose a sequential testing procedure to minimize the number of resampling iterations needed to detect inferior parameter settings.
arXiv Detail & Related papers (2021-12-23T10:02:04Z)
- Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications [71.23286211775084]
We introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters.
Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error.
Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.
arXiv Detail & Related papers (2021-09-06T17:10:01Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional, previously unexplored benefit: it can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}\left(\frac{\log N}{\sqrt{N}}\right)$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterations caused by inaccurate solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z)
- How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers [86.36020260204302]
We propose a new benchmarking protocol to evaluate both end-to-end efficiency and data-addition training efficiency.
A human study is conducted to show that our evaluation protocol matches human tuning behavior better than random search.
We then apply the proposed benchmarking framework to 7 optimizers and various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining.
arXiv Detail & Related papers (2020-10-19T21:46:39Z)
- Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyperparameters and/or to reduce the hyperparameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z)
- Weighted Random Search for Hyperparameter Optimization [0.0]
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms.
Unlike standard RS, we generate new values for each hyperparameter with a probability of change.
Within the same computational budget, our method yields better results than the standard RS (see the sketch after this list).
arXiv Detail & Related papers (2020-04-03T15:41:22Z)
- Weighted Random Search for CNN Hyperparameter Optimization [0.0]
We introduce the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic.
The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values.
According to our experiments, the WRS algorithm outperforms the other methods.
arXiv Detail & Related papers (2020-03-30T09:40:14Z)
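As an illustration of the weighted-random-search idea summarized in the two WRS entries above, here is a minimal Python sketch, assuming the mechanism is as the summaries describe: each hyperparameter is re-sampled with some probability of change and otherwise keeps its value from the best configuration found so far. The names `weighted_random_search`, `sample_space`, `score`, and `p_change` are hypothetical, not the authors' API.

```python
import random

def weighted_random_search(sample_space, score, n_trials, p_change=0.5, seed=0):
    """WRS-style search: unlike plain random search, each hyperparameter is
    re-sampled only with probability p_change; otherwise it keeps its value
    from the best configuration so far. sample_space maps each hyperparameter
    name to a function that draws one value given a random.Random instance."""
    rng = random.Random(seed)
    best = {name: draw(rng) for name, draw in sample_space.items()}
    best_score = score(best)
    for _ in range(n_trials - 1):
        trial = {name: draw(rng) if rng.random() < p_change else best[name]
                 for name, draw in sample_space.items()}
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best, best_score

# Toy usage: two "hyperparameters" and a cheap stand-in for validation accuracy.
space = {
    "lr": lambda r: 10 ** r.uniform(-4, -1),
    "width": lambda r: r.choice([64, 128, 256, 512]),
}
objective = lambda h: h["width"] / 1024 - abs(h["lr"] - 0.01)
print(weighted_random_search(space, objective, n_trials=100))
```

Setting `p_change=1.0` recovers standard random search, which also makes plain why WRS stays within the same computational budget: it changes only how each new configuration is drawn, not how many configurations are trained.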