Importance of Tuning Hyperparameters of Machine Learning Algorithms
- URL: http://arxiv.org/abs/2007.07588v1
- Date: Wed, 15 Jul 2020 10:06:59 GMT
- Title: Importance of Tuning Hyperparameters of Machine Learning Algorithms
- Authors: Hilde J.P. Weerts, Andreas C. Mueller, Joaquin Vanschoren
- Abstract summary: We present a methodology to determine the importance of tuning a hyperparameter based on a non-inferiority test and tuning risk.
We apply our methods in a benchmark study using 59 datasets from OpenML.
- Score: 3.4161707164978137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of many machine learning algorithms depends on their
hyperparameter settings. The goal of this study is to determine whether it is
important to tune a hyperparameter or whether it can be safely set to a default
value. We present a methodology to determine the importance of tuning a
hyperparameter based on a non-inferiority test and tuning risk: the performance
loss that is incurred when a hyperparameter is not tuned, but set to a default
value. Because our methods require the notion of a default parameter, we
present a simple procedure that can be used to determine reasonable default
parameters. We apply our methods in a benchmark study using 59 datasets from
OpenML. Our results show that leaving particular hyperparameters at their
default value is non-inferior to tuning these hyperparameters. In some cases,
leaving the hyperparameter at its default value even outperforms tuning it
using a search procedure with a limited number of iterations.
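As a rough illustration of the abstract's tuning-risk idea, the sketch below estimates, for each dataset, the performance lost by fixing one hyperparameter at a default value instead of tuning it, and then runs a one-sided non-inferiority test against a margin. This is not the authors' code: the model (RandomForestClassifier), the hyperparameter (min_samples_leaf), the three toy scikit-learn datasets standing in for the 59 OpenML datasets, and the margin delta are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of tuning risk plus a non-inferiority test.
import numpy as np
from scipy import stats
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, cross_val_score

def tuning_risk(X, y, default_value, search_space, n_iter=10, seed=0):
    """Accuracy lost by fixing min_samples_leaf at `default_value` instead of tuning it."""
    default_score = cross_val_score(
        RandomForestClassifier(min_samples_leaf=default_value, random_state=seed),
        X, y, cv=5).mean()
    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=seed),
        {"min_samples_leaf": list(search_space)},
        n_iter=n_iter, cv=5, random_state=seed)
    tuned_score = search.fit(X, y).best_score_
    return tuned_score - default_score  # > 0 means tuning helped on this dataset

# The benchmark study uses 59 OpenML datasets; three small datasets stand in here.
risks = np.array([
    tuning_risk(*load(return_X_y=True), default_value=1, search_space=range(1, 51))
    for load in (load_iris, load_wine, load_breast_cancer)
])

# One-sided non-inferiority test: H0 says the default loses at least `delta`
# on average; a small p-value means the default is non-inferior to tuning.
delta = 0.01  # illustrative non-inferiority margin
t_stat, p_value = stats.ttest_1samp(risks, popmean=delta, alternative="less")
print(f"mean tuning risk = {risks.mean():.4f}, one-sided p = {p_value:.3f}")
```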
Related papers
- Training neural networks faster with minimal tuning using pre-computed lists of hyperparameters for NAdamW [11.681640186200951]
We present a set of practical and performant hyperparameter lists for NAdamW.
Our best NAdamW hyperparameter list performs well on AlgoPerf held-out workloads not used to construct it.
It also outperforms basic learning rate/weight decay sweeps and an off-the-shelf Bayesian optimization tool when restricted to the same budget.
arXiv Detail & Related papers (2025-03-06T00:14:50Z) - Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of three optimizers, four parameterizations, and a range of learning rates and model sizes.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections [59.839926875976225]
We propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections.
In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters.
arXiv Detail & Related papers (2024-05-30T17:26:02Z) - Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z) - No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL [28.31529154045046]
We propose a new approach to tune hyperparameters from offline logs of data.
We first learn a model of the environment from the offline data, which we call a calibration model, and then simulate learning in the calibration model (a minimal sketch of this idea appears after this list).
We empirically investigate the method in a variety of settings to identify when it is effective and when it fails.
arXiv Detail & Related papers (2022-05-18T04:26:23Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for
Safety-Critical Applications [71.23286211775084]
We introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters.
Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error.
Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.
arXiv Detail & Related papers (2021-09-06T17:10:01Z) - Self-supervised learning for fast and scalable time series
hyper-parameter tuning [14.9124328578934]
Hyper-parameters of time series models play an important role in time series analysis.
We propose a self-supervised learning framework for hyper-parameter tuning (SSL-HPT).
arXiv Detail & Related papers (2021-02-10T21:16:13Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z) - Weighted Random Search for Hyperparameter Optimization [0.0]
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms.
Unlike standard RS, we generate a new value for each hyperparameter with a given probability of change.
Within the same computational budget, our method yields better results than the standard RS (a minimal sketch of this scheme appears after this list).
arXiv Detail & Related papers (2020-04-03T15:41:22Z) - Weighted Random Search for CNN Hyperparameter Optimization [0.0]
We introduce the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic.
The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values.
According to our experiments, the WRS algorithm outperforms the other methods.
arXiv Detail & Related papers (2020-03-30T09:40:14Z) - Rethinking the Hyperparameters for Fine-tuning [78.15505286781293]
Fine-tuning from pre-trained ImageNet models has become the de-facto standard for various computer vision tasks.
Current practices for fine-tuning typically involve selecting an ad-hoc choice of hyperparameters.
This paper re-examines several common practices of setting hyperparameters for fine-tuning.
arXiv Detail & Related papers (2020-02-19T18:59:52Z)
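The offline hyperparameter tuning entry above (No More Pesky Hyperparameters) describes fitting a calibration model of the environment from logged data and then simulating learning inside it. The sketch below is a toy, hedged illustration of that idea rather than the paper's algorithm: an empirical tabular model is built from a hypothetical offline log of a 4-state chain, and a Q-learning step size is chosen by its simulated return inside that model.

```python
# Toy illustration (not the paper's algorithm) of selecting a hyperparameter
# by simulating learning inside a calibration model fit from offline logs.
import random
from collections import defaultdict

def fit_calibration_model(logs):
    """Empirical model: (state, action) -> list of observed (next_state, reward) samples."""
    model = defaultdict(list)
    for s, a, r, s_next in logs:
        model[(s, a)].append((s_next, r))
    return model

def simulate_q_learning(model, step_size, n_steps=2000, gamma=0.9, eps=0.1, seed=0):
    """Run epsilon-greedy Q-learning inside the calibration model; return total reward."""
    rng = random.Random(seed)
    q = defaultdict(float)
    states = sorted({s for s, _ in model})
    s = rng.choice(states)
    total_reward = 0.0
    for _ in range(n_steps):
        actions = [a for (s2, a) in model if s2 == s]
        if not actions:                      # state never acted on in the logs
            s = rng.choice(states)
            continue
        if rng.random() < eps:
            a = rng.choice(actions)
        else:
            a = max(actions, key=lambda act: q[(s, act)])
        s_next, r = rng.choice(model[(s, a)])
        next_actions = [a2 for (s2, a2) in model if s2 == s_next]
        target = r + gamma * max((q[(s_next, a2)] for a2 in next_actions), default=0.0)
        q[(s, a)] += step_size * (target - q[(s, a)])
        total_reward += r
        s = s_next
    return total_reward

# Hypothetical offline log of (state, action, reward, next_state) tuples from a 4-state chain.
logs = [(s, a, float(s == 3 and a == 1), min(max(s + (1 if a == 1 else -1), 0), 3))
        for s in range(4) for a in (0, 1) for _ in range(25)]

model = fit_calibration_model(logs)
candidate_step_sizes = [0.01, 0.1, 0.5]
best_step_size = max(candidate_step_sizes, key=lambda lr: simulate_q_learning(model, lr))
print("step size selected in the calibration model:", best_step_size)
```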
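The two Weighted Random Search entries above describe resampling each hyperparameter with a probability of change, rather than drawing a completely new configuration every iteration as standard random search does. Below is a minimal, hedged sketch of that scheme; the search space, change probabilities, and toy objective are invented for illustration, and the actual WRS method derives its probabilities from the papers' own weighting criteria.

```python
# Minimal sketch of a weighted random search loop: each hyperparameter is
# resampled with its own probability of change, otherwise kept at the best
# value found so far. The space, probabilities, and objective are illustrative.
import random

def weighted_random_search(space, p_change, objective, n_iter=50, seed=0):
    rng = random.Random(seed)
    best = {name: rng.choice(values) for name, values in space.items()}
    best_score = objective(best)
    for _ in range(n_iter):
        candidate = {
            name: (rng.choice(values) if rng.random() < p_change[name] else best[name])
            for name, values in space.items()
        }
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy objective: deeper models and a learning rate near 1e-2 score higher.
space = {"lr": [1e-4, 1e-3, 1e-2, 1e-1], "depth": [2, 4, 8, 16]}
p_change = {"lr": 0.9, "depth": 0.5}  # more "important" parameters change more often
best, score = weighted_random_search(
    space, p_change, lambda c: c["depth"] - abs(c["lr"] - 1e-2) * 100)
print(best, score)
```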
This list is automatically generated from the titles and abstracts of the papers on this site.