Efficient Hyperparameter Importance Assessment for CNNs
- URL: http://arxiv.org/abs/2410.08920v1
- Date: Fri, 11 Oct 2024 15:47:46 GMT
- Title: Efficient Hyperparameter Importance Assessment for CNNs
- Authors: Ruinan Wang, Ian Nabney, Mohammad Golbabaee
- Abstract summary: This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF.
We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets.
- Score: 1.7778609937758323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperparameter selection is an essential aspect of the machine learning pipeline, profoundly impacting models' robustness, stability, and generalization capabilities. Given the complex hyperparameter spaces associated with Neural Networks and the constraints of computational resources and time, optimizing all hyperparameters becomes impractical. In this context, leveraging hyperparameter importance assessment (HIA) can provide valuable guidance by narrowing down the search space. This enables machine learning practitioners to focus their optimization efforts on the hyperparameters with the most significant impact on model performance while conserving time and resources. This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF, laying the groundwork for applying HIA methodologies in the Deep Learning field. We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets, thereby acquiring a comprehensive dataset containing hyperparameter configuration instances and their corresponding performance metrics. It is demonstrated that among the investigated hyperparameters, the top five important hyperparameters of the CNN model are the number of convolutional layers, learning rate, dropout rate, optimizer and epoch.
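The abstract describes scoring hyperparameters from a table of configurations and their measured performance. As a rough illustration of that idea, the sketch below implements a simplified form of plain RReliefF (the regression variant of Relief), not the paper's N-RReliefF, whose details are not given on this page. The function name `rrelieff`, its parameters, the uniform neighbour weighting, the L1 distance, and the restriction to numeric features are all assumptions made for this example; categorical hyperparameters such as the optimizer would need an encoding step first.

```python
import numpy as np


def rrelieff(X, y, k=10, n_samples=None, seed=0):
    """Simplified RReliefF weights (hypothetical helper, numeric features only).

    X: (n, p) array of hyperparameter configurations.
    y: (n,) array of performance scores, e.g. validation accuracy.
    Returns a length-p array; larger values suggest more influential features.
    Assumes y is not constant (otherwise the first ratio divides by zero).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    k = min(k, n - 1)

    # Scale features and target to [0, 1] so all diff() values are comparable.
    x_range = X.max(axis=0) - X.min(axis=0)
    x_range[x_range == 0] = 1.0
    Xs = (X - X.min(axis=0)) / x_range
    y_range = y.max() - y.min()
    ys = (y - y.min()) / (y_range if y_range > 0 else 1.0)

    rng = np.random.default_rng(seed)
    m = n if n_samples is None else min(n_samples, n)
    picks = rng.choice(n, size=m, replace=False)

    n_dc = 0.0             # accumulated "different prediction" mass
    n_da = np.zeros(p)     # accumulated "different attribute" mass
    n_dcda = np.zeros(p)   # accumulated mass where both differ

    for i in picks:
        dist = np.abs(Xs - Xs[i]).sum(axis=1)   # L1 distance to every instance
        dist[i] = np.inf                        # exclude the instance itself
        nbrs = np.argsort(dist)[:k]             # k nearest neighbours
        d_a = np.abs(Xs[nbrs] - Xs[i])          # per-feature differences, (k, p)
        d_c = np.abs(ys[nbrs] - ys[i])          # target differences, (k,)
        w = 1.0 / k                             # uniform weights (simplification)
        n_dc += w * d_c.sum()
        n_da += w * d_a.sum(axis=0)
        n_dcda += w * (d_c[:, None] * d_a).sum(axis=0)

    # W[A] = P(diff A | diff target) - P(diff A | same target), estimated.
    return n_dcda / n_dc - (n_da - n_dcda) / (m - n_dc)
```

Calling `rrelieff(X, y)` on a matrix of sampled configurations and their scores returns one weight per column; sorting the columns by weight gives a tentative importance ranking. The original RReliefF weights neighbours by rank rather than uniformly, so the numbers from this sketch should be read as indicative only.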
Related papers
- Optimization of Actuarial Neural Networks with Response Surface Methodology [0.0]
This study utilizes a factorial design and response surface methodology (RSM) to optimize CANN performance.
By dropping statistically insignificant hyperparameters, we reduced runs from 288 to 188, with negligible loss in accuracy, achieving near-optimal out-of-sample Poisson deviance loss.
arXiv Detail & Related papers (2024-10-01T15:45:41Z)
- Optimization Hyper-parameter Laws for Large Language Models [56.322914260197734]
We present Opt-Laws, a framework that captures the relationship between hyperparameters and training outcomes.
Our validation across diverse model sizes and data scales demonstrates Opt-Laws' ability to accurately predict training loss.
This approach significantly reduces computational costs while enhancing overall model performance.
arXiv Detail & Related papers (2024-09-07T09:37:19Z)
- Goal-Oriented Sensitivity Analysis of Hyperparameters in Deep Learning [0.0]
We study the use of goal-oriented sensitivity analysis, based on the Hilbert-Schmidt Independence Criterion (HSIC), for hyperparameter analysis and optimization.
We derive an HSIC-based optimization algorithm that we apply on MNIST and Cifar, classical machine learning data sets, of interest for scientific machine learning.
arXiv Detail & Related papers (2022-07-13T14:21:12Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- To tune or not to tune? An Approach for Recommending Important Hyperparameters [2.121963121603413]
We consider building the relationship between the performance of the machine learning models and their hyperparameters to discover the trend and gain insights.
Our results enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy.
arXiv Detail & Related papers (2021-08-30T08:54:58Z)
- HyperNP: Interactive Visual Exploration of Multidimensional Projection Hyperparameters [61.354362652006834]
HyperNP is a scalable method that allows for real-time interactive exploration of projection methods by training neural network approximations.
We evaluate HyperNP across three datasets in terms of performance and speed.
arXiv Detail & Related papers (2021-06-25T17:28:14Z)
- On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning [27.36718899899319]
Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner.
MBRL typically requires significant human expertise before it can be applied to new problems and domains.
arXiv Detail & Related papers (2021-02-26T18:57:47Z)
- Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z)
- On the Sparsity of Neural Machine Translation Models [65.49762428553345]
We investigate whether redundant parameters can be reused to achieve better performance.
Experiments and analyses are systematically conducted on different datasets and NMT architectures.
arXiv Detail & Related papers (2020-10-06T11:47:20Z)
- An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) in the scenario of hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z)