Related papers: Hyperparameter Importance for Machine Learning Algorithms

Hyperparameter Importance for Machine Learning Algorithms

URL: http://arxiv.org/abs/2201.05132v1
Date: Thu, 13 Jan 2022 18:44:53 GMT
Title: Hyperparameter Importance for Machine Learning Algorithms
Authors: Honghe Jin
Abstract summary: The proposed importance on subsets of data is consistent with the one on the population data under weak conditions. Numerical experiments show that the proposed importance is consistent and can save a lot of computational resources.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hyperparameter plays an essential role in the fitting of supervised machine learning algorithms. However, it is computationally expensive to tune all the tunable hyperparameters simultaneously especially for large data sets. In this paper, we give a definition of hyperparameter importance that can be estimated by subsampling procedures. According to the importance, hyperparameters can then be tuned on the entire data set more efficiently. We show theoretically that the proposed importance on subsets of data is consistent with the one on the population data under weak conditions. Numerical experiments show that the proposed importance is consistent and can save a lot of computational resources.

Related papers

Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining [56.58170370127227]
We show that optimal learning rate follows a power-law relationship with both model parameters and data sizes, while optimal batch size scales primarily with data sizes. This work is the first work that unifies different model shapes and structures, such as Mixture-of-Experts models and dense transformers.
arXiv Detail & Related papers (2025-03-06T18:58:29Z)
Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function [24.457000214575245]
We introduce a new technique to characterize the discontinuities and oscillations of the utility function on any fixed problem instance. This can be used to show that the learning theoretic complexity of the corresponding family of utility functions is bounded.
arXiv Detail & Related papers (2025-01-23T15:10:51Z)
Efficient Hyperparameter Importance Assessment for CNNs [1.7778609937758323]
This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF. We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets.
arXiv Detail & Related papers (2024-10-11T15:47:46Z)
Hyperparameter Importance Analysis for Multi-Objective AutoML [14.336028105614824]
In this paper, we propose the first method for assessing the importance of hyperparameters in the context of multi-objective hyperparameter optimization. Specifically, we compute the a-priori scalarization of the objectives and determine the importance of the hyperparameters for different objective tradeoffs. Our findings offer valuable guidance for hyperparameter tuning in MOO tasks but also contribute to advancing the understanding of HPI in complex optimization scenarios.
arXiv Detail & Related papers (2024-05-13T11:00:25Z)
Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors. In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori. Our results show that our method is able to locate good hyper parameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
Parameter-Efficient Sparsity for Large Language Models Fine-Tuning [63.321205487234074]
We propose a. sparse-efficient Sparse Training (PST) method to reduce the number of trainable parameters during sparse-aware training. Experiments with diverse networks (i.e., BERT, RoBERTa and GPT-2) demonstrate PST performs on par or better than previous sparsity methods.
arXiv Detail & Related papers (2022-05-23T02:43:45Z)
An Extension to Basis-Hypervectors for Learning from Circular Data in Hyperdimensional Computing [62.997667081978825]
Hyperdimensional Computing (HDC) is a computation framework based on properties of high-dimensional random spaces. We present a study on basis-hypervector sets, which leads to practical contributions to HDC in general. We introduce a method to learn from circular data, an important type of information never before addressed in machine learning with HDC.
arXiv Detail & Related papers (2022-05-16T18:04:55Z)
AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper- parameter tuning. We show that using gradient-based data subsets for hyper- parameter tuning achieves significantly faster turnaround times and speedups of 3$times$-30$times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization [0.3058685580689604]
This article presents an overview of theoretical and practical results for popular machine learning algorithms. The R package mlr is used as a uniform interface to the machine learning models.
arXiv Detail & Related papers (2021-07-19T11:37:37Z)
HyperNP: Interactive Visual Exploration of Multidimensional Projection Hyperparameters [61.354362652006834]
HyperNP is a scalable method that allows for real-time interactive exploration of projection methods by training neural network approximations. We evaluate the performance of the HyperNP across three datasets in terms of performance and speed.
arXiv Detail & Related papers (2021-06-25T17:28:14Z)
Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for Hyperparameter Recommendation [83.85021205445662]
We propose an instantiation--amortized auto-tuning (AT2) to speed up tuning of machine learning models. We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to the best instantiation--amortized auto-tuning (AT2)
arXiv Detail & Related papers (2021-06-17T00:01:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.