Behavior of Hyper-Parameters for Selected Machine Learning Algorithms:
An Empirical Investigation
- URL: http://arxiv.org/abs/2211.08536v1
- Date: Tue, 15 Nov 2022 22:14:52 GMT
- Title: Behavior of Hyper-Parameters for Selected Machine Learning Algorithms: An Empirical Investigation
- Authors: Anwesha Bhattacharyya, Joel Vaughan, and Vijayan N. Nair
- Abstract summary: Hyper-parameters (HPs) are an important part of machine learning (ML) model development and can greatly influence performance.
This paper studies their behavior for three algorithms: Extreme Gradient Boosting (XGB), Random Forest (RF), and Feedforward Neural Network (FFNN) with structured data.
Our empirical investigation examines the qualitative behavior of model performance as the HPs vary, quantifies the importance of each HP for different ML algorithms, and assesses the stability of performance near the optimal region.
- Score: 3.441021278275805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyper-parameters (HPs) are an important part of machine learning (ML) model
development and can greatly influence performance. This paper studies their
behavior for three algorithms: Extreme Gradient Boosting (XGB), Random Forest
(RF), and Feedforward Neural Network (FFNN) with structured data. Our empirical
investigation examines the qualitative behavior of model performance as the HPs
vary, quantifies the importance of each HP for different ML algorithms, and
assesses the stability of performance near the optimal region. Based on the findings, we
propose a set of guidelines for efficient HP tuning by reducing the search
space.
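A minimal sketch of the search-space-reduction idea behind such guidelines, in pure Python. The surrogate objective and the HP ranges below are illustrative assumptions, not the ranges recommended by the paper:

```python
import random

# Hypothetical stand-in objective: pretend this is cross-validated error
# of an XGB-like model as a function of two hyper-parameters.
def surrogate_cv_error(learning_rate, max_depth):
    # Smooth bowl with its optimum at learning_rate=0.1, max_depth=6.
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6) ** 2

FULL_SPACE = {"learning_rate": (0.001, 1.0), "max_depth": (1, 15)}
# A reduced space, mimicking the idea of narrowing the search to a
# region where performance is known to be good and stable.
REDUCED_SPACE = {"learning_rate": (0.01, 0.3), "max_depth": (3, 10)}

def random_search(space, n_trials, seed=0):
    """Plain random search; returns (best_error, best_lr, best_depth)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = rng.uniform(*space["learning_rate"])
        depth = rng.randint(*space["max_depth"])
        err = surrogate_cv_error(lr, depth)
        if best is None or err < best[0]:
            best = (err, lr, depth)
    return best

if __name__ == "__main__":
    for name, space in [("full", FULL_SPACE), ("reduced", REDUCED_SPACE)]:
        err, lr, depth = random_search(space, n_trials=50)
        print(f"{name:8s} best error={err:.4f} lr={lr:.3f} depth={depth}")
```

With the same trial budget, the reduced space concentrates samples near the well-behaved region, which is the practical payoff a smaller search space aims for.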
Related papers
- A Comparative Study of Hyperparameter Tuning Methods [0.0]
Tree-structured Parzen Estimator (TPE), Genetic Search, and Random Search are evaluated across regression and classification tasks.
Random Search excelled in regression tasks, while TPE was more effective for classification tasks.
arXiv Detail & Related papers (2024-08-29T10:35:07Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method reduces inference cost while maintaining the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- Robustness of Algorithms for Causal Structure Learning to Hyperparameter Choice [2.3020018305241337]
Hyperparameter tuning can make the difference between state-of-the-art and poor prediction performance for any algorithm.
We investigate the influence of hyperparameter selection on causal structure learning tasks.
arXiv Detail & Related papers (2023-10-27T15:34:08Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms in the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization with uncertainty on the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show an improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z)
- Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data [66.11139091362078]
We provide the first model selection results on large pretrained Transformers from Huggingface using generalization metrics.
Despite their niche status, we find that metrics derived from the heavy-tail (HT) perspective are particularly useful in NLP tasks.
arXiv Detail & Related papers (2022-02-06T20:07:35Z)
- A survey on multi-objective hyperparameter optimization algorithms for Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z)
- Genealogical Population-Based Training for Hyperparameter Optimization [1.0514231683620516]
We experimentally demonstrate that our method reduces the required computational cost by a factor of 2 to 3.
Our method is search-algorithm agnostic, so the inner search routine can be any search algorithm such as TPE, GP, CMA, or random search.
arXiv Detail & Related papers (2021-09-30T08:49:41Z)
- Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization [0.3058685580689604]
This article presents an overview of theoretical and practical results for popular machine learning algorithms.
The R package mlr is used as a uniform interface to the machine learning models.
arXiv Detail & Related papers (2021-07-19T11:37:37Z)
- Better Trees: An empirical study on hyperparameter tuning of classification decision tree induction algorithms [5.4611430411491115]
Decision Tree (DT) induction algorithms present high predictive performance and interpretable classification models.
This paper investigates the effects of hyperparameter tuning for the two DT induction algorithms most often used, CART and C4.5.
Experiments were carried out with different tuning strategies to induce models and to evaluate HPs' relevance using 94 classification datasets from OpenML.
arXiv Detail & Related papers (2018-12-05T19:59:20Z)
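One quality metric commonly used to compare multi-objective HPO procedures, as discussed in the survey above, is the hypervolume indicator. The sketch below is a minimal 2-D implementation for minimization problems; the point sets are made up for illustration, not taken from any of the papers:

```python
def hypervolume_2d(points, ref):
    """Hypervolume (area) dominated by a 2-D point set, assuming
    minimization in both objectives, relative to reference point `ref`.
    """
    # Keep only points strictly inside the reference box.
    pts = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    # Keep only non-dominated (Pareto-optimal) points.
    front = [p for p in pts
             if not any(q[0] <= p[0] and q[1] <= p[1] and q != p
                        for q in pts)]
    # Sweep in increasing first objective; each point adds a rectangle.
    front.sort()
    area, prev_y = 0.0, ref[1]
    for x, y in front:
        area += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return area

if __name__ == "__main__":
    # Two hypothetical Pareto fronts (e.g. error vs. inference time).
    front_a = [(0.10, 0.8), (0.20, 0.5), (0.40, 0.3)]
    front_b = [(0.15, 0.9), (0.30, 0.6)]
    ref = (1.0, 1.0)
    print(hypervolume_2d(front_a, ref), hypervolume_2d(front_b, ref))
```

A larger hypervolume means the front dominates more of the objective space, so two HPO runs can be compared by a single scalar once a common reference point is fixed.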
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.