Related papers: Beyond algorithm hyperparameters: on preprocessing hyperparameters and associated pitfalls in machine learning applications

URL: http://arxiv.org/abs/2412.03491v1
Date: Wed, 04 Dec 2024 17:29:10 GMT
Title: Beyond algorithm hyperparameters: on preprocessing hyperparameters and associated pitfalls in machine learning applications
Authors: Christina Sauer, Anne-Laure Boulesteix, Luzia Hanßum, Farina Hodiamont, Claudia Bausewein, Theresa Ullmann
Abstract summary: This paper reviews and empirically illustrates different procedures for generating and evaluating prediction models. By highlighting potential pitfalls, especially those that may lead to exaggerated performance claims, this review aims to further improve the quality of predictive modeling in ML applications.
Score: 0.30786914102688595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Adequately generating and evaluating prediction models based on supervised machine learning (ML) is often challenging, especially for less experienced users in applied research areas. Special attention is required in settings where the model generation process involves hyperparameter tuning, i.e. data-driven optimization of different types of hyperparameters to improve the predictive performance of the resulting model. Discussions about tuning typically focus on the hyperparameters of the ML algorithm (e.g., the minimum number of observations in each terminal node for a tree-based algorithm). In this context, it is often neglected that hyperparameters also exist for the preprocessing steps that are applied to the data before it is provided to the algorithm (e.g., how to handle missing feature values in the data). As a consequence, users experimenting with different preprocessing options to improve model performance may be unaware that this constitutes a form of hyperparameter tuning - albeit informal and unsystematic - and thus may fail to report or account for this optimization. To illuminate this issue, this paper reviews and empirically illustrates different procedures for generating and evaluating prediction models, explicitly addressing the different ways algorithm and preprocessing hyperparameters are typically handled by applied ML users. By highlighting potential pitfalls, especially those that may lead to exaggerated performance claims, this review aims to further improve the quality of predictive modeling in ML applications.
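The pitfall the abstract describes, informally trying several preprocessing options and reporting the best score, can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes pure-noise labels (so the true accuracy of any model is 0.5) and treats each hypothetical preprocessing option as producing a model that guesses at random. All names (`simulate`, `n_options`, and so on) are illustrative assumptions.

```python
import random
from statistics import mean


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(n_options=10, n_val=50, n_test=50, n_repeats=500, seed=0):
    """Compare two ways of reporting performance after trying n_options
    hypothetical preprocessing options on data with pure-noise labels."""
    rng = random.Random(seed)
    naive, honest = [], []
    for _ in range(n_repeats):
        val_labels = [rng.randint(0, 1) for _ in range(n_val)]
        test_labels = [rng.randint(0, 1) for _ in range(n_test)]
        # Assumption: each option yields a model whose predictions are
        # coin flips, so no option is truly better than another.
        val_preds = [[rng.randint(0, 1) for _ in range(n_val)]
                     for _ in range(n_options)]
        test_preds = [[rng.randint(0, 1) for _ in range(n_test)]
                      for _ in range(n_options)]
        test_accs = [accuracy(p, test_labels) for p in test_preds]
        val_accs = [accuracy(p, val_labels) for p in val_preds]
        # Naive: pick the option by its test accuracy and report that number.
        naive.append(max(test_accs))
        # Honest: pick the option on validation data, then report its
        # accuracy on test data that played no role in the choice.
        best = max(range(n_options), key=val_accs.__getitem__)
        honest.append(test_accs[best])
    return mean(naive), mean(honest)


naive_acc, honest_acc = simulate()
print(f"reported accuracy, option chosen on test data:       {naive_acc:.3f}")
print(f"reported accuracy, option chosen on validation data: {honest_acc:.3f}")
```

Under these assumptions the first number should land noticeably above 0.5 even though no option carries any signal, while the second stays close to 0.5. The gap comes entirely from the unreported selection step, which is the kind of optimism the abstract warns about.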

Related papers

Taming Hyperparameter Sensitivity in Data Attribution: Practical Selection Without Costly Retraining [10.018043411223125]
Data attribution methods quantify the influence of individual training data points on a machine learning model. Despite a recent surge of new methods developed in this space, the impact of hyperparameter tuning in these methods remains under-explored.
arXiv Detail & Related papers (2025-05-30T06:33:56Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work. Our empirical investigation includes tens of thousands of models trained with all combinations of three optimizers and four parameterizations. We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z)
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality. This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives. It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
Hyperparameter Tuning for Causal Inference with Double Machine Learning: A Simulation Study [4.526082390949313]
We empirically assess the relationship between the predictive performance of machine learning methods and the resulting causal estimation. We conduct an extensive simulation study using data from the 2019 Atlantic Causal Inference Conference Data Challenge.
arXiv Detail & Related papers (2024-02-07T09:01:51Z)
Interactive Hyperparameter Optimization in Multi-Objective Problems via Preference Learning [65.51668094117802]
We propose a human-centered interactive HPO approach tailored towards multi-objective machine learning (ML). Instead of relying on the user guessing the most suitable indicator for their needs, our approach automatically learns an appropriate indicator.
arXiv Detail & Related papers (2023-09-07T09:22:05Z)
PriorCVAE: scalable MCMC parameter inference with Bayesian deep generative modelling [12.820453440015553]
Recent work has shown that GP priors can be encoded using deep generative models such as variational autoencoders (VAEs). We show how VAEs can serve as drop-in replacements for the original priors during MCMC inference. We propose PriorCVAE to encode solutions of ODEs.
arXiv Detail & Related papers (2023-04-09T20:23:26Z)
Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models. We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer sampling steps. We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
Scalable Gaussian Process Hyperparameter Optimization via Coverage Regularization [0.0]
We present a novel algorithm which estimates the smoothness and length-scale parameters in the Matern kernel in order to improve robustness of the resulting prediction uncertainties. We achieve improved UQ over leave-one-out likelihood while maintaining a high degree of scalability as demonstrated in numerical experiments.
arXiv Detail & Related papers (2022-09-22T19:23:37Z)
Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization with uncertainty on the evaluation of Machine Learning algorithms. We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise. Experimental results on three analytical test functions and three ML problems show the improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z)
Hyperboost: Hyperparameter Optimization by Gradient Boosting surrogate models [0.4079265319364249]
Current state-of-the-art methods leverage Random Forests or Gaussian processes to build a surrogate model. We propose a new surrogate model based on gradient boosting. We demonstrate empirically that the new method is able to outperform some state-of-the-art techniques across a reasonably sized set of classification problems.
arXiv Detail & Related papers (2021-01-06T22:07:19Z)
VisEvol: Visual Analytics to Support Hyperparameter Search through Evolutionary Optimization [4.237343083490243]
During the training phase of machine learning (ML) models, it is usually necessary to configure several hyperparameters. We present VisEvol, a visual analytics tool that supports interactive exploration of hyperparameters and intervention in this evolutionary procedure. The utility and applicability of VisEvol are demonstrated with two use cases and interviews with ML experts who evaluated the effectiveness of the tool.
arXiv Detail & Related papers (2020-12-02T13:43:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.