Explaining Hyperparameter Optimization via Partial Dependence Plots
- URL: http://arxiv.org/abs/2111.04820v1
- Date: Mon, 8 Nov 2021 20:51:54 GMT
- Title: Explaining Hyperparameter Optimization via Partial Dependence Plots
- Authors: Julia Moosbauer, Julia Herbinger, Giuseppe Casalicchio, Marius
Lindauer, Bernd Bischl
- Abstract summary: We suggest using interpretable machine learning (IML) to gain insights from the experimental data obtained during HPO with Bayesian optimization (BO)
By leveraging the posterior uncertainty of the BO surrogate model, we introduce a variant of the partial dependence plot (PDP) with estimated confidence bands.
In an experimental study, we provide quantitative evidence for the increased quality of the PDPs within sub-regions.
- Score: 5.25855526614851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated hyperparameter optimization (HPO) can help practitioners
obtain peak performance in machine learning models. However, there is often a
lack of valuable insights into the effects of different hyperparameters on the
final model performance. This lack of explainability makes it difficult to
trust and understand the automated HPO process and its results. We suggest
using interpretable machine learning (IML) to gain insights from the
experimental data obtained during HPO with Bayesian optimization (BO). BO tends
to focus on promising regions with potential high-performance configurations
and thus induces a sampling bias. Hence, many IML techniques, such as the
partial dependence plot (PDP), carry the risk of generating biased
interpretations. By leveraging the posterior uncertainty of the BO surrogate
model, we introduce a variant of the PDP with estimated confidence bands. We
propose to partition the hyperparameter space to obtain more confident and
reliable PDPs in relevant sub-regions. In an experimental study, we provide
quantitative evidence for the increased quality of the PDPs within sub-regions.
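The idea of attaching confidence bands to a PDP by querying the surrogate's posterior can be sketched in a few lines. The following is a minimal illustration, not the paper's method: it fits a Gaussian-process surrogate to a toy HPO archive and, for each grid value of one hyperparameter, averages the posterior mean (the PD estimate) and the posterior standard deviation (a simple uncertainty proxy). All names, the toy objective, and the grid size are hypothetical; the paper derives its bands more carefully from the surrogate posterior.

```python
# Hedged sketch: PDP with confidence bands from a BO-style GP surrogate.
# The data, objective, and band construction are illustrative assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Toy HPO archive: 40 evaluations of 2 hyperparameters in [0, 1]^2.
X = rng.uniform(size=(40, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.05, size=40)

# Surrogate model of the kind BO typically maintains.
surrogate = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
surrogate.fit(X, y)

def pdp_with_bands(model, X, feature, grid_size=20):
    """Estimate the PD function of `feature` plus an uncertainty band.

    For each grid value, fix the feature across all observed configurations,
    then average the surrogate's posterior mean and posterior std.
    """
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_size)
    mean, band = [], []
    for g in grid:
        Xg = X.copy()
        Xg[:, feature] = g                      # fix the feature of interest
        mu, sd = model.predict(Xg, return_std=True)
        mean.append(mu.mean())                  # Monte-Carlo PD estimate
        band.append(sd.mean())                  # averaged posterior uncertainty
    return grid, np.array(mean), np.array(band)

grid, mean, band = pdp_with_bands(surrogate, X, feature=0)
lower, upper = mean - 2 * band, mean + 2 * band  # plot mean with [lower, upper]
```

Because BO samples densely only in promising regions, `band` will typically be wide where the archive is sparse, which is exactly the sampling-bias effect the confidence bands are meant to expose.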
Related papers
- Scalable and Effective Negative Sample Generation for Hyperedge Prediction [55.9298019975967]
Hyperedge prediction is crucial for understanding complex multi-entity interactions in web-based applications.
Traditional methods often face difficulties in generating high-quality negative samples due to imbalance between positive and negative instances.
We present the scalable and effective negative sample generation for Hyperedge Prediction (SEHP) framework, which utilizes diffusion models to tackle these challenges.
arXiv Detail & Related papers (2024-11-19T09:16:25Z) - R+R:Understanding Hyperparameter Effects in DP-SGD [3.0668784884950235]
DP-SGD is the standard optimization algorithm for privacy-preserving machine learning.
However, it often achieves lower performance than non-private learning approaches.
arXiv Detail & Related papers (2024-11-04T12:56:35Z) - PriorBand: Practical Hyperparameter Optimization in the Age of Deep
Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We show its robustness across a range of DL benchmarks and show its gains under informative expert input and against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - Enhancing Explainability of Hyperparameter Optimization via Bayesian
Algorithm Execution [13.037647287689438]
We study the combination of HPO with interpretable machine learning (IML) methods such as partial dependence plots.
We propose a modified HPO method which efficiently searches for optimum global predictive performance.
Our method returns more reliable explanations of the underlying black-box without a loss of optimization performance.
arXiv Detail & Related papers (2022-06-11T07:12:04Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, and its performance can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - On the influence of over-parameterization in manifold based surrogates
and deep neural operators [0.0]
We show two approaches for constructing accurate and generalizable approximators for complex physico-chemical processes.
We first propose an extension of the m-PCE, constructing a mapping between latent spaces formed by two separate embeddings of input functions and output QoIs.
We demonstrate that the performance of m-PCE and DeepONet is comparable for relatively smooth input-output mappings.
When highly non-smooth dynamics is considered, DeepONet shows higher accuracy.
arXiv Detail & Related papers (2022-03-09T22:27:46Z) - Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z) - Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.