Informed Initialization for Bayesian Optimization and Active Learning
- URL: http://arxiv.org/abs/2510.23681v1
- Date: Mon, 27 Oct 2025 15:05:12 GMT
- Title: Informed Initialization for Bayesian Optimization and Active Learning
- Authors: Carl Hvarfner, David Eriksson, Eytan Bakshy, Max Balandat
- Abstract summary: We propose a novel acquisition strategy that balances predictive uncertainty reduction with hyperparameter learning using information-theoretic principles. We demonstrate its effectiveness through extensive experiments in active learning and few-shot BO.
- Score: 13.105080815344174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian Optimization is a widely used method for optimizing expensive black-box functions, relying on probabilistic surrogate models such as Gaussian Processes. The quality of the surrogate model is crucial for good optimization performance, especially in the few-shot setting where only a small number of batches of points can be evaluated. In this setting, the initialization plays a critical role in shaping the surrogate's predictive quality and guiding subsequent optimization. Despite this, practitioners typically rely on (quasi-)random designs to cover the input space. However, such approaches neglect two key factors: (a) space-filling designs may not be desirable to reduce predictive uncertainty, and (b) efficient hyperparameter learning during initialization is essential for high-quality prediction, which may conflict with space-filling designs. To address these limitations, we propose Hyperparameter-Informed Predictive Exploration (HIPE), a novel acquisition strategy that balances predictive uncertainty reduction with hyperparameter learning using information-theoretic principles. We derive a closed-form expression for HIPE in the Gaussian Process setting and demonstrate its effectiveness through extensive experiments in active learning and few-shot BO. Our results show that HIPE outperforms standard initialization strategies in terms of predictive accuracy, hyperparameter identification, and subsequent optimization performance, particularly in large-batch, few-shot settings relevant to many real-world Bayesian Optimization applications.
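As a rough illustration of the idea, and not the closed-form criterion derived in the paper, the sketch below scores a candidate point by combining (a) the average reduction in GP predictive entropy over a reference set with (b) a crude proxy for how much the point would inform the hyperparameters, here measured as disagreement across a small ensemble of hyperparameter settings. All names and the weighting term `beta` are assumptions for illustration.

```python
# Illustration only: a HIPE-like score combining predictive-entropy
# reduction with a crude hyperparameter-informativeness proxy. The paper's
# closed-form acquisition is NOT reproduced here.
import numpy as np

def rbf_kernel(A, B, lengthscale, variance):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def predictive_var(X_train, X_test, lengthscale, variance, noise=1e-3):
    # GP posterior variance; note it does not depend on the observed values.
    K = rbf_kernel(X_train, X_train, lengthscale, variance) + noise * np.eye(len(X_train))
    k_s = rbf_kernel(X_train, X_test, lengthscale, variance)
    k_ss = rbf_kernel(X_test, X_test, lengthscale, variance)
    return np.diag(k_ss - k_s.T @ np.linalg.solve(K, k_s)) + noise

def hipe_like_score(X_train, x_cand, X_ref, hyper_samples, beta=1.0):
    """x_cand: (1, d) candidate; hyper_samples: list of (lengthscale, variance)."""
    ent_reduction, cand_vars = [], []
    for ls, var in hyper_samples:
        v_before = predictive_var(X_train, X_ref, ls, var)
        v_after = predictive_var(np.vstack([X_train, x_cand]), X_ref, ls, var)
        # Gaussian entropy reduction on the reference set: 0.5 * log ratio
        ent_reduction.append(0.5 * np.log(v_before / v_after).mean())
        cand_vars.append(predictive_var(X_train, x_cand, ls, var)[0])
    # Disagreement across hyperparameter settings as a hyper-learning proxy
    return np.mean(ent_reduction) + beta * np.var(cand_vars)
```

A batch can then be chosen greedily by evaluating `hipe_like_score` over a candidate pool; the actual HIPE acquisition uses an information-theoretic closed form rather than this ensemble proxy.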
Related papers
- Goal-Oriented Influence-Maximizing Data Acquisition for Learning and Optimization [28.53710231018475]
We propose an active acquisition algorithm that avoids explicit posterior inference while remaining uncertainty-aware through inverse curvature. GOIMDA selects inputs by maximizing their expected influence on a user-specified goal functional. We show theoretically that, for generalized linear models, GOIMDA approximates predictive-entropy minimization up to a correction term accounting for goal alignment and prediction bias.
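The abstract does not spell out the algorithm; the following is a hypothetical sketch of an influence-function-style score for a logistic model, where "inverse curvature" enters through the inverse Hessian of the training loss. The goal functional, model, and all names are assumptions, not GOIMDA itself.

```python
# Hypothetical sketch, NOT the paper's GOIMDA: influence-style scoring for
# logistic regression, with uncertainty entering via the inverse Hessian
# ("inverse curvature") of the training loss. goal_grad is the gradient of
# a user-specified goal functional at the current parameters.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def influence_scores(X_pool, X_train, theta, goal_grad):
    p = sigmoid(X_train @ theta)
    H = (X_train * (p * (1 - p))[:, None]).T @ X_train + 1e-6 * np.eye(len(theta))
    h_inv_g = np.linalg.solve(H, goal_grad)          # curvature-corrected direction
    p_pool = sigmoid(X_pool @ theta)
    grad_pos = X_pool * (p_pool - 1.0)[:, None]      # per-point loss gradient if y = 1
    grad_neg = X_pool * p_pool[:, None]              # per-point loss gradient if y = 0
    # Expected |influence on the goal| over the unknown label
    return p_pool * np.abs(grad_pos @ h_inv_g) + (1 - p_pool) * np.abs(grad_neg @ h_inv_g)
```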
arXiv Detail & Related papers (2026-02-23T07:57:11Z) - HyperQ-Opt: Q-learning for Hyperparameter Optimization [0.0]
This paper presents a novel perspective on HPO by formulating it as a sequential decision-making problem and leveraging Q-learning, a reinforcement learning technique. The approaches are evaluated for their ability to find optimal or near-optimal configurations within a limited number of trials. By shifting the paradigm toward policy-based optimization, this work contributes to advancing HPO methods for scalable and efficient machine learning applications.
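As a hedged illustration of framing HPO as a decision problem (the paper's state/action design is not reproduced here), the snippet below runs epsilon-greedy Q-learning over a small discrete configuration grid, collapsed to a single state, i.e. a bandit simplification; `evaluate` stands in for an actual training run.

```python
# Illustration only: epsilon-greedy Q-learning over a discrete hyperparameter
# grid, collapsed to a single state (a bandit simplification of Q-learning).
import random

actions = [(lr, bs) for lr in (1e-3, 1e-2, 1e-1) for bs in (16, 64, 256)]
Q = {a: 0.0 for a in actions}

def evaluate(lr, bs):
    # Stand-in for a real training run returning a validation score.
    return -((lr - 1e-2) ** 2) - ((bs - 64) ** 2) / 1e4

epsilon, alpha = 0.2, 0.5
for _ in range(30):                                # limited trial budget
    a = random.choice(actions) if random.random() < epsilon else max(Q, key=Q.get)
    Q[a] += alpha * (evaluate(*a) - Q[a])          # single-state Q-update

best_config = max(Q, key=Q.get)
```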
arXiv Detail & Related papers (2024-12-23T18:22:34Z) - Conditional Diffusion-based Parameter Generation for Quantum Approximate Optimization Algorithm [7.48670688063184]
The Quantum Approximate Optimization Algorithm (QAOA) is a hybrid algorithm that shows promise in efficiently solving the MaxCut problem. A generative learning model, specifically the denoising diffusion probabilistic model (DDPM), is used to learn the distribution of parameters on the graph dataset.
arXiv Detail & Related papers (2024-07-17T01:18:27Z) - End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
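For reference, the OWA operator itself is standard: OWA_w(y) = sum_i w_i * y_(i), where y_(i) denotes the components of y sorted in decreasing order. A minimal illustration follows; the weights here are arbitrary.

```python
# The standard OWA operator: weights are applied to the *sorted* components,
# so larger weight on the smallest entries yields fairness-oriented,
# nondifferentiable objectives.
import numpy as np

def owa(y, w):
    return float(np.dot(w, np.sort(y)[::-1]))   # sort in decreasing order

utilities = np.array([3.0, 1.0, 4.0])
weights = np.array([0.1, 0.3, 0.6])             # emphasizes the worst-off component
owa(utilities, weights)                          # 0.1*4 + 0.3*3 + 0.6*1 = 1.9
```

Because of the sort, the objective is piecewise linear and nondifferentiable at ties, which is what motivates the paper's end-to-end integration with parametric prediction.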
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - Pseudo-Bayesian Optimization [7.556071491014536]
We study an axiomatic framework that elicits the minimal requirements to guarantee black-box optimization convergence. We show that using simple local regression, and a suitable "randomized prior" construction to quantify uncertainty, not only guarantees convergence but also consistently outperforms state-of-the-art benchmarks.
arXiv Detail & Related papers (2023-10-15T07:55:28Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
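A Monte Carlo form of EPIG for classification with a parameter ensemble is sketched below: EPIG(x) is the expectation, over target inputs x*, of the mutual information between the predictions y and y*. This follows the published definition but is not the authors' reference implementation; array names are assumptions.

```python
# Monte Carlo EPIG for classification with a K-member parameter ensemble:
# EPIG(x) = E_{x*}[ MI(y; y* | x, x*) ].
import numpy as np

def epig(probs_x, probs_xstar, eps=1e-12):
    """probs_x: (K, C) predictions at candidate x;
    probs_xstar: (K, M, C) predictions at M sampled target inputs."""
    K = probs_x.shape[0]
    # Joint predictive p(y, y* | x, x*), marginalizing the ensemble: (M, C, C)
    joint = np.einsum('kc,kmd->mcd', probs_x, probs_xstar) / K
    marg_y, marg_ystar = joint.sum(axis=2), joint.sum(axis=1)
    indep = marg_y[:, :, None] * marg_ystar[:, None, :]
    mi = (joint * (np.log(joint + eps) - np.log(indep + eps))).sum(axis=(1, 2))
    return mi.mean()   # average over sampled target inputs
```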
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
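The underlying quantity is the Bayes-risk generalization H_l(p) = inf_a E_{theta~p}[l(theta, a)]: the minimum achievable expected loss under the posterior. A small Monte Carlo illustration (the loss and action grid are assumptions) shows that squared-error loss recovers the posterior variance.

```python
# Bayes-risk "entropy": H_l(p) = inf_a E_{theta~p}[ l(theta, a) ].
import numpy as np

def decision_entropy(posterior_samples, actions, loss):
    # Minimum expected loss over a discrete action set (Bayes risk).
    return min(np.mean(loss(posterior_samples, a)) for a in actions)

samples = np.random.normal(2.0, 0.5, size=2000)          # posterior draws
acts = np.linspace(0.0, 4.0, 81)
h_sq = decision_entropy(samples, acts, lambda t, a: (t - a) ** 2)
# h_sq ~= 0.25 (the posterior variance), attained near a = posterior mean
```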
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
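For orientation, a minimal version of standard AIS (not the paper's parametric variant) is sketched below: prior samples are annealed toward the posterior through geometric bridging distributions while importance weights accumulate, yielding a marginal-likelihood estimate. The toy Gaussian model and schedule are assumptions.

```python
# Minimal standard AIS: anneal prior samples through geometric bridges
# prior(x) * like(x)**beta and accumulate importance weights.
import numpy as np

rng = np.random.default_rng(0)
log_prior = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)   # N(0, 1)
log_like = lambda x: -0.5 * ((x - 3.0) / 0.5) ** 2            # unnormalized likelihood

betas = np.linspace(0.0, 1.0, 50)     # annealing schedule (bridging distributions)
n, step = 500, 0.5
x = rng.standard_normal(n)            # exact prior samples
log_w = np.zeros(n)

for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += (b - b_prev) * log_like(x)                  # AIS weight increment
    prop = x + step * rng.standard_normal(n)             # Metropolis move targeting
    log_acc = (log_prior(prop) + b * log_like(prop) -    # the current bridge
               log_prior(x) - b * log_like(x))
    x = np.where(np.log(rng.random(n)) < log_acc, prop, x)

log_marginal = np.log(np.mean(np.exp(log_w)))            # marginal-likelihood estimate
```

The paper's contribution is to parameterize and optimize these bridging distributions so that far fewer annealing steps are needed.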
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z) - Multi-Objective Hyperparameter Optimization in Machine Learning -- An Overview [10.081056751778712]
We introduce the basics of multi-objective hyperparameter optimization and motivate its usefulness in applied ML.
We provide an extensive survey of existing optimization strategies, both from the domain of evolutionary algorithms and Bayesian optimization.
We illustrate the utility of MOO in several specific ML applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability and robustness.
arXiv Detail & Related papers (2022-06-15T10:23:19Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms and can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - Real-Time Optimization Meets Bayesian Optimization and Derivative-Free Optimization: A Tale of Modifier Adaptation [0.0]
This paper investigates a new class of modifier-adaptation schemes to overcome plant-model mismatch in real-time optimization of uncertain processes.
The proposed schemes embed a physical model and rely on trust-region ideas to minimize risk during the exploration.
The benefits of using an acquisition function, knowing the process noise level, or specifying a nominal process model are illustrated.
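As context, a textbook modifier-adaptation step (not the paper's trust-region/Bayesian variant) corrects the model's cost with zeroth- and first-order modifiers measured at the current operating point and then re-optimizes the corrected model; in practice the plant gradient must be estimated from noisy measurements. SciPy availability and all callables below are assumptions.

```python
# Textbook modifier-adaptation step: bias- and gradient-correct the model
# cost at the current operating point u_k, then re-optimize.
import numpy as np
from scipy.optimize import minimize

def modifier_adaptation_step(u_k, plant_cost, plant_grad, model_cost, model_grad, bounds):
    eps = plant_cost(u_k) - model_cost(u_k)      # zeroth-order (bias) modifier
    lam = plant_grad(u_k) - model_grad(u_k)      # first-order (gradient) modifier
    corrected = lambda u: model_cost(u) + eps + lam @ (np.asarray(u) - u_k)
    return minimize(corrected, u_k, bounds=bounds).x   # next operating point
```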
arXiv Detail & Related papers (2020-09-18T12:57:17Z)