Understanding High-Dimensional Bayesian Optimization
- URL: http://arxiv.org/abs/2502.09198v1
- Date: Thu, 13 Feb 2025 11:37:55 GMT
- Title: Understanding High-Dimensional Bayesian Optimization
- Authors: Leonard Papenmeier, Matthias Poloczek, Luigi Nardi
- Abstract summary: Recent work reported that simple Bayesian optimization methods perform well for high-dimensional real-world tasks. We identify fundamental challenges that arise in high-dimensional Bayesian optimization and explain why recent methods succeed. We propose a simple variant of maximum likelihood estimation called MSR that leverages these findings to achieve state-of-the-art performance.
- Score: 8.07879230384311
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work reported that simple Bayesian optimization methods perform well for high-dimensional real-world tasks, seemingly contradicting prior work and tribal knowledge. This paper investigates the 'why'. We identify fundamental challenges that arise in high-dimensional Bayesian optimization and explain why recent methods succeed. Our analysis shows that vanishing gradients caused by Gaussian process initialization schemes play a major role in the failures of high-dimensional Bayesian optimization and that methods that promote local search behaviors are better suited for the task. We find that maximum likelihood estimation of Gaussian process length scales suffices for state-of-the-art performance. Based on this, we propose a simple variant of maximum likelihood estimation called MSR that leverages these findings to achieve state-of-the-art performance on a comprehensive set of real-world applications. We also present targeted experiments to illustrate and confirm our findings.
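As an illustration of the role that length-scale estimation plays here, the following is a minimal sketch of fitting anisotropic (ARD) RBF length scales for a GP surrogate by maximum likelihood with NumPy/SciPy. It is an illustrative reconstruction under stated assumptions (kernel choice, fixed noise level, zero initialization of the log length scales), not the authors' MSR implementation.

```python
# Minimal sketch: maximum likelihood estimation of ARD length scales for a GP
# surrogate of the kind used in high-dimensional BO. Illustrative only; the
# noise level, kernel, and initialization are assumptions, not MSR itself.
import numpy as np
from scipy.optimize import minimize

def ard_rbf(X1, X2, lengthscales):
    # Squared-exponential kernel with one length scale per input dimension.
    d = (X1[:, None, :] - X2[None, :, :]) / lengthscales
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1))

def neg_log_marginal_likelihood(log_ls, X, y, noise=1e-4):
    ls = np.exp(log_ls)
    K = ard_rbf(X, X, ls) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # Constant term n/2 * log(2*pi) omitted; it does not affect the optimum.
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L)))

def fit_lengthscales(X, y):
    d = X.shape[1]
    res = minimize(neg_log_marginal_likelihood, x0=np.zeros(d),
                   args=(X, y), method="L-BFGS-B")
    return np.exp(res.x)

# Usage: X is an (n, d) array of evaluated points, y the objective values.
rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 10))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(20)
print(fit_lengthscales(X, y))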
Related papers
- BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL [15.127370150885348]
We present a generalized deep Q-learning framework and propose BOFormer, which substantiates this framework for MOBO via sequence modeling. Through extensive evaluation, we demonstrate that BOFormer consistently outperforms the benchmark rule-based and learning-based algorithms.
arXiv Detail & Related papers (2025-05-28T05:00:50Z) - Gradient-based Sample Selection for Faster Bayesian Optimization [11.242721310713963]
In large-budget scenarios, directly employing the standard GP model faces significant challenges in computational time and resource requirements. We propose a novel approach, gradient-based sample selection Bayesian Optimization (GSSBO), to enhance the computational efficiency of BO. Our approach significantly reduces the computational cost of GP fitting in BO while maintaining optimization performance comparable to baseline methods.
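A toy sketch of the general idea of subsampling the GP training set before fitting follows. The scoring rule below (local variation as a crude gradient proxy) is a placeholder for illustration and is not GSSBO's actual selection criterion.

```python
# Toy sketch: keep only the k most "informative" points before fitting the GP.
# The nearest-neighbor variation score is an illustrative stand-in, not GSSBO.
import numpy as np

def select_subset(X, y, k):
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    nn = np.argmin(dists, axis=1)                       # nearest neighbor of each point
    scores = np.abs(y - y[nn]) / (dists[np.arange(len(X)), nn] + 1e-12)
    keep = np.argsort(-scores)[:k]                      # highest local variation first
    return X[keep], y[keep]
```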
arXiv Detail & Related papers (2025-04-10T13:38:15Z) - High Dimensional Bayesian Optimization using Lasso Variable Selection [9.051539805042651]
We introduce a novel method that identifies important variables by estimating the length scales of Gaussian process kernels.
We demonstrate that our proposed method achieves cumulative regret with a sublinear growth rate in the worst case.
Experiments on high-dimensional synthetic functions and real-world problems show that our method achieves state-of-the-art performance.
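A minimal sketch of the generic idea of ranking input variables by inverse fitted length scale follows; the normalization and fixed threshold are assumptions for illustration, not the paper's Lasso-based procedure.

```python
# Minimal sketch: dimensions with small ARD length scales vary quickly and are
# treated as relevant; very large length scales effectively flatten a dimension
# out of the model. The threshold is an illustrative assumption.
import numpy as np

def important_dims(lengthscales, threshold=0.2):
    relevance = 1.0 / np.asarray(lengthscales)
    relevance = relevance / relevance.max()
    return np.where(relevance > threshold)[0]

# Usage: pass length scales fitted by maximum likelihood, e.g. from the sketch
# shown after the main abstract above.
print(important_dims([0.3, 8.0, 0.5, 12.0]))   # -> array([0, 2])
```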
arXiv Detail & Related papers (2025-04-02T13:54:04Z) - Near-optimal Active Reconstruction [3.037563407333583]
We design an algorithm for the Next Best View (NBV) problem in the context of active object reconstruction.
We rigorously derive sublinear bounds for the cumulative regret of our algorithm, which guarantees near-optimality.
We evaluate the performance of our algorithm empirically within our simulation framework.
arXiv Detail & Related papers (2025-03-24T09:17:53Z) - Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
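For reference, the BPR objective maximizes log sigmoid(score(u, i) - score(u, j)) for a user u, an observed item i, and an unobserved item j. The sketch below shows one SGD step on matrix-factorization scores; the factor layout, learning rate, and regularization are illustrative choices, not any specific implementation studied in the paper.

```python
# Minimal NumPy sketch of one BPR SGD step on user/item factor matrices U, V.
import numpy as np

def bpr_sgd_step(U, V, u, i, j, lr=0.05, reg=0.01):
    u_f, i_f, j_f = U[u].copy(), V[i].copy(), V[j].copy()
    x_uij = u_f @ (i_f - j_f)
    g = 1.0 / (1.0 + np.exp(x_uij))      # sigmoid(-x), from d/dx of -log sigmoid(x)
    U[u] += lr * (g * (i_f - j_f) - reg * u_f)
    V[i] += lr * (g * u_f - reg * i_f)
    V[j] += lr * (-g * u_f - reg * j_f)
```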
arXiv Detail & Related papers (2024-09-21T18:39:53Z) - See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition [56.87609859444084]
Parameter-efficient fine-tuning (PEFT) focuses on optimizing a select subset of parameters while keeping the rest fixed, significantly lowering computational and storage overheads. We take the first step to unify all approaches by dissecting them from a decomposition perspective. We introduce two novel PEFT methods alongside a simple yet effective framework designed to enhance the performance of PEFT techniques across various applications.
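As a generic illustration of the decomposition perspective (not the paper's two new methods), the sketch below shows a LoRA-style low-rank adapter: a frozen weight is adapted by a trainable rank-r product.

```python
# Illustrative LoRA-style decomposition: W is frozen, only A and B are trained.
import numpy as np

class LowRankAdapter:
    def __init__(self, W, r=4, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                            # frozen weight (out, in)
        self.A = rng.standard_normal((r, W.shape[1])) * 0.01  # trainable down-projection
        self.B = np.zeros((W.shape[0], r))                    # trainable up-projection, init 0
        self.alpha = alpha

    def forward(self, x):
        return self.W @ x + self.alpha * (self.B @ (self.A @ x))
```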
arXiv Detail & Related papers (2024-07-07T15:44:42Z) - Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
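A minimal sketch of blending a logistic and an exponential pairwise preference loss on a margin (e.g. a scaled log-ratio of chosen versus rejected responses) is shown below. The fixed mixing weight is an illustrative placeholder; the discovered DiscoPOP loss blends the two adaptively rather than with a constant weight.

```python
# Minimal sketch: fixed-weight blend of logistic (DPO-style) and exponential
# pairwise preference losses on a margin m. Illustrative only.
import numpy as np

def blended_preference_loss(margin, tau=0.5):
    logistic = np.log1p(np.exp(-margin))   # logistic / sigmoid loss
    exponential = np.exp(-margin)          # exponential (boosting-style) loss
    return tau * logistic + (1.0 - tau) * exponential
```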
arXiv Detail & Related papers (2024-06-12T16:58:41Z) - Vanilla Bayesian Optimization Performs Great in High Dimensions [5.7574684256411786]
High-dimensional problems have long been considered the Achilles' heel of Bayesian optimization algorithms. We show how existing algorithms address these degeneracies through the lens of lowering the model complexity.
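One way to lower model complexity with dimension is to make the GP length-scale prior dimension-dependent, so that expected length scales grow with the input dimension. A minimal sketch of such a prior (log-normal with a location growing like log of the square root of the dimension) follows; the constants are assumptions, not the paper's exact settings.

```python
# Minimal sketch of a dimension-dependent log-normal length-scale prior.
import numpy as np

def sample_lengthscale_prior(d, n_samples=1, mu0=0.0, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    loc = mu0 + 0.5 * np.log(d)   # log-location scaling like log(sqrt(d))
    return rng.lognormal(mean=loc, sigma=sigma, size=(n_samples, d))
```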
arXiv Detail & Related papers (2024-02-03T18:19:46Z) - Domain Knowledge Injection in Bayesian Search for New Materials [0.0]
We propose DKIBO, a Bayesian optimization (BO) algorithm that accommodates domain knowledge to tune exploration in the search space.
We empirically demonstrate the practical utility of the proposed method by successfully injecting domain knowledge in a materials design task.
arXiv Detail & Related papers (2023-11-26T01:55:55Z) - Provably Efficient Bayesian Optimization with Unknown Gaussian Process Hyperparameter Estimation [44.53678257757108]
We propose a new BO method that can sub-linearly converge to the objective function's global optimum.
Our method uses a multi-armed bandit technique (EXP3) to add random data points to the BO process.
We demonstrate empirically that our method outperforms existing approaches on various synthetic and real-world problems.
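The EXP3 update referenced in the abstract is sketched below; the two-arm setup (e.g. "query the acquisition function" vs. "add a random point") and the reward definition are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch of the EXP3 adversarial bandit update (rewards assumed in [0, 1]).
import numpy as np

class EXP3:
    def __init__(self, n_arms=2, gamma=0.1, seed=0):
        self.w = np.ones(n_arms)
        self.gamma = gamma
        self.rng = np.random.default_rng(seed)

    def probs(self):
        return (1 - self.gamma) * self.w / self.w.sum() + self.gamma / len(self.w)

    def pull(self):
        return self.rng.choice(len(self.w), p=self.probs())

    def update(self, arm, reward):
        p = self.probs()[arm]
        self.w[arm] *= np.exp(self.gamma * reward / (p * len(self.w)))
```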
arXiv Detail & Related papers (2023-06-12T03:35:45Z) - Scalable Bayesian Meta-Learning through Generalized Implicit Gradients [64.21628447579772]
The implicit Bayesian meta-learning (iBaML) method not only broadens the scope of learnable priors but also quantifies the associated uncertainty.
Analytical error bounds are established to demonstrate the precision and efficiency of the generalized implicit gradient over the explicit one.
arXiv Detail & Related papers (2023-03-31T02:10:30Z) - A Particle-based Sparse Gaussian Process Optimizer [5.672919245950197]
We present a new swarm-based framework utilizing the underlying dynamical process of gradient descent.
The biggest advantage of this approach is greater exploration around the current state before deciding the descent direction.
arXiv Detail & Related papers (2022-11-26T09:06:15Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to a stationary point yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - High-Dimensional Bayesian Optimisation with Variational Autoencoders and Deep Metric Learning [119.91679702854499]
We introduce a method based on deep metric learning to perform Bayesian optimisation over high-dimensional, structured input spaces.
We achieve such an inductive bias using just 1% of the available labelled data.
As an empirical contribution, we present state-of-the-art results on real-world high-dimensional black-box optimisation problems.
arXiv Detail & Related papers (2021-06-07T13:35:47Z) - Directed particle swarm optimization with Gaussian-process-based function forecasting [15.733136147164032]
Particle swarm optimization (PSO) is an iterative search method that moves a set of candidate solutions around a search space towards the best known global and local solutions with randomized step lengths.
We show that our algorithm attains desirable properties for exploratory and exploitative behavior.
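The standard PSO update that such methods build on is sketched below; the inertia and acceleration constants are common defaults, not the paper's values, and the GP-based forecasting component is omitted.

```python
# Minimal sketch of one standard PSO velocity/position update.
import numpy as np

def pso_step(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5, rng=None):
    rng = rng or np.random.default_rng()
    r1, r2 = rng.uniform(size=x.shape), rng.uniform(size=x.shape)
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    return x + v_new, v_new
```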
arXiv Detail & Related papers (2021-02-08T13:02:57Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
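A minimal sketch of a coordinate-wise zeroth-order gradient estimator with importance sampling over coordinates follows; the sampling distribution and number of queries are simple placeholders, not the paper's design.

```python
# Minimal sketch: central-difference ZO gradient estimate with coordinates drawn
# from an importance distribution `probs`; the 1/p_i reweighting keeps it unbiased.
import numpy as np

def zo_coordinate_grad(f, x, probs, n_samples=5, mu=1e-4, rng=None):
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    for _ in range(n_samples):
        i = rng.choice(len(x), p=probs)
        e = np.zeros_like(x); e[i] = 1.0
        partial = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)
        g[i] += partial / (probs[i] * n_samples)
    return g
```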
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration [143.43658264904863]
We show how value iteration under a more standard notion of low inherent Bellman error, typically employed in least-squares value iteration-style algorithms, can provide strong PAC guarantees on learning a near-optimal value function.
We present a computationally tractable algorithm for the reward-free setting and show how it can be used to learn a near optimal policy for any (linear) reward function.
arXiv Detail & Related papers (2020-08-18T04:34:21Z) - On the Convergence Rate of Projected Gradient Descent for a Back-Projection based Objective [58.33065918353532]
We consider a back-projection (BP) based fidelity term as an alternative to the common least squares (LS) term.
We show that using the BP term, rather than the LS term, requires fewer iterations of optimization algorithms.
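For intuition, the sketch below contrasts the standard least-squares fidelity term with a back-projection fidelity term that measures the residual after mapping it back through the pseudo-inverse of the measurement operator; it is an illustrative rendering of the two terms, not the paper's code.

```python
# Minimal sketch: LS vs. back-projection (BP) fidelity terms for y ~ A x.
import numpy as np

def ls_fidelity(A, x, y):
    return 0.5 * np.sum((A @ x - y) ** 2)

def bp_fidelity(A, x, y):
    A_pinv = np.linalg.pinv(A)
    return 0.5 * np.sum((A_pinv @ (A @ x - y)) ** 2)
```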
arXiv Detail & Related papers (2020-05-03T00:58:23Z)