Bayesian Kernelized Tensor Factorization as Surrogate for Bayesian
Optimization
- URL: http://arxiv.org/abs/2302.14510v2
- Date: Fri, 26 May 2023 15:33:50 GMT
- Title: Bayesian Kernelized Tensor Factorization as Surrogate for Bayesian
Optimization
- Authors: Mengying Lei and Lijun Sun
- Abstract summary: Bayesian optimization (BO) primarily uses Gaussian processes (GP) as the key surrogate model.
In this paper, we propose to use Bayesian Kernelized Tensor Factorization (BKTF) as a new surrogate model for BO in a $D$-dimensional Cartesian product space.
BKTF offers a flexible and highly effective approach for characterizing complex functions with uncertainty quantification.
- Score: 13.896697187967545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian optimization (BO) primarily uses Gaussian processes (GP) as the key
surrogate model, mostly with a simple stationary and separable kernel function
such as the squared-exponential kernel with automatic relevance determination
(SE-ARD). However, such simple kernel specifications are deficient in learning
functions with complex features, such as being nonstationary, nonseparable, and
multimodal. Approximating such functions using a local GP, even in a
low-dimensional space, requires a large number of samples, not to mention in a
high-dimensional setting. In this paper, we propose to use Bayesian Kernelized
Tensor Factorization (BKTF) -- as a new surrogate model -- for BO in a
$D$-dimensional Cartesian product space. Our key idea is to approximate the
underlying $D$-dimensional solid with a fully Bayesian low-rank tensor CP
decomposition, in which we place GP priors on the latent basis functions for
each dimension to encode local consistency and smoothness. With this
formulation, information from each sample can be shared not only with neighbors
but also across dimensions. Although BKTF no longer has an analytical
posterior, we can still efficiently approximate the posterior distribution
through Markov chain Monte Carlo (MCMC) and obtain prediction and full
uncertainty quantification (UQ). We conduct numerical experiments on both
standard BO test functions and machine learning hyperparameter tuning problems,
and our results show that BKTF offers a flexible and highly effective approach
for characterizing complex functions with UQ, especially in cases where the
initial sample size and budget are severely limited.
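The following is a minimal, illustrative sketch (not the authors' implementation) of the core modeling idea described above: approximate a function on a $D$-dimensional Cartesian product grid with a rank-$R$ CP decomposition whose per-dimension latent basis functions carry GP priors. The grid sizes, kernel settings, and rank are hypothetical choices made only for the example.

```python
# Illustrative sketch of a BKTF-style prior: rank-R CP decomposition with
# GP priors (SE kernel) on the latent basis functions of each dimension.
# All concrete numbers (D, R, grid sizes, lengthscale) are assumptions.
import numpy as np

def se_kernel(x, lengthscale=0.2, variance=1.0, jitter=1e-6):
    """Squared-exponential kernel matrix on a 1-D grid of inputs x."""
    d = x[:, None] - x[None, :]
    K = variance * np.exp(-0.5 * (d / lengthscale) ** 2)
    return K + jitter * np.eye(len(x))

rng = np.random.default_rng(0)
D, R = 3, 4                                            # dimensions and CP rank
grids = [np.linspace(0.0, 1.0, n) for n in (15, 20, 25)]  # per-dimension grids

# One draw of the latent basis functions from their GP priors:
# U[d] has shape (n_d, R); column r is the r-th basis function for dimension d.
U = [rng.multivariate_normal(np.zeros(len(g)), se_kernel(g), size=R).T
     for g in grids]

def cp_reconstruct(factors):
    """CP reconstruction: f[i1,...,iD] = sum_r prod_d factors[d][i_d, r]."""
    tensor = np.zeros([f.shape[0] for f in factors])
    for r in range(factors[0].shape[1]):
        outer = factors[0][:, r]
        for f in factors[1:]:
            outer = np.multiply.outer(outer, f[:, r])
        tensor += outer
    return tensor

f_sample = cp_reconstruct(U)   # one prior draw of the surrogate over the grid
print(f_sample.shape)          # (15, 20, 25)
```

In the full method, these latent factors would be updated by MCMC conditioned on the observed evaluations, and repeated posterior draws of the reconstructed tensor would provide both predictions and uncertainty quantification; the sketch only shows the prior structure.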
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - One-Dimensional Deep Image Prior for Curve Fitting of S-Parameters from
Electromagnetic Solvers [57.441926088870325]
Deep Image Prior (DIP) is a technique that optimizes the weights of a randomly-initialized convolutional neural network to fit a signal from noisy or under-determined measurements.
Relative to publicly available implementations of Vector Fitting (VF), our method shows superior performance on nearly all test examples.
arXiv Detail & Related papers (2023-06-06T20:28:37Z) - A Study of Bayesian Neural Network Surrogates for Bayesian Optimization [46.97686790714025]
Bayesian neural networks (BNNs) have recently become practical function approximators.
We study BNNs as alternatives to standard GP surrogates for optimization.
arXiv Detail & Related papers (2023-05-31T17:00:00Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
arXiv Detail & Related papers (2022-10-13T09:01:58Z) - Shallow and Deep Nonparametric Convolutions for Gaussian Processes [0.0]
We introduce a nonparametric process convolution formulation for GPs that alleviates the weaknesses of existing process convolution approaches by using a functional sampling approach.
We propose a composition of these nonparametric convolutions that serves as an alternative to classic deep GP models.
arXiv Detail & Related papers (2022-06-17T19:03:04Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian
process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Trusted-Maximizers Entropy Search for Efficient Bayesian Optimization [39.824086260578646]
This paper presents a novel trusted-maximizers entropy search (TES) acquisition function.
It measures how much an input contributes to the information gain on a query over a finite set of trusted maximizers.
arXiv Detail & Related papers (2021-07-30T07:25:07Z) - Misspecification-robust likelihood-free inference in high dimensions [13.934999364767918]
We introduce an extension of the popular Bayesian optimisation based approach to approximate discrepancy functions in a probabilistic manner.
Our approach achieves computational scalability for higher dimensional parameter spaces by using separate acquisition functions and discrepancies for each parameter.
The method successfully performs computationally efficient inference in a 100-dimensional space on canonical examples and compares favourably to existing modularised ABC methods.
arXiv Detail & Related papers (2020-02-21T16:06:11Z)
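Several of the entries above, like the BKTF abstract itself, rely on posterior samples of a surrogate to decide where to evaluate next. As a hedged illustration (not taken from any of the listed papers), the sketch below shows how a batch of MCMC posterior draws over a candidate grid could be turned into a simple sample-based acquisition; the function and array names are hypothetical, and the acquisition choice (Monte-Carlo expected improvement) is only one possibility.

```python
# Hedged sketch: turning S posterior draws of a surrogate at candidate points
# into a next-query choice via Monte-Carlo expected improvement (maximization).
import numpy as np

def propose_next(posterior_draws, observed_best):
    """posterior_draws: array of shape (S, n_candidates), one row per MCMC
    draw of the surrogate at the candidate points; observed_best: best
    objective value seen so far. Returns the index of the next candidate."""
    improvement = np.maximum(posterior_draws - observed_best, 0.0)
    ei = improvement.mean(axis=0)        # average improvement over draws
    return int(np.argmax(ei))

# Toy usage with random stand-in draws (no real surrogate behind them).
rng = np.random.default_rng(1)
draws = rng.normal(size=(200, 50))       # S=200 draws, 50 candidate points
next_idx = propose_next(draws, observed_best=1.5)
print(next_idx)
```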
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.