Bayesian Optimization over Bounded Domains with the Beta Product Kernel
- URL: http://arxiv.org/abs/2506.16316v1
- Date: Thu, 19 Jun 2025 13:45:57 GMT
- Title: Bayesian Optimization over Bounded Domains with the Beta Product Kernel
- Authors: Huy Hoang Nguyen, Han Zhou, Matthew B. Blaschko, Aleksei Tiulpin
- Abstract summary: We introduce the Beta kernel, a non-stationary kernel induced by a product of Beta distribution density functions. We show that our kernel consistently outperforms a wide range of kernels, including the well-known Matérn and RBF.
- Score: 15.745978363320463
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian optimization with Gaussian processes (GP) is commonly used to optimize black-box functions. The Matérn and the Radial Basis Function (RBF) covariance functions are used frequently, but they do not make any assumptions about the domain of the function, which may limit their applicability in bounded domains. To address this limitation, we introduce the Beta kernel, a non-stationary kernel induced by a product of Beta distribution density functions. Such a formulation allows our kernel to naturally model functions on bounded domains. We present statistical evidence supporting the hypothesis that the kernel exhibits an exponential eigendecay rate, based on empirical analyses of its spectral properties across different settings. Our experimental results demonstrate the robustness of the Beta kernel in modeling functions with optima located near the faces or vertices of the unit hypercube. The experiments show that our kernel consistently outperforms a wide range of kernels, including the well-known Matérn and RBF, in different problems, including synthetic function optimization and the compression of vision and language models.
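The abstract does not reproduce the kernel's closed form, so the following is a minimal, hedged sketch of the idea rather than the paper's exact method: a positive semi-definite kernel on the unit hypercube built from products of Beta densities, followed by the kind of empirical eigendecay check the abstract describes. The function name `beta_product_kernel`, the `(alpha, beta)` shape grid, and the finite-rank feature-map construction are all illustrative assumptions, not the authors' parameterization.

```python
# Illustrative sketch only: a PSD kernel on (0, 1)^d built from products of
# Beta densities, plus an empirical check of the Gram matrix's eigendecay.
# The (alpha, beta) grid and finite-rank construction are hypothetical
# choices for illustration; the paper's Beta kernel may differ.
import numpy as np
from scipy.stats import beta as beta_dist

def beta_product_kernel(X, Y,
                        shapes=((1.5, 1.5), (2.0, 2.0), (2.0, 5.0),
                                (5.0, 2.0), (3.0, 3.0))):
    """k(x, y) = prod_d sum_j f(x_d; a_j, b_j) * f(y_d; a_j, b_j),
    where f is the Beta pdf. Each per-dimension factor is an explicit
    feature-map kernel (hence PSD), and the elementwise (Schur) product
    across dimensions preserves positive semi-definiteness.
    X: (n, d), Y: (m, d), entries in (0, 1)."""
    n, d = X.shape
    K = np.ones((n, Y.shape[0]))
    for dim in range(d):
        # Feature maps: Beta pdfs evaluated at each point, one per (a, b).
        fx = np.stack([beta_dist.pdf(X[:, dim], a, b) for a, b in shapes], axis=1)
        fy = np.stack([beta_dist.pdf(Y[:, dim], a, b) for a, b in shapes], axis=1)
        K *= fx @ fy.T  # per-dimension kernel, multiplied across dimensions
    return K

rng = np.random.default_rng(0)
X = rng.uniform(0.05, 0.95, size=(200, 2))  # stay inside (0, 1)^2
K = beta_product_kernel(X, X)

# Empirical eigendecay: eigenvalues of the Gram matrix, largest first.
# An exponential decay rate shows up as a roughly linear trend of
# log(lambda_i) against i.
eigvals = np.linalg.eigvalsh(K)[::-1]
top = np.maximum(eigvals[:15], 1e-12)  # clamp tiny/negative numerical noise
slope = np.polyfit(np.arange(len(top)), np.log(top), 1)[0]
print("estimated log-eigenvalue slope:", slope)
```

A log-linear trend in the leading Gram-matrix eigenvalues is the kind of empirical evidence for an exponential eigendecay rate that the abstract refers to; note that this finite-rank sketch trivially has fast decay, so it only illustrates the measurement procedure, not the paper's result.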
Related papers
- Feature maps for the Laplacian kernel and its generalizations [3.671202973761375]
Unlike the Gaussian kernel, the Laplacian kernel is not separable.
We provide random features for the Laplacian kernel and its two generalizations.
We demonstrate the efficacy of these random feature maps on real datasets.
arXiv Detail & Related papers (2025-02-21T16:36:20Z)

- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)

- Experimental Design for Linear Functionals in Reproducing Kernel Hilbert Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z)

- Random Gegenbauer Features for Scalable Kernel Methods [11.370390549286757]
We propose efficient random features for approximating a new and rich class of kernel functions that we refer to as Generalized Zonal Kernels (GZK).
Our proposed GZK family generalizes the zonal kernels by introducing factors in their Gegenbauer series expansion.
We show that our proposed features outperform recent kernel approximation methods.
arXiv Detail & Related papers (2022-02-07T19:30:36Z)

- Revisiting Memory Efficient Kernel Approximation: An Indefinite Learning Perspective [0.8594140167290097]
Matrix approximations are a key element in large-scale machine learning approaches.
We extend MEKA to be applicable not only for shift-invariant kernels but also for non-stationary kernels.
We present a Lanczos-based estimation of a spectrum shift to develop a stable positive semi-definite MEKA approximation.
arXiv Detail & Related papers (2021-12-18T10:01:34Z)

- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)

- A Robust Asymmetric Kernel Function for Bayesian Optimization, with Application to Image Defect Detection in Manufacturing Systems [2.4278445972594525]
We propose a robust kernel function, the Asymmetric Elastic Net Radial Basis Function (AEN-RBF).
We show theoretically that AEN-RBF can achieve a smaller mean squared prediction error under mild conditions.
We also show that the AEN-RBF kernel function is less sensitive to outliers.
arXiv Detail & Related papers (2021-09-22T17:59:05Z)

- A Mean-Field Theory for Learning the Schönberg Measure of Radial Basis Functions [13.503048325896174]
We learn the distribution in the Schönberg integral representation of the radial basis functions from training samples.
We prove that in the scaling limits, the empirical measure of the Langevin particles converges to the law of a reflected Itô diffusion-drift process.
arXiv Detail & Related papers (2020-06-23T21:04:48Z)

- The Convergence Indicator: Improved and completely characterized parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to calculate whether the particles will finally converge to a single point or diverge.
Using this convergence indicator, we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv Detail & Related papers (2020-06-06T19:08:05Z)

- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)

- Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.