Marginalising over Stationary Kernels with Bayesian Quadrature
- URL: http://arxiv.org/abs/2106.07452v1
- Date: Mon, 14 Jun 2021 14:23:34 GMT
- Title: Marginalising over Stationary Kernels with Bayesian Quadrature
- Authors: Saad Hamid, Sebastian Schulze, Michael A. Osborne, Stephen J. Roberts
- Abstract summary: Marginalising over families of Gaussian Process kernels produces flexible model classes with well-calibrated uncertainty estimates.
We propose a Bayesian Quadrature scheme to make this marginalisation more efficient and thereby more practical.
- Score: 36.456528055624766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Marginalising over families of Gaussian Process kernels produces flexible
model classes with well-calibrated uncertainty estimates. Existing approaches
require likelihood evaluations of many kernels, rendering them prohibitively
expensive for larger datasets. We propose a Bayesian Quadrature scheme to make
this marginalisation more efficient and thereby more practical. Through use of
the maximum mean discrepancies between distributions, we define a kernel over
kernels that captures invariances between Spectral Mixture (SM) Kernels. Kernel
samples are selected by generalising an information-theoretic acquisition
function for warped Bayesian Quadrature. We show that our framework achieves
more accurate predictions with better calibrated uncertainty than
state-of-the-art baselines, especially when given limited (wall-clock) time
budgets.
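A minimal sketch of the central idea described in the abstract, under illustrative assumptions (the spectral-mixture parameterisation, RBF base kernel, lengthscale, and sample sizes below are not the authors' implementation): each Spectral Mixture kernel is represented by samples from its spectral density, the maximum mean discrepancy between two such sample sets is estimated, and an exponentiated squared MMD acts as the kernel over kernels.

```python
import numpy as np

def sample_spectral_density(weights, means, stds, n, rng):
    """Draw frequency samples from a 1-D spectral-mixture density:
    a weighted, symmetrised mixture of Gaussians in frequency space."""
    comps = rng.choice(len(weights), size=n, p=weights / np.sum(weights))
    freqs = rng.normal(means[comps], stds[comps])
    signs = rng.choice([-1.0, 1.0], size=n)   # symmetrise the density about zero
    return signs * freqs

def mmd2(x, y, gamma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples,
    using an RBF kernel on frequencies."""
    def k(a, b):
        return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def kernel_over_kernels(params1, params2, lengthscale=1.0, n=2000, seed=0):
    """exp(-MMD^2 / (2 * lengthscale^2)) between the spectral densities
    of two Spectral Mixture kernels."""
    rng = np.random.default_rng(seed)
    s1 = sample_spectral_density(*params1, n, rng)
    s2 = sample_spectral_density(*params2, n, rng)
    return np.exp(-mmd2(s1, s2) / (2.0 * lengthscale ** 2))

# Two 1-D SM kernels, each given by (weights, means, stds) of its spectral mixture.
k1 = (np.array([0.7, 0.3]), np.array([0.5, 2.0]), np.array([0.1, 0.3]))
k2 = (np.array([0.5, 0.5]), np.array([0.6, 1.8]), np.array([0.1, 0.2]))
print(kernel_over_kernels(k1, k2))
```

In the paper this kernel over kernels drives a warped Bayesian Quadrature scheme with an information-theoretic acquisition function over kernel samples; none of that machinery is reproduced in the sketch above.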
Related papers
- A Unifying Perspective on Non-Stationary Kernels for Deeper Gaussian Processes [0.9558392439655016]
We show a variety of kernels in action using representative datasets, carefully study their properties, and compare their performances.
Based on our findings, we propose a new kernel that combines some of the identified advantages of existing kernels.
arXiv Detail & Related papers (2023-09-18T18:34:51Z)
- Meta-Learning Hypothesis Spaces for Sequential Decision-making [79.73213540203389]
We propose to meta-learn a kernel from offline data (Meta-KeL).
Under mild conditions, we guarantee that our estimated RKHS yields valid confidence sets.
We also empirically evaluate the effectiveness of our approach on a Bayesian optimization task.
arXiv Detail & Related papers (2022-02-01T17:46:51Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
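As context for the kernel-based temporal-difference entry above, here is a small sketch of RKHS value-function estimation via a regularised LSTD-style linear system; the toy Markov reward process, RBF kernel, and regularisation constant are illustrative assumptions, not the paper's estimator or its analysis.

```python
import numpy as np

def rbf(a, b, ls=0.5):
    """RBF kernel matrix between 1-D state arrays a and b."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ls ** 2))

def kernel_lstd(states, next_states, rewards, gamma=0.9, reg=1e-2):
    """Fit V(s) = sum_i alpha_i k(s_i, s) by solving a regularised
    LSTD-style fixed-point system (K - gamma * K_next + reg * I) alpha = r."""
    K = rbf(states, states)
    K_next = rbf(next_states, states)
    A = K - gamma * K_next + reg * np.eye(len(states))
    alpha = np.linalg.solve(A, rewards)
    return lambda s: rbf(np.atleast_1d(s), states) @ alpha

# Toy Markov reward process: states drift towards 1.0, reward equals the current state.
rng = np.random.default_rng(0)
s = rng.uniform(0, 1, size=200)
s_next = np.clip(s + 0.1 * (1 - s) + 0.05 * rng.standard_normal(200), 0, 1)
r = s
V = kernel_lstd(s, s_next, r)
print(V(np.array([0.2, 0.9])))   # estimated values at two query states
```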
- Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation [154.2195491708548]
We analyse prediction discriminability and diversity through the structure of the classification output matrix of a randomly selected data batch.
We propose Batch Nuclear-norm Maximization and Minimization, which applies nuclear-norm maximization to the target output matrix to enhance target prediction ability.
Experiments show that our method boosts adaptation accuracy and robustness under three typical domain adaptation scenarios.
arXiv Detail & Related papers (2021-07-13T15:08:32Z)
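A hedged sketch of the basic batch nuclear-norm idea behind the entry above: the nuclear norm of a batch's softmax prediction matrix is maximised on target data to encourage predictions that are both confident and diverse. This is the plain SVD-based form with an illustrative combined objective; the paper's fast approximation and its minimization counterpart are not reproduced.

```python
import torch
import torch.nn.functional as F

def batch_nuclear_norm(logits):
    """Nuclear norm (sum of singular values) of the batch prediction matrix.
    Larger values correspond to predictions that are both confident and diverse."""
    probs = F.softmax(logits, dim=1)          # shape (batch, num_classes)
    return torch.linalg.svdvals(probs).sum()

def adaptation_loss(source_logits, source_labels, target_logits, weight=0.1):
    """Illustrative combined objective: supervised loss on source data minus a
    weighted nuclear-norm term on target data (maximising the target nuclear norm)."""
    ce = F.cross_entropy(source_logits, source_labels)
    bnm = batch_nuclear_norm(target_logits)
    return ce - weight * bnm / target_logits.shape[0]

src = torch.randn(32, 10, requires_grad=True)
tgt = torch.randn(32, 10, requires_grad=True)
labels = torch.randint(0, 10, (32,))
loss = adaptation_loss(src, labels, tgt)
loss.backward()
print(loss.item())
```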
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional feature means in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
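For the kernel-mean-embedding entry above, a small sketch of the kind of (unnormalised) kernel sum-of-squares density the abstract mentions, p(x) proportional to k(x, anchors) A k(x, anchors)^T with A positive semi-definite; the anchors, kernel, and coefficient matrix below are illustrative assumptions, and no optimization algorithm from the paper is reproduced.

```python
import numpy as np

def rbf(a, b, ls=0.3):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ls ** 2))

# Anchor points and a PSD coefficient matrix A = B B^T parameterising the density.
rng = np.random.default_rng(1)
anchors = rng.uniform(-1, 1, size=20)
B = rng.standard_normal((20, 5))
A = B @ B.T

def sos_density(x, anchors, A):
    """Unnormalised kernel sum-of-squares density: non-negative by construction,
    since A is positive semi-definite."""
    Kx = rbf(np.atleast_1d(x), anchors)       # shape (n_points, n_anchors)
    return np.einsum('ij,jk,ik->i', Kx, A, Kx)

grid = np.linspace(-1, 1, 5)
print(sos_density(grid, anchors, A))
```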
- Towards Unbiased Random Features with Lower Variance For Stationary Indefinite Kernels [26.57122949130266]
Our algorithm achieves lower variance and approximation error than existing kernel approximation methods.
With a better approximation to the originally selected kernels, improved classification accuracy and regression performance are obtained.
arXiv Detail & Related papers (2021-04-13T13:56:50Z)
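As background for the random-features entry above, a standard Rahimi-Recht random Fourier feature sketch for a positive-definite stationary (RBF) kernel; the unbiased, lower-variance construction for stationary indefinite kernels proposed in the paper is not reproduced, and the dimensions and lengthscale are illustrative.

```python
import numpy as np

def rff_features(X, n_features=500, lengthscale=1.0, seed=0):
    """Random Fourier features z(x) such that z(x) @ z(y) approximates
    the RBF kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, n_features)) / lengthscale   # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=n_features)           # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(2).standard_normal((5, 3))
Z = rff_features(X)
approx = Z @ Z.T
exact = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / 2.0)
print(np.abs(approx - exact).max())   # error shrinks as n_features grows
```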
- Kernel k-Means, By All Means: Algorithms and Strong Consistency [21.013169939337583]
Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data.
In this paper, we generalize results leveraging a general family of means to combat sub-optimal local solutions.
Our algorithm makes use of majorization-minimization (MM) to better solve this non-linear separation problem.
arXiv Detail & Related papers (2020-11-12T16:07:18Z)
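A minimal sketch of standard kernel $k$-means for the entry above, with assignments computed entirely from a precomputed kernel matrix; the generalised family of means and the majorization-minimization scheme from the paper are not reproduced, and the toy data and bandwidth are illustrative.

```python
import numpy as np

def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    """Standard kernel k-means on a precomputed kernel matrix K: each point is
    assigned to the cluster with the closest implicit feature-space mean,
    with all distances computed through K."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_clusters, size=n)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x_i) - mu_c||^2 = K_ii - (2/m) sum_{j in c} K_ij + (1/m^2) sum_{j,k in c} K_jk
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Two noisy concentric rings: a classic non-linearly separable example.  Whether the
# rings are recovered depends on the kernel bandwidth and the initialisation.
rng = np.random.default_rng(3)
theta = rng.uniform(0, 2 * np.pi, 200)
radius = np.r_[np.ones(100), 3.0 * np.ones(100)]
X = np.c_[radius * np.cos(theta), radius * np.sin(theta)] + 0.1 * rng.standard_normal((200, 2))
K = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / (2 * 0.5 ** 2))
print(np.bincount(kernel_kmeans(K, 2)))
```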
- Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.