Sparse Gaussian Process Hyperparameters: Optimize or Integrate?
- URL: http://arxiv.org/abs/2211.02476v1
- Date: Fri, 4 Nov 2022 14:06:59 GMT
- Title: Sparse Gaussian Process Hyperparameters: Optimize or Integrate?
- Authors: Vidhi Lalchand, Wessel P. Bruinsma, David R. Burt, Carl E. Rasmussen
- Abstract summary: We propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior.
We compare this scheme against natural baselines from the literature and against stochastic variational GPs (SVGPs), accompanied by an extensive computational analysis.
- Score: 5.949779668853556
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The kernel function and its hyperparameters are the central model selection
choice in a Gaussian process (Rasmussen and Williams, 2006). Typically, the
hyperparameters of the kernel are chosen by maximising the marginal likelihood,
an approach known as Type-II maximum likelihood (ML-II). However, ML-II does
not account for hyperparameter uncertainty, and it is well-known that this can
lead to severely biased estimates and an underestimation of predictive
uncertainty. While there are several works which employ a fully Bayesian
characterisation of GPs, relatively few propose such approaches for the sparse
GP paradigm. In this work we propose an algorithm for sparse Gaussian process
regression which leverages MCMC to sample from the hyperparameter posterior
within the variational inducing point framework of Titsias (2009). This work is
closely related to Hensman et al. (2015b) but side-steps the need to sample the
inducing points, thereby significantly improving sampling efficiency in the
Gaussian likelihood case. We compare this scheme against natural baselines in
the literature and against stochastic variational GPs (SVGPs), and provide an
extensive computational analysis.
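
To make the proposed scheme concrete, below is a minimal sketch (not the authors' code) of the idea described in the abstract: random-walk Metropolis over kernel hyperparameters, where the intractable log marginal likelihood is replaced by Titsias's (2009) collapsed bound, so the inducing points never need to be sampled. The kernel choice, prior, inducing-input locations, toy data, and all names (rbf_kernel, collapsed_bound, metropolis) are illustrative assumptions.

```python
# Hedged sketch: MCMC over sparse-GP hyperparameters using the collapsed bound.
# Targeting exp(collapsed bound) * prior approximates the hyperparameter
# posterior, since the bound lower-bounds the log marginal likelihood.
import numpy as np

def rbf_kernel(X1, X2, lengthscale, variance):
    """Squared-exponential kernel k(x, x') = s^2 exp(-||x - x'||^2 / (2 l^2))."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def collapsed_bound(theta, X, y, Z):
    """Titsias (2009) collapsed ELBO: log N(y | 0, Qnn + s_n^2 I) - tr(Knn - Qnn) / (2 s_n^2)."""
    lengthscale, variance, noise = np.exp(theta)          # work in log space
    n, m = X.shape[0], Z.shape[0]
    Kmm = rbf_kernel(Z, Z, lengthscale, variance) + 1e-6 * np.eye(m)
    Kmn = rbf_kernel(Z, X, lengthscale, variance)
    L = np.linalg.cholesky(Kmm)
    A = np.linalg.solve(L, Kmn)                           # Qnn = A.T @ A = Knm Kmm^{-1} Kmn
    Qnn = A.T @ A
    cov = Qnn + noise * np.eye(n)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    log_lik = -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
    trace_term = (n * variance - np.trace(Qnn)) / (2 * noise)  # tr(Knn) = n * variance for RBF
    return log_lik - trace_term

def log_prior(theta):
    """Broad Gaussian prior on log hyperparameters (an assumption, not the paper's choice)."""
    return -0.5 * np.sum(theta**2 / 3.0**2)

def metropolis(X, y, Z, n_samples=2000, step=0.1, seed=0):
    """Random-walk Metropolis over log [lengthscale, signal variance, noise variance]."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(3)
    log_post = collapsed_bound(theta, X, y, Z) + log_prior(theta)
    samples = []
    for _ in range(n_samples):
        prop = theta + step * rng.standard_normal(3)
        log_post_prop = collapsed_bound(prop, X, y, Z) + log_prior(prop)
        if np.log(rng.uniform()) < log_post_prop - log_post:
            theta, log_post = prop, log_post_prop
        samples.append(theta.copy())
    return np.exp(np.array(samples))                      # back to constrained space

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-3, 3, size=(100, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)
    Z = np.linspace(-3, 3, 15)[:, None]                   # fixed inducing inputs
    samples = metropolis(X, y, Z)
    print("posterior mean of [lengthscale, variance, noise]:", samples[500:].mean(0))
```

In this sketch the inducing inputs are held fixed; in practice they could be placed by k-means or optimised, while only the kernel hyperparameters are sampled.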
Related papers
- Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels [78.6096486885658]
We introduce lower bounds to the linearized Laplace approximation of the marginal likelihood.
These bounds are amenable to gradient-based optimization and allow trading off estimation accuracy against computational complexity.
arXiv Detail & Related papers (2023-06-06T19:02:57Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions, and optimize the bridging distributions so that fewer sampling steps are needed.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Scalable Gaussian Process Hyperparameter Optimization via Coverage
Regularization [0.0]
We present a novel algorithm which estimates the smoothness and length-scale parameters in the Matern kernel in order to improve robustness of the resulting prediction uncertainties.
We achieve improved UQ over leave-one-out likelihood while maintaining a high degree of scalability as demonstrated in numerical experiments.
arXiv Detail & Related papers (2022-09-22T19:23:37Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian
process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for
Safety-Critical Applications [71.23286211775084]
We introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters.
Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error.
Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.
arXiv Detail & Related papers (2021-09-06T17:10:01Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - On MCMC for variationally sparse Gaussian processes: A pseudo-marginal
approach [0.76146285961466]
Gaussian processes (GPs) are frequently used in machine learning and statistics to construct powerful models.
We propose a pseudo-marginal (PM) scheme that offers exact inference as well as computational gains through doubly stochastic estimators of the likelihood on large datasets.
arXiv Detail & Related papers (2021-03-04T20:48:29Z) - Gaussian Process Latent Class Choice Models [7.992550355579791]
We present a non-parametric class of probabilistic machine learning models within discrete choice models (DCMs).
The proposed model would assign individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs.
The model is tested on two different mode choice applications and compared against different LCCM benchmarks.
arXiv Detail & Related papers (2021-01-28T19:56:42Z) - Marginalised Gaussian Processes with Nested Sampling [10.495114898741203]
Gaussian process (GP) models define a rich distribution over functions with inductive biases controlled by a kernel function.
This work presents an alternative learning procedure where the hyperparameters of the kernel function are marginalised using Nested Sampling (NS).
arXiv Detail & Related papers (2020-10-30T16:04:35Z) - Approximate Inference for Fully Bayesian Gaussian Process Regression [11.47317712333228]
Learning in Gaussian process models occurs through the adaptation of hyperparameters of the mean and the covariance function.
An alternative learning procedure is to infer the posterior over hyperparameters in a hierarchical specification of GPs, which we call Fully Bayesian Gaussian Process Regression (GPR).
We analyze the predictive performance for fully Bayesian GPR on a range of benchmark data sets.
arXiv Detail & Related papers (2019-12-31T17:18:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.