Quadruply Stochastic Gaussian Processes
- URL: http://arxiv.org/abs/2006.03015v1
- Date: Thu, 4 Jun 2020 17:06:25 GMT
- Title: Quadruply Stochastic Gaussian Processes
- Authors: Trefor W. Evans and Prasanth B. Nair
- Abstract summary: We introduce a variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$.
We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions.
- Score: 10.152838128195466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a stochastic variational inference procedure for training
scalable Gaussian process (GP) models whose per-iteration complexity is
independent of both the number of training points, $n$, and the number of basis
functions used in the kernel approximation, $m$. Our central contributions
include an unbiased stochastic estimator of the evidence lower bound (ELBO) for
a Gaussian likelihood, as well as a stochastic estimator that lower bounds the
ELBO for several other likelihoods such as Laplace and logistic. Independence
of the stochastic optimization update complexity on $n$ and $m$ enables
inference on huge datasets using large capacity GP models. We demonstrate
accurate inference on large classification and regression datasets using GPs
and relevance vector machines with up to $m = 10^7$ basis functions.
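The paper's estimators apply this kind of subsampling to the ELBO itself; as a flavor of the underlying idea, here is a minimal numpy sketch (not the authors' estimator) of an unbiased estimate of the data-fit term $\sum_i (y_i - \phi_i^\top w)^2$ in a toy basis-function model. Each evaluation touches only $b$ data points and $s$ of the $m$ basis functions, so its cost is independent of both $n$ and $m$; all names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy basis-function model: y ~ Phi @ w, with n points and m basis functions.
n, m = 2000, 500
Phi = rng.normal(size=(n, m)) / np.sqrt(m)
w = rng.normal(size=m)
y = Phi @ w + 0.1 * rng.normal(size=n)

def full_fit_term(Phi, y, w):
    """Exact data-fit term sum_i (y_i - phi_i^T w)^2 -- O(n m) cost."""
    r = y - Phi @ w
    return r @ r

def stochastic_fit_term(Phi, y, w, b=32, s=32, rng=rng):
    """Unbiased estimate of the data-fit term at O(b s) cost, independent
    of n and m.  Uses a minibatch of b data points and two independent
    subsets of s basis functions; the product of the two independent
    inner-product estimates is unbiased for (phi_i^T w)^2."""
    i = rng.choice(len(y), size=b)               # minibatch over data
    j1 = rng.choice(m, size=s)                   # subset over basis functions
    j2 = rng.choice(m, size=s)                   # independent second subset
    f1 = (m / s) * Phi[np.ix_(i, j1)] @ w[j1]    # unbiased for phi_i^T w
    f2 = (m / s) * Phi[np.ix_(i, j2)] @ w[j2]    # independent copy
    # E[y^2 - 2 y f1 + f1 f2 | i] = (y_i - phi_i^T w)^2
    per_point = y[i] ** 2 - 2 * y[i] * f1 + f1 * f2
    return (n / b) * per_point.sum()

exact = full_fit_term(Phi, y, w)
est = np.mean([stochastic_fit_term(Phi, y, w) for _ in range(2000)])
print(f"exact: {exact:.1f}  averaged estimate: {est:.1f}")
```

Averaging many draws of the estimator recovers the exact term, which is the property that lets each optimization update run at a cost independent of dataset and model size.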
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z) - BI-EqNO: Generalized Approximate Bayesian Inference with an Equivariant Neural Operator Framework [9.408644291433752]
We introduce BI-EqNO, an equivariant neural operator framework for generalized approximate Bayesian inference.
BI-EqNO transforms priors into posteriors conditioned on observation data through data-driven training.
We demonstrate BI-EqNO's utility through two examples: (1) as a generalized Gaussian process (gGP) for regression, and (2) as an ensemble neural filter (EnNF) for sequential data assimilation.
arXiv Detail & Related papers (2024-10-21T18:39:16Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed ensemble GP Thompson sampling (EGP-TS) scheme to the global optimum, an analysis based on the notion of Bayesian regret is conducted.
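As an illustration of the random feature idea mentioned above, here is a generic random Fourier features sketch (in the style of Rahimi and Rechts classic construction) that approximates an RBF kernel by an inner product of finite-dimensional feature maps; it is not the paper's specific construction, and the function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def rff_features(X, num_features, lengthscale, rng):
    """Random Fourier features z(x) such that z(x) @ z(x') approximates
    the RBF kernel exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    d = X.shape[1]
    W = rng.normal(size=(d, num_features)) / lengthscale  # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=num_features)      # random phases
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

X = rng.normal(size=(5, 3))
Z = rff_features(X, num_features=20_000, lengthscale=1.0, rng=rng)
K_approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-sq_dists / 2.0)
print(np.abs(K_approx - K_exact).max())  # small for large num_features
```

Because the feature map is finite, posterior sampling reduces to Bayesian linear regression in the feature space, which is what makes per-GP function sampling scalable.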
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Fast variable selection makes scalable Gaussian process BSS-ANOVA a speedy and accurate choice for tabular and time series regression [0.0]
Gaussian processes (GPs) are non-parametric regression engines with a long history.
One of a number of scalable GP approaches is the Karhunen-Loève (KL) decomposed kernel BSS-ANOVA, developed in 2009.
A new method of forward variable selection quickly and effectively limits the number of terms, yielding competitive accuracy.
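For intuition, here is a minimal, generic sketch of greedy forward variable selection over a fixed dictionary of basis terms; the paper's BSS-ANOVA-specific procedure differs in its term ordering and statistics, and `forward_select` is a hypothetical helper.

```python
import numpy as np

def forward_select(Phi, y, max_terms, tol=1e-8):
    """Greedy forward selection over the columns of a basis dictionary
    Phi (n x m): at each step, add the term that most reduces the
    least-squares residual; stop at max_terms terms or when the
    improvement falls below tol."""
    selected = []
    best_sse = float(y @ y)          # residual of the empty model
    for _ in range(max_terms):
        best_gain, best_j = tol, None
        for j in range(Phi.shape[1]):
            if j in selected:
                continue
            cols = Phi[:, selected + [j]]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            r = y - cols @ coef
            gain = best_sse - float(r @ r)
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is None:           # no remaining term improves the fit
            break
        selected.append(best_j)
        best_sse -= best_gain
    return selected
```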
arXiv Detail & Related papers (2022-05-26T23:41:43Z) - Exact Gaussian Processes for Massive Datasets via Non-Stationary Sparsity-Discovering Kernels [0.0]
We propose to take advantage of naturally-structured sparsity by allowing the kernel to discover -- instead of induce -- sparse structure.
The principle of ultra-flexible, compactly-supported, and non-stationary kernels, combined with HPC and constrained optimization, lets us scale exact GPs well beyond 5 million data points.
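To illustrate why compactly supported kernels yield naturally sparse covariance matrices, here is a generic numpy/scipy sketch using a Wendland $C^2$ kernel and a k-d tree to assemble only the nonzero entries; it is not the authors' kernel or implementation.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)

def wendland_c2(r, support):
    """Wendland C2 compactly supported kernel (positive definite in up
    to 3 dimensions): (1 - r/s)^4 (4 r/s + 1) for r < s, exactly 0 beyond."""
    u = r / support
    return np.where(u < 1.0, (1 - u) ** 4 * (4 * u + 1), 0.0)

X = rng.uniform(0, 100, size=(20_000, 2))   # toy 2-D inputs
support = 2.0

# Only pairs within the kernel's support contribute -- a k-d tree finds
# them without ever forming the dense n x n matrix.
tree = cKDTree(X)
pairs = tree.query_pairs(r=support, output_type='ndarray')
dists = np.linalg.norm(X[pairs[:, 0]] - X[pairs[:, 1]], axis=1)
vals = wendland_c2(dists, support)

n = len(X)
rows = np.concatenate([pairs[:, 0], pairs[:, 1], np.arange(n)])
cols = np.concatenate([pairs[:, 1], pairs[:, 0], np.arange(n)])
data = np.concatenate([vals, vals, np.ones(n)])   # diagonal: k(0) = 1
K = csr_matrix((data, (rows, cols)), shape=(n, n))
print(f"nonzeros: {K.nnz} of {n * n} ({100 * K.nnz / n**2:.4f}%)")
```

Because the zeros are exact rather than approximate, the resulting sparse covariance still defines an exact GP, which is the property the paper exploits at scale.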
arXiv Detail & Related papers (2022-05-18T16:56:53Z) - Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits [21.353189917487512]
Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for machine learning problems.
We take a step forward by proving that minibatch SGD converges to a critical point of the full log-likelihood loss function.
Our theoretical guarantees hold provided that the kernel functions exhibit exponential or eigendecay.
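As a minimal illustration of the setting being analyzed, here is a sketch of minibatch SGD on a GP's negative log marginal likelihood with an RBF kernel: each step forms the exact GP objective on a random minibatch and updates the log-hyperparameters. Finite-difference gradients stand in for autodiff to keep the sketch self-contained, and nothing here reproduces the paper's exact assumptions or step-size conditions.

```python
import numpy as np
from scipy.optimize import approx_fprime

rng = np.random.default_rng(3)

# Toy data from a smooth function.
n = 5000
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=n)

def batch_nll(log_params, Xb, yb):
    """Exact GP negative log marginal likelihood on one minibatch, with
    an RBF kernel; params are log(lengthscale), log(noise_variance)."""
    ell, noise = np.exp(log_params)
    sq = (Xb[:, None, 0] - Xb[None, :, 0]) ** 2
    K = np.exp(-sq / (2 * ell ** 2)) + noise * np.eye(len(yb))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, yb))
    return 0.5 * yb @ alpha + np.log(np.diag(L)).sum()

log_params = np.log([1.0, 1.0])      # initial lengthscale, noise variance
lr, batch = 0.01, 128
for step in range(500):
    idx = rng.choice(n, size=batch, replace=False)
    grad = approx_fprime(log_params, batch_nll, 1e-6, X[idx], y[idx])
    log_params -= lr * grad           # SGD step on the minibatch loss
print("lengthscale, noise variance:", np.exp(log_params))
```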
arXiv Detail & Related papers (2021-11-19T22:28:47Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - $\pi$VAE: a stochastic process prior for Bayesian deep learning with MCMC [2.4792948967354236]
We propose a novel variational autoencoder called the prior encoding variational autoencoder ($\pi$VAE).
We show that our framework can accurately learn expressive function classes such as Gaussian processes, but also properties of functions to enable statistical inference.
Perhaps most usefully, we demonstrate that the low dimensional distributed latent space representation learnt provides an elegant and scalable means of performing inference for processes within programming languages such as Stan.
arXiv Detail & Related papers (2020-02-17T10:23:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.