Sparse Orthogonal Variational Inference for Gaussian Processes
- URL: http://arxiv.org/abs/1910.10596v5
- Date: Sat, 24 Feb 2024 23:32:32 GMT
- Title: Sparse Orthogonal Variational Inference for Gaussian Processes
- Authors: Jiaxin Shi, Michalis K. Titsias, Andriy Mnih
- Abstract summary: We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points.
We show that this formulation recovers existing approximations and at the same time allows to obtain tighter lower bounds on the marginal likelihood and new variational inference algorithms.
- Score: 34.476453597078894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new interpretation of sparse variational approximations for
Gaussian processes using inducing points, which can lead to more scalable
algorithms than previous methods. It is based on decomposing a Gaussian process
as a sum of two independent processes: one spanned by a finite basis of
inducing points and the other capturing the remaining variation. We show that
this formulation recovers existing approximations and at the same time allows
to obtain tighter lower bounds on the marginal likelihood and new stochastic
variational inference algorithms. We demonstrate the efficiency of these
algorithms in several Gaussian process models ranging from standard regression
to multi-class classification using (deep) convolutional Gaussian processes and
report state-of-the-art results on CIFAR-10 among purely GP-based models.
Related papers
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when emphdone right -- by which we mean using specific insights from optimisation and kernel communities -- gradient descent is highly effective.
We introduce a emphstochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Variational sparse inverse Cholesky approximation for latent Gaussian
processes via double Kullback-Leibler minimization [6.012173616364571]
We combine a variational approximation of the posterior with a similar and efficient SIC-restricted Kullback-Leibler-optimal approximation of the prior.
For this setting, our variational approximation can be computed via gradient descent in polylogarithmic time per iteration.
We provide numerical comparisons showing that the proposed double-Kullback-Leibler-optimal Gaussian-process approximation (DKLGP) can sometimes be vastly more accurate for stationary kernels than alternative approaches.
arXiv Detail & Related papers (2023-01-30T21:50:08Z) - Gaussian Processes and Statistical Decision-making in Non-Euclidean
Spaces [96.53463532832939]
We develop techniques for broadening the applicability of Gaussian processes.
We introduce a wide class of efficient approximations built from this viewpoint.
We develop a collection of Gaussian process models over non-Euclidean spaces.
arXiv Detail & Related papers (2022-02-22T01:42:57Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - Tighter Bounds on the Log Marginal Likelihood of Gaussian Process
Regression Using Conjugate Gradients [19.772149500352945]
We show that approximate maximum likelihood learning of model parameters by maximising our lower bound retains many of the sparse variational approach benefits.
In experiments, we show improved predictive performance with our model for a comparable amount of training time compared to other conjugate gradient based approaches.
arXiv Detail & Related papers (2021-02-16T17:54:59Z) - Pathwise Conditioning of Gaussian Processes [72.61885354624604]
Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations.
This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector.
We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors.
arXiv Detail & Related papers (2020-11-08T17:09:37Z) - Disentangling the Gauss-Newton Method and Approximate Inference for
Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning.
We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly.
The connection to Gaussian processes enables new function-space inference algorithms.
arXiv Detail & Related papers (2020-07-21T17:42:58Z) - Mean-Field Approximation to Gaussian-Softmax Integral with Application
to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.