An Improved Variational Approximate Posterior for the Deep Wishart Process
- URL: http://arxiv.org/abs/2305.14454v1
- Date: Tue, 23 May 2023 18:26:29 GMT
- Title: An Improved Variational Approximate Posterior for the Deep Wishart Process
- Authors: Sebastian Ober, Ben Anson, Edward Milsom and Laurence Aitchison
- Abstract summary: Deep kernel processes are a recently introduced class of deep Bayesian models.
They operate by sampling a Gram matrix from a distribution over positive semi-definite matrices.
We show that further generalising their distribution to allow linear combinations of rows and columns results in better predictive performance.
- Score: 24.442174952832108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep kernel processes are a recently introduced class of deep Bayesian models
that have the flexibility of neural networks, but work entirely with Gram
matrices. They operate by alternately sampling a Gram matrix from a
distribution over positive semi-definite matrices, and applying a deterministic
transformation. When the distribution is chosen to be Wishart, the model is
called a deep Wishart process (DWP). This particular model is of interest
because its prior is equivalent to a deep Gaussian process (DGP) prior, but at
the same time it is invariant to rotational symmetries, leading to a simpler
posterior distribution. Practical inference in the DWP was made possible in
recent work ("A variational approximate posterior for the deep Wishart process"
Ober and Aitchison 2021a) where the authors used a generalisation of the
Bartlett decomposition of the Wishart distribution as the variational
approximate posterior. However, predictive performance in that paper was less
impressive than one might expect, with the DWP only beating a DGP on a few of
the UCI datasets used for comparison. In this paper, we show that further
generalising their distribution to allow linear combinations of rows and
columns in the Bartlett decomposition results in better predictive performance,
while incurring negligible additional computational cost.
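To make the generative step described above concrete, the following NumPy sketch samples a Gram matrix from a Wishart distribution via the Bartlett decomposition, the construction whose generalisation underlies the variational posterior of Ober and Aitchison (2021a) and of this paper. The function name, the choice of scale G_prev / nu (so that the sample's mean is the previous layer's Gram matrix), and the degrees of freedom are illustrative assumptions for this sketch, not code or settings from the paper.

```python
import numpy as np

def bartlett_wishart_sample(nu, K, rng):
    """Draw G ~ Wishart(nu, K) for a P x P positive-definite scale matrix K (nu >= P).

    Bartlett decomposition: G = (L A)(L A)^T, where L = chol(K) and A is lower
    triangular with A[i, i]^2 ~ chi^2 with (nu - i) degrees of freedom
    (rows 0-indexed) and A[i, j] ~ N(0, 1) for j < i.
    """
    P = K.shape[0]
    L = np.linalg.cholesky(K)
    A = np.zeros((P, P))
    for i in range(P):
        A[i, i] = np.sqrt(rng.chisquare(nu - i))  # diagonal: square roots of chi-square draws
        A[i, :i] = rng.standard_normal(i)         # strictly lower triangle: standard normals
    LA = L @ A
    return LA @ LA.T

# Illustrative single generative step in the spirit of a DWP layer: sample a Gram
# matrix whose expectation is the previous layer's Gram matrix (scale K = G_prev / nu);
# in the model, a deterministic transformation would then be applied.
rng = np.random.default_rng(0)
G_prev = np.eye(5)   # placeholder input Gram matrix
nu = 10.0            # degrees of freedom; must be at least the matrix size
G_next = bartlett_wishart_sample(nu, G_prev / nu, rng)
print(np.allclose(G_next, G_next.T))  # sampled Gram matrix is symmetric PSD
```

Read against this construction, the contribution of the present paper is to let the approximate posterior apply linear combinations of rows and columns in the (generalised) Bartlett decomposition, whereas the standard Wishart sampler above fills A with independent entries.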
Related papers
- Sparse Gaussian Processes: Structured Approximations and Power-EP Revisited [9.83722115577313]
Inducing-point-based sparse variational Gaussian processes have become the standard workhorse for scaling up GP models.
Recent advances show that these methods can be improved by introducing a diagonal scaling matrix to the conditional posterior density.
This paper first considers an extension that employs a block-diagonal structure for the scaling matrix, provably tightening the variational lower bound.
arXiv Detail & Related papers (2025-07-03T07:18:54Z)
- Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z)
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
- Implicit Manifold Gaussian Process Regression [49.0787777751317]
Gaussian process regression is widely used to provide well-calibrated uncertainty estimates.
It struggles with high-dimensional data; one way to scale it up is to exploit the implicit low-dimensional manifold upon which the data actually lies.
In this paper we propose a technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way.
arXiv Detail & Related papers (2023-10-30T09:52:48Z)
- Deep Transformed Gaussian Processes [0.0]
Transformed Gaussian Processes (TGPs) are processes specified by transforming samples from the joint distribution of a prior process (typically a GP) using an invertible transformation.
We propose a generalization of TGPs named Deep Transformed Gaussian Processes (DTGPs), which follows the trend of concatenating layers of processes.
Experiments evaluate the proposed DTGPs on multiple regression datasets, demonstrating good scalability and performance.
arXiv Detail & Related papers (2023-10-27T16:09:39Z)
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
- Scalable Bayesian Transformed Gaussian Processes [10.33253403416662]
The Bayesian transformed Gaussian process (BTG) model is a fully Bayesian counterpart to the warped Gaussian process (WGP).
We propose principled and fast techniques for computing with BTG.
Our framework uses doubly sparse quadrature rules, tight quantile bounds, and rank-one matrix algebra to enable both fast model prediction and model selection.
arXiv Detail & Related papers (2022-10-20T02:45:10Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- A variational approximate posterior for the deep Wishart process [23.786649328915093]
Recent work introduced deep kernel processes as an entirely kernel-based alternative to NNs.
We give a novel approach to obtaining flexible distributions over positive semi-definite matrices.
We show that inference in the deep Wishart process gives improved performance over doing inference in a DGP with the equivalent prior.
arXiv Detail & Related papers (2021-07-21T14:48:27Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep distributions are about as close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Convolutional Normalizing Flows for Deep Gaussian Processes [40.10797051603641]
This paper introduces a new approach for specifying flexible, arbitrarily complex, and scalable approximate posterior distributions.
A novel convolutional normalizing flow (CNF) is developed to improve the time efficiency and capture dependency between layers.
Empirical evaluation demonstrates that CNF DGP outperforms the state-of-the-art approximation methods for DGPs.
arXiv Detail & Related papers (2021-04-17T07:25:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.