$\pi$VAE: a stochastic process prior for Bayesian deep learning with
MCMC
- URL: http://arxiv.org/abs/2002.06873v6
- Date: Tue, 13 Sep 2022 19:02:14 GMT
- Title: $\pi$VAE: a stochastic process prior for Bayesian deep learning with
MCMC
- Authors: Swapnil Mishra, Seth Flaxman, Tresnia Berah, Harrison Zhu, Mikko
Pakkanen, Samir Bhatt
- Abstract summary: We propose a novel variational autoencoder called the prior encoding variational autoencoder ($\pi$VAE).
We show that our framework can accurately learn not only expressive function classes such as Gaussian processes, but also properties of functions that enable statistical inference.
Perhaps most usefully, we demonstrate that the low-dimensional, independently distributed latent space representation learnt provides an elegant and scalable means of performing Bayesian inference for stochastic processes within probabilistic programming languages such as Stan.
- Score: 2.4792948967354236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic processes provide a mathematically elegant way to model complex data.
In theory, they provide flexible priors over function classes that can encode a
wide range of interesting assumptions. In practice, however, efficient
inference by optimisation or marginalisation is difficult, a problem further
exacerbated with big data and high dimensional input spaces. We propose a novel
variational autoencoder (VAE) called the prior encoding variational autoencoder
($\pi$VAE). The $\pi$VAE is finitely exchangeable and Kolmogorov consistent,
and thus is a continuous stochastic process. We use $\pi$VAE to learn low
dimensional embeddings of function classes. We show that our framework can
accurately learn not only expressive function classes such as Gaussian
processes, but also properties of functions that enable statistical inference
(such as the
integral of a log Gaussian process). For popular tasks, such as spatial
interpolation, $\pi$VAE achieves state-of-the-art performance both in terms of
accuracy and computational efficiency. Perhaps most usefully, we demonstrate
that the low dimensional independently distributed latent space representation
learnt provides an elegant and scalable means of performing Bayesian inference
for stochastic processes within probabilistic programming languages such as
Stan.
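
As a concrete illustration of the two-stage recipe described above, here is a minimal toy sketch (not the authors' implementation): stage one trains a VAE on draws from a known function class, here a one-dimensional RBF Gaussian process evaluated at fixed locations (the paper itself uses a feature map so that arbitrary input locations can be handled); stage two would export the trained decoder and run MCMC over the latent variables in a probabilistic programming language such as Stan. All architecture sizes and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

def sample_gp(n_funcs, xs, lengthscale=0.3):
    """Draw functions from a zero-mean RBF GP at fixed input locations xs."""
    d2 = (xs[:, None] - xs[None, :]) ** 2
    K = torch.exp(-0.5 * d2 / lengthscale**2) + 1e-6 * torch.eye(len(xs))
    return torch.randn(n_funcs, len(xs)) @ torch.linalg.cholesky(K).T

class PiVAE(nn.Module):
    """Toy prior-encoding VAE over function evaluations at fixed locations."""
    def __init__(self, n_points, latent_dim=10, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_points, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2 * latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, n_points))

    def forward(self, y):
        mu, logvar = self.enc(y).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterise
        return self.dec(z), mu, logvar

xs = torch.linspace(0, 1, 50)
model = PiVAE(n_points=50)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):                       # stage 1: encode the prior
    y = sample_gp(128, xs)
    recon, mu, logvar = model(y)
    kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum(-1).mean()
    loss = ((recon - y) ** 2).sum(-1).mean() + kl
    opt.zero_grad(); loss.backward(); opt.step()
# Stage 2 (not shown): export the trained decoder and run MCMC over
# z ~ N(0, I), attaching the observation likelihood to decoder(z).
```

Because the prior on $z$ is a simple product of independent Gaussians, the downstream MCMC model only needs the fixed decoder weights and the observation likelihood, which is what makes this approach attractive inside Stan.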
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Tractable and Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation [8.378137704007038]
We present a regret analysis for distributional reinforcement learning with general value function approximation.
Our theoretical results show that approximating the infinite-dimensional return distribution with a finite number of moment functionals is the only way to learn its statistical information without bias.
arXiv Detail & Related papers (2024-07-31T00:43:51Z)
- Stochastic Q-learning for Large Discrete Action Spaces [79.1700188160944]
In complex environments with discrete action spaces, effective decision-making is critical in reinforcement learning (RL).
We present value-based RL approaches which, as opposed to optimizing over the entire set of $n$ actions, only consider a variable set of actions, possibly as small as $\mathcal{O}(\log(n))$.
The presented value-based RL methods include, among others, Q-learning, StochDQN, StochDDQN, all of which integrate this approach for both value-function updates and action selection.
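A hedged sketch of the core trick as described in the summary above (an illustration, not the paper's code): rather than maximising Q over all $n$ actions, draw a random candidate set of size about $\log_2 n$ and maximise over that, for both value-function targets and action selection.

```python
import math
import random

def stoch_argmax(q_values, rng=random):
    """Approximate argmax of Q over a random subset of ~log2(n) actions."""
    n = len(q_values)
    k = max(1, math.ceil(math.log2(n)))
    candidates = rng.sample(range(n), k)       # O(log n) instead of O(n)
    return max(candidates, key=lambda a: q_values[a])

q = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]
print(stoch_argmax(q))  # usually a high-value action, at sublinear cost
```

Variants can also mix previously selected actions into the candidate set; this minimal version draws a fresh subset on each call.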
arXiv Detail & Related papers (2024-05-16T17:58:44Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of iid observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
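To make the setting concrete, here is an illustrative online estimator in the same spirit (a hand-rolled construction, not the authors' OLRE algorithm): model the ratio $r(x) = p(x)/q(x)$ with fixed basis functions and take one SGD step on the Pearson-divergence (uLSIF-type) objective $J(\theta) = \frac{1}{2}E_q[r_\theta^2] - E_p[r_\theta]$ per observation pair.

```python
import numpy as np

rng = np.random.default_rng(0)
centers = np.linspace(-3, 3, 20)               # fixed Gaussian basis
def phi(x): return np.exp(-0.5 * (x - centers) ** 2)

theta, lr = np.zeros(20), 0.05
for t in range(20000):
    x_p = rng.normal(0.5, 1.0)                 # x_t  ~ p = N(0.5, 1)
    x_q = rng.normal(0.0, 1.0)                 # x'_t ~ q = N(0, 1)
    # stochastic gradient of E_q[r^2]/2 - E_p[r] with r(x) = theta . phi(x)
    theta -= lr * (phi(x_q) * (theta @ phi(x_q)) - phi(x_p))
print(theta @ phi(0.5))  # should be roughly p(0.5)/q(0.5) = exp(1/8) ~ 1.13
```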
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
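KBASS itself couples kernel smoothing with an EP-EM scheme; as a much simpler illustration of the spike-and-slab ingredient (a toy construction, not KBASS), a tiny candidate library admits exact posterior inclusion probabilities by enumerating all inclusion patterns under a linear-Gaussian slab:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
dy = 1.5 * x - 0.5 * x**3 + 0.05 * rng.normal(size=200)  # observed derivative
library = {'x': x, 'x^2': x**2, 'x^3': x**3, 'sin(x)': np.sin(x)}
names = list(library)
Phi = np.column_stack([library[k] for k in names])
s2, tau2, prior_inc = 0.05**2, 10.0, 0.5       # noise var, slab var, P(include)

def log_evidence(mask):
    """Gaussian marginal likelihood of dy with the masked terms included."""
    idx = [i for i, b in enumerate(mask) if b]
    C = s2 * np.eye(len(x))
    if idx:
        P = Phi[:, idx]
        C += tau2 * P @ P.T
    sign, logdet = np.linalg.slogdet(2 * np.pi * C)
    return -0.5 * (logdet + dy @ np.linalg.solve(C, dy))

masks = list(itertools.product([0, 1], repeat=4))
logp = np.array([log_evidence(m) + np.log(prior_inc) * sum(m)
                 + np.log(1 - prior_inc) * (4 - sum(m)) for m in masks])
w = np.exp(logp - logp.max()); w /= w.sum()
for i, name in enumerate(names):
    print(name, sum(w[j] for j, m in enumerate(masks) if m[i]))
# posterior inclusion mass should concentrate on 'x' and 'x^3'
```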
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Integrated Variational Fourier Features for Fast Spatial Modelling with Gaussian Processes [7.5991638205413325]
For $N$ training points, exact inference has $O(N^3)$ cost; with $M \ll N$ features, state of the art sparse variational methods have $O(NM^2)$ cost.
Recently, methods have been proposed using more sophisticated features; these promise $O(M^3)$ cost, with good performance in low dimensional tasks such as spatial modelling, but they only work with a very limited class of kernels, excluding some of the most commonly used.
In this work, we propose integrated Fourier features, which extend these performance benefits to a very broad class of stationary covariance functions.
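For intuition about where the $O(NM^2)$ scaling comes from, here is a generic random-Fourier-feature regression sketch (plain RFFs, not the paper's integrated features; all sizes are illustrative): with an $M$-dimensional feature map, GP regression collapses to Bayesian linear regression whose dominant costs are the $O(NM^2)$ Gram computation and an $O(M^3)$ solve.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, d, noise = 2000, 100, 2, 0.1
X = rng.uniform(-1, 1, size=(N, d))
y = np.sin(3 * X[:, 0]) + noise * rng.normal(size=N)

W = rng.normal(size=(M // 2, d))              # spectral samples for an RBF kernel
def phi(Z):
    P = Z @ W.T
    return np.hstack([np.cos(P), np.sin(P)]) / np.sqrt(M // 2)

Phi = phi(X)                                  # (N, M) features: O(N M d)
A = Phi.T @ Phi + noise**2 * np.eye(M)        # the O(N M^2) step
w = np.linalg.solve(A, Phi.T @ y)             # the O(M^3) step
print(np.abs(phi(X[:5]) @ w - y[:5]).max())   # small residual on training points
```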
arXiv Detail & Related papers (2023-08-27T15:44:28Z)
- Exact Bayesian Inference on Discrete Models via Probability Generating Functions: A Probabilistic Programming Approach [7.059472280274009]
We present an exact Bayesian inference method for discrete statistical models.
We use a probabilistic programming language that supports discrete and continuous sampling, discrete observations, affine functions, (stochastic) branching, and conditioning on discrete events.
Our inference method is provably correct and fully automated.
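As a flavour of generating-function inference (a hand-rolled toy in sympy, not the paper's language or tool): take $N \sim \mathrm{Poisson}(\lambda)$ with a binomial observation $D \mid N \sim \mathrm{Binomial}(N, p)$, condition on $D = k$, and read the exact posterior off derivatives of the joint PGF.

```python
import sympy as sp

s, t = sp.symbols('s t')
lam, p, k = sp.Rational(10), sp.Rational(1, 4), 3
# Joint PGF E[s^N t^D] with N ~ Poisson(lam) and D | N ~ Binomial(N, p)
G = sp.exp(lam * (s * (1 - p + p * t) - 1))
num = sp.diff(G, t, k).subs(t, 0)            # E[s^N ; D = k], up to 1/k!
post_pgf = sp.simplify(num / num.subs(s, 1)) # posterior PGF of N given D = k
post_mean = sp.simplify(sp.diff(post_pgf, s).subs(s, 1))
print(post_mean)                             # exact: k + lam*(1 - p) = 21/2
```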
arXiv Detail & Related papers (2023-05-26T16:09:59Z)
- Generalized Differentiable RANSAC [95.95627475224231]
$\nabla$-RANSAC is a differentiable RANSAC that allows learning the entire randomized robust estimation pipeline.
$\nabla$-RANSAC is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives.
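To see what "differentiable" buys here, a minimal illustration (a sketch of the general soft-inlier idea, not $\nabla$-RANSAC's actual pipeline): replace the hard inlier count with a sigmoid relaxation so the consensus score, and hence the estimation loop, admits gradients.

```python
import torch

def soft_inlier_score(residuals, tau=0.1, beta=50.0):
    """Smooth count of residuals below tau (recovers hard count as beta -> inf)."""
    return torch.sigmoid(beta * (tau - residuals.abs())).sum()

# Robustly fit y = a*x + b on data with injected outliers by scoring hypotheses.
torch.manual_seed(0)
x = torch.linspace(0, 1, 100)
y = 2.0 * x + 0.5 + 0.01 * torch.randn(100)
y[::10] += 3.0                                   # every 10th point is an outlier
best = None
for _ in range(64):                              # random minimal samples
    i, j = torch.randint(0, 100, (2,))
    if i == j:
        continue
    a = (y[j] - y[i]) / (x[j] - x[i] + 1e-9)
    b = y[i] - a * x[i]
    score = soft_inlier_score(y - (a * x + b))   # differentiable w.r.t. residuals
    if best is None or score > best[0]:
        best = (score, a, b)
print(best[1].item(), best[2].item())            # close to the true 2.0, 0.5
```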
arXiv Detail & Related papers (2022-12-26T15:13:13Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Bayesian Learning via Q-Exponential Process [10.551294837978363]
Regularization is one of the most fundamental topics in optimization, statistics and machine learning.
In this work, we generalize the $q$-exponential distribution, with density proportional to $\exp(-\frac{1}{2}|u|^q)$, to a process named $Q$-exponential (Q-EP) process that corresponds to the $L_q$ regularization of functions.
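A one-line way to see the regularisation link (a standard observation, stated here as context rather than quoted from the paper): if $p(u) \propto \exp(-\frac{1}{2}|u|^q)$ then $-\log p(u) = \frac{1}{2}|u|^q + \text{const}$, so MAP estimation under this prior is exactly $L_q$-regularised optimisation, with $q = 2$ recovering the Gaussian case.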
arXiv Detail & Related papers (2022-10-14T17:37:14Z)
- Quadruply Stochastic Gaussian Processes [10.152838128195466]
We introduce a variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$.
We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions.
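A hedged sketch of how per-iteration cost can be decoupled from both $n$ and $m$ (an illustration of the subsampling idea on a plain squared loss, not the authors' variational algorithm): each step touches only a data minibatch and two subsets of the basis functions, drawn independently so the gradient of the quadratic loss stays unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 50_000, 5_000, 2
X = rng.uniform(-2, 2, size=(n, d))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=n)
W = rng.normal(size=(m, d))                   # fixed random-feature frequencies
b = rng.uniform(0, 2 * np.pi, size=m)
theta = np.zeros(m)

def phi(Xb, J):                               # touches only |J| basis functions
    return np.sqrt(2.0 / m) * np.cos(Xb @ W[J].T + b[J])

lr, bn, bf = 0.3, 128, 256
for step in range(3000):
    I = rng.choice(n, bn)                     # data minibatch
    J1 = rng.choice(m, bf)                    # subset for the prediction
    J2 = rng.choice(m, bf)                    # independent subset for the gradient
    pred = (m / bf) * phi(X[I], J1) @ theta[J1]   # unbiased estimate of f(X_I)
    grad = (m / bf) * phi(X[I], J2).T @ (pred - y[I]) / bn
    theta[J2] -= lr * grad
probe = rng.choice(n, 500)
mse = np.mean((np.sqrt(2 / m) * np.cos(X[probe] @ W.T + b) @ theta - y[probe]) ** 2)
print(mse)  # each training step cost O(bn * bf), never O(n) or O(m)
```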
arXiv Detail & Related papers (2020-06-04T17:06:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.