Bayesian Neural Network Inference via Implicit Models and the Posterior
Predictive Distribution
- URL: http://arxiv.org/abs/2209.02188v1
- Date: Tue, 6 Sep 2022 02:43:19 GMT
- Authors: Joel Janek Dabrowski, Daniel Edward Pagendam
- Abstract summary: We propose a novel approach to perform approximate Bayesian inference in complex models such as Bayesian neural networks.
The approach is more scalable to large data than Markov Chain Monte Carlo.
We see this being useful in applications such as surrogate and physics-based models.
- Score: 0.8122270502556371
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel approach to perform approximate Bayesian inference in
complex models such as Bayesian neural networks. The approach is more scalable
to large data than Markov Chain Monte Carlo, it embraces more expressive models
than Variational Inference, and it does not rely on adversarial training (or
density ratio estimation). We adopt the recent approach of constructing two
models: (1) a primary model, tasked with performing regression or
classification; and (2) a secondary, expressive (e.g. implicit) model that
defines an approximate posterior distribution over the parameters of the
primary model. However, we optimise the parameters of the posterior model via
gradient descent according to a Monte Carlo estimate of the posterior
predictive distribution -- which is our only approximation (other than the
posterior model). Only a likelihood needs to be specified, which can take
various forms such as loss functions and synthetic likelihoods, thus providing
a form of a likelihood-free approach. Furthermore, we formulate the approach
such that the posterior samples can either be independent of, or conditionally
dependent upon the inputs to the primary model. The latter approach is shown to
be capable of increasing the apparent complexity of the primary model. We see
this being useful in applications such as surrogate and physics-based models.
To promote how the Bayesian paradigm offers more than just uncertainty
quantification, we demonstrate: uncertainty quantification, multi-modality, as
well as an application with a recent deep forecasting neural network
architecture.
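The objective described in the abstract can be illustrated with a small sketch. This is one plausible reading of the abstract, not the authors' implementation. It assumes a toy linear primary model, a Gaussian likelihood, and a hypothetical posterior model that pushes Gaussian noise through an affine map, so the names `sample_theta`, `gaussian_log_lik`, `neg_log_posterior_predictive` and the parameter tuple `phi` are all illustrative. It covers only the case where posterior samples are independent of the inputs, and it only evaluates the Monte Carlo estimate of the negative log posterior predictive; the gradient descent step over `phi` is omitted.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def sample_theta(phi, rng):
    """Hypothetical posterior model: map Gaussian noise to primary-model parameters.

    phi = (mu_w, s_w, mu_b, s_b) are the posterior-model parameters (an assumption).
    """
    mu_w, s_w, mu_b, s_b = phi
    return (mu_w + s_w * rng.gauss(0.0, 1.0), mu_b + s_b * rng.gauss(0.0, 1.0))


def gaussian_log_lik(y, x, theta, noise_std=0.5):
    """Log-likelihood of y under the toy primary model y = w * x + b + noise."""
    w, b = theta
    resid = y - (w * x + b)
    return -0.5 * math.log(2.0 * math.pi * noise_std ** 2) - 0.5 * (resid / noise_std) ** 2


def neg_log_posterior_predictive(phi, data, num_samples=64, seed=0):
    """Monte Carlo estimate of -sum_n log( (1/S) * sum_s p(y_n | x_n, theta_s) )."""
    rng = random.Random(seed)
    # Draw S parameter samples once; they do not depend on the inputs x.
    thetas = [sample_theta(phi, rng) for _ in range(num_samples)]
    total = 0.0
    for x, y in data:
        total += log_mean_exp([gaussian_log_lik(y, x, th) for th in thetas])
    return -total


# Toy data generated from y = 2x + 1 without noise.
data = [(x, 2.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
good = neg_log_posterior_predictive((2.0, 0.1, 1.0, 0.1), data)
bad = neg_log_posterior_predictive((-2.0, 0.1, 0.0, 0.1), data)
```

In this sketch a posterior model concentrated near the data-generating parameters gives a lower loss (`good`) than one concentrated elsewhere (`bad`), which is the signal a gradient-based optimiser over `phi` would follow.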
Related papers
- Bayesian Inverse Graphics for Few-Shot Concept Learning [3.475273727432576]
We present a Bayesian model of perception that learns using only minimal data.
We show how this representation can be used for downstream tasks such as few-shot classification and estimation.
arXiv Detail & Related papers (2024-09-12T18:30:41Z)
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- A variational neural Bayes framework for inference on intractable posterior distributions [1.0801976288811024]
Posterior distributions of model parameters are efficiently obtained by feeding observed data into a trained neural network.
We show theoretically that our posteriors converge to the true posteriors in Kullback-Leibler divergence.
arXiv Detail & Related papers (2024-04-16T20:40:15Z)
- Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders [22.77397537980102]
We show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior.
We present preliminary results on low-dimensional synthetic data showing that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines.
arXiv Detail & Related papers (2024-03-13T20:16:21Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Model Comparison in Approximate Bayesian Computation [0.456877715768796]
A common problem in natural sciences is the comparison of competing models in the light of observed data.
This framework relies on the calculation of likelihood functions which are intractable for most models used in practice.
I propose a new efficient method to perform Bayesian model comparison in ABC.
arXiv Detail & Related papers (2022-03-15T10:24:16Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.