On Feynman--Kac training of partial Bayesian neural networks
- URL: http://arxiv.org/abs/2310.19608v3
- Date: Tue, 27 Feb 2024 09:35:00 GMT
- Title: On Feynman--Kac training of partial Bayesian neural networks
- Authors: Zheng Zhao and Sebastian Mair and Thomas B. Schön and Jens Sjölund
- Abstract summary: Partial Bayesian neural networks (pBNNs) were shown to perform competitively with full Bayesian neural networks.
We propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman--Kac model.
We show that our proposed training scheme outperforms the state of the art in terms of predictive performance.
- Score: 1.6474447977095783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, partial Bayesian neural networks (pBNNs), which only consider a
subset of the parameters to be stochastic, were shown to perform competitively
with full Bayesian neural networks. However, pBNNs are often multi-modal in the
latent variable space and thus challenging to approximate with parametric
models. To address this problem, we propose an efficient sampling-based
training strategy, wherein the training of a pBNN is formulated as simulating a
Feynman--Kac model. We then describe variations of sequential Monte Carlo
samplers that allow us to simultaneously estimate the parameters and the latent
posterior distribution of this model at a tractable computational cost. Using
various synthetic and real-world datasets we show that our proposed training
scheme outperforms the state of the art in terms of predictive performance.
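To make the idea of "simulating a Feynman--Kac model" with a sequential Monte Carlo (SMC) sampler more concrete, the following is a minimal sketch on a toy problem, not the authors' algorithm or code. It assumes a scalar latent variable z with prior N(0, 1) and observations y_k ~ N(z, 1); the names `log_lik` and `smc_sampler`, the step size, and the particle count are all hypothetical choices made for illustration. Each observation acts as a potential that reweights the particles, followed by resampling and a random-walk Metropolis move. The sampler also accumulates a log-evidence estimate, which is the kind of quantity one could maximise to estimate deterministic parameters alongside the latent posterior.

```python
import numpy as np


def log_lik(z, y):
    # Toy Gaussian observation model y ~ N(z, 1), evaluated for every particle.
    return -0.5 * (y - z) ** 2 - 0.5 * np.log(2.0 * np.pi)


def smc_sampler(ys, n_particles=2000, step=0.5, seed=0):
    """Toy SMC sampler: one observation is assimilated per iteration."""
    rng = np.random.default_rng(seed)
    ys = np.asarray(ys, dtype=float)
    z = rng.standard_normal(n_particles)  # initial particles from the N(0, 1) prior
    log_evidence = 0.0

    for k, y in enumerate(ys):
        # Potential: weight each particle by the likelihood of the new observation.
        logw = log_lik(z, y)
        m = logw.max()
        w = np.exp(logw - m)
        # Incremental evidence estimate (particles are equally weighted here).
        log_evidence += m + np.log(w.mean())
        w /= w.sum()

        # Multinomial resampling.
        z = z[rng.choice(n_particles, size=n_particles, p=w)]

        # Markov move: random-walk Metropolis targeting p(z | y_1, ..., y_{k+1}).
        seen = ys[: k + 1]

        def log_target(v):
            sq = (seen[None, :] - v[:, None]) ** 2
            return -0.5 * v ** 2 - 0.5 * sq.sum(axis=1)

        prop = z + step * rng.standard_normal(n_particles)
        log_u = np.log(rng.uniform(size=n_particles))
        accept = log_u < log_target(prop) - log_target(z)
        z = np.where(accept, prop, z)

    return z, log_evidence


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(1.5, 1.0, size=20)
    particles, log_z = smc_sampler(data)
    # For this conjugate toy model the exact posterior mean is sum(y) / (n + 1).
    print(particles.mean(), data.sum() / (len(data) + 1), log_z)
```

In this conjugate toy model the posterior is Gaussian and known in closed form, so the particle mean can be checked against the exact value; in a pBNN the latent posterior is typically multi-modal, which is where a particle approximation is expected to be more useful than a parametric one.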
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models for Lattice Boltzmann collision operators.
Our work opens the way towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- A variational neural Bayes framework for inference on intractable posterior distributions [1.0801976288811024]
Posterior distributions of model parameters are efficiently obtained by feeding observed data into a trained neural network.
We show theoretically that our posteriors converge to the true posteriors in Kullback-Leibler divergence.
arXiv Detail & Related papers (2024-04-16T20:40:15Z)
- Expressive probabilistic sampling in recurrent neural networks [4.3900330990701235]
We show that firing rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution.
We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling.
arXiv Detail & Related papers (2023-08-22T22:20:39Z)
- BayesFlow: Amortized Bayesian Workflows With Neural Networks [0.0]
This manuscript introduces the Python library BayesFlow for simulation-based training of established neural network architectures for amortized data compression and inference.
Amortized Bayesian inference, as implemented in BayesFlow, enables users to train custom neural networks on model simulations and re-use these networks for any subsequent application of the models.
arXiv Detail & Related papers (2023-06-28T08:41:49Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
- Kalman Bayesian Neural Networks for Closed-form Online Learning [5.220940151628734]
We propose a novel approach for BNN learning via closed-form Bayesian inference.
The calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems.
This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent.
arXiv Detail & Related papers (2021-10-03T07:29:57Z)
- Regularized Sequential Latent Variable Models with Adversarial Neural Networks [33.74611654607262]
We will present different ways of using high-level latent random variables in RNNs to model the variability in sequential data.
We will explore possible ways of using adversarial methods to train a variational RNN model.
arXiv Detail & Related papers (2021-08-10T08:05:14Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.