Related papers: Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics

Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics

URL: http://arxiv.org/abs/2410.19780v1
Date: Mon, 14 Oct 2024 13:47:02 GMT
Title: Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics
Authors: Daniel Paulin, Peter A. Whalley, Neil K. Chada, Benedict Leimkuhler,
Abstract summary: We propose a scalable kinetic Langevin dynamics algorithm for sampling parameter spaces of big data and AI applications. We show that the resulting Symmetric Minibatch Splitting-UBU (SMS-UBU) integrator has bias $O(h2 d1/2)$ in dimension $d>0$ with stepsize $h>0$. We apply the algorithm to explore local modes of the posterior distribution of Bayesian neural networks (BNNs) and evaluate the calibration performance of the posterior predictive probabilities for neural networks with convolutional neural network architectures.
Score: 0.8749675983608172
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a scalable kinetic Langevin dynamics algorithm for sampling parameter spaces of big data and AI applications. Our scheme combines a symmetric forward/backward sweep over minibatches with a symmetric discretization of Langevin dynamics. For a particular Langevin splitting method (UBU), we show that the resulting Symmetric Minibatch Splitting-UBU (SMS-UBU) integrator has bias $O(h^2 d^{1/2})$ in dimension $d>0$ with stepsize $h>0$, despite only using one minibatch per iteration, thus providing excellent control of the sampling bias as a function of the stepsize. We apply the algorithm to explore local modes of the posterior distribution of Bayesian neural networks (BNNs) and evaluate the calibration performance of the posterior predictive probabilities for neural networks with convolutional neural network architectures for classification problems on three different datasets (Fashion-MNIST, Celeb-A and chest X-ray). Our results indicate that BNNs sampled with SMS-UBU can offer significantly better calibration performance compared to standard methods of training and stochastic weight averaging.

Related papers

Utilising Gradient-Based Proposals Within Sequential Monte Carlo Samplers for Training of Partial Bayesian Neural Networks [3.2254941904559917]
Partial Bayesian neural networks (pBNNs) have been shown to perform competitively with fully Bayesian neural networks.<n>We introduce a new SMC-based training method for pBNNs by utilising a guided proposal and incorporating gradient-based Markov kernels.<n>We show that our new method outperforms the state-of-the-art in terms of predictive performance and optimal loss.
arXiv Detail & Related papers (2025-05-01T20:05:38Z)
Chaos into Order: Neural Framework for Expected Value Estimation of Stochastic Partial Differential Equations [0.9944647907864256]
We propose a physics-informed neural framework designed to approximate the expected value of linear partial differential equations (SPDEs)<n>By leveraging randomized sampling of both space-time coordinates and noise realizations during training, LEC trains standard feedforward neural networks to minimize residual loss across multiple samples.<n>We show that the model consistently learns accurate approximations of the expected value of the solution in lower dimensions and a predictable decrease in robustness with increased spatial dimensions.
arXiv Detail & Related papers (2025-02-05T23:27:28Z)
Compact Bayesian Neural Networks via pruned MCMC sampling [0.16777183511743468]
Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. In this study, we address some of the challenges by leveraging MCMC sampling with network pruning to obtain compact probabilistic models. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution while retaining the model training and generalisation performance accuracy by adapting post-pruning resampling.
arXiv Detail & Related papers (2025-01-12T22:48:04Z)
Parallel simulation for sampling under isoperimetry and score-based diffusion models [56.39904484784127]
As data size grows, reducing the iteration cost becomes an important goal. Inspired by the success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks. Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.
arXiv Detail & Related papers (2024-12-10T11:50:46Z)
Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs) Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models Lattice Boltzmann collision operators. Our work opens towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
On Feynman--Kac training of partial Bayesian neural networks [1.6474447977095783]
Partial Bayesian neural networks (pBNNs) were shown to perform competitively with full Bayesian neural networks. We propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman--Kac model. We show that our proposed training scheme outperforms the state of the art in terms of predictive performance.
arXiv Detail & Related papers (2023-10-30T15:03:15Z)
Data Subsampling for Bayesian Neural Networks [0.0]
Penalty Bayesian Neural Networks - PBNNs - are a new algorithm that allows the evaluation of the likelihood using subsampled batch data. We show that PBNN achieves good predictive performance even for small mini-batch sizes of data.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex. In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling [48.94555574632823]
Repriorisation transforms a Bayesian neural network (BNN) posterior to a distribution whose KL divergence to the BNN prior vanishes as layer widths grow. We develop a Markov chain Monte Carlo (MCMC) posterior sampling algorithm which mixes faster the wider the BNN. We observe up to 50x higher effective sample size relative to no reparametrisation for both fully-connected and residual networks.
arXiv Detail & Related papers (2022-06-15T17:11:08Z)
A PAC-Bayes oracle inequality for sparse neural networks [0.0]
We study the Gibbs posterior distribution for sparse deep neural nets in a nonparametric regression setting. We prove an oracle inequality which shows that the method adapts to the unknown regularity and hierarchical structure of the regression function.
arXiv Detail & Related papers (2022-04-26T15:48:24Z)
Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
Deep Networks for Direction-of-Arrival Estimation in Low SNR [89.45026632977456]
We introduce a Convolutional Neural Network (CNN) that is trained from mutli-channel data of the true array manifold matrix. We train a CNN in the low-SNR regime to predict DoAs across all SNRs. Our robust solution can be applied in several fields, ranging from wireless array sensors to acoustic microphones or sonars.
arXiv Detail & Related papers (2020-11-17T12:52:18Z)
Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand. We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice. Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network. Our model requires a much less number of communication rounds and still a number of communication rounds in theory. Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.