Utilising Gradient-Based Proposals Within Sequential Monte Carlo Samplers for Training of Partial Bayesian Neural Networks
- URL: http://arxiv.org/abs/2505.03797v1
- Date: Thu, 01 May 2025 20:05:38 GMT
- Title: Utilising Gradient-Based Proposals Within Sequential Monte Carlo Samplers for Training of Partial Bayesian Neural Networks
- Authors: Andrew Millard, Joshua Murphy, Simon Maskell, Zheng Zhao
- Abstract summary: Partial Bayesian neural networks (pBNNs) have been shown to perform competitively with fully Bayesian neural networks. We introduce a new SMC-based training method for pBNNs by utilising a guided proposal and incorporating gradient-based Markov kernels. We show that our new method outperforms the state-of-the-art in terms of predictive performance and optimal loss.
- Score: 3.2254941904559917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Partial Bayesian neural networks (pBNNs) have been shown to perform competitively with fully Bayesian neural networks while only having a subset of the parameters be stochastic. Using sequential Monte Carlo (SMC) samplers as the inference method for pBNNs gives a non-parametric probabilistic estimation of the stochastic parameters, and has shown improved performance over parametric methods. In this paper we introduce a new SMC-based training method for pBNNs by utilising a guided proposal and incorporating gradient-based Markov kernels, which gives us better scalability on high dimensional problems. We show that our new method outperforms the state-of-the-art in terms of predictive performance and optimal loss. We also show that pBNNs scale well with larger batch sizes, resulting in significantly reduced training times and often better performance.
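To make the abstract's description concrete, the following is a minimal, illustrative Python/NumPy sketch of an SMC sampler that updates only the stochastic parameters of a toy pBNN using a gradient-informed Markov move. It is not the authors' implementation: the toy regression model, the fixed deterministic feature layer, the data-batch tempering schedule, and the unadjusted Langevin move (standing in for the guided, Metropolis-adjusted kernels described in the paper) are simplifying assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# ---- Toy data and a fixed "deterministic" part of the network ---------------
N, D_IN, D_FEAT = 400, 5, 20
X = rng.normal(size=(N, D_IN))
W_det = rng.normal(size=(D_IN, D_FEAT))            # deterministic weights, held fixed here
features = np.tanh(X @ W_det)                      # phi(x): output of the deterministic layers
true_w = rng.normal(size=D_FEAT)                   # "true" stochastic-layer weights
y = features @ true_w + 0.5 * rng.normal(size=N)   # regression targets

SIGMA_Y, SIGMA_PRIOR = 0.5, 1.0                    # observation noise and prior scale

def log_lik(w, phi, t):
    """Gaussian log-likelihood of a data batch under stochastic weights w."""
    resid = t - phi @ w
    return -0.5 * np.sum(resid ** 2) / SIGMA_Y ** 2

def grad_log_post(w, phi, t):
    """Gradient of log prior plus log likelihood of the data seen so far."""
    return phi.T @ (t - phi @ w) / SIGMA_Y ** 2 - w / SIGMA_PRIOR ** 2

# ---- SMC sampler over the stochastic layer only ------------------------------
P, BATCH, STEP, K_MOVES = 200, 50, 1e-4, 10        # particles, batch size, Langevin step, moves
particles = rng.normal(scale=SIGMA_PRIOR, size=(P, D_FEAT))  # initialise from the prior
log_w = np.zeros(P)

for start in range(0, N, BATCH):
    phi_b, y_b = features[start:start + BATCH], y[start:start + BATCH]
    phi_seen, y_seen = features[:start + BATCH], y[:start + BATCH]

    # 1) Reweight: the incremental weight is the likelihood of the new batch.
    log_w += np.array([log_lik(w, phi_b, y_b) for w in particles])
    w_norm = np.exp(log_w - log_w.max())
    w_norm /= w_norm.sum()

    # 2) Resample when the effective sample size collapses.
    if 1.0 / np.sum(w_norm ** 2) < P / 2:
        idx = rng.choice(P, size=P, p=w_norm)
        particles, log_w = particles[idx], np.zeros(P)

    # 3) Gradient-based Markov move: unadjusted Langevin steps targeting the
    #    posterior of the data seen so far (a simple stand-in for the guided,
    #    Metropolis-adjusted kernels discussed in the abstract).
    for _ in range(K_MOVES):
        grads = np.array([grad_log_post(w, phi_seen, y_seen) for w in particles])
        particles = (particles + STEP * grads
                     + np.sqrt(2 * STEP) * rng.normal(size=particles.shape))

w_final = np.exp(log_w - log_w.max())
w_final /= w_final.sum()
post_mean = w_final @ particles                    # weighted posterior-mean estimate
print("L2 error of posterior-mean weights:", np.linalg.norm(post_mean - true_w))
```

In the full method the deterministic weights are also trained (the Feynman--Kac formulation cited below optimises them alongside the particle approximation of the stochastic subset); the sketch freezes them to keep the focus on the SMC mechanics.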
Related papers
- Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo [32.46884330460211]
We propose a simple yet effective approach to enhance sample diversity in stochastic gradient Markov chain Monte Carlo (SGMCMC). This approach produces a more diverse set of samples, allowing faster mixing within the same computational budget. Our experiments on image classification tasks, including OOD robustness, diversity, loss surface analyses, and a comparative study with Hamiltonian Monte Carlo, demonstrate the superiority of the proposed approach.
arXiv Detail & Related papers (2025-03-02T02:42:50Z)
- Compact Bayesian Neural Networks via pruned MCMC sampling [0.16777183511743468]
Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. In this study, we address some of these challenges by leveraging MCMC sampling with network pruning to obtain compact probabilistic models. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution while retaining training and generalisation accuracy by adapting post-pruning resampling.
arXiv Detail & Related papers (2025-01-12T22:48:04Z)
- On Feynman--Kac training of partial Bayesian neural networks [1.6474447977095783]
Partial Bayesian neural networks (pBNNs) were shown to perform competitively with full Bayesian neural networks.
We propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman--Kac model.
We show that our proposed training scheme outperforms the state of the art in terms of predictive performance.
arXiv Detail & Related papers (2023-10-30T15:03:15Z)
- On the optimization and pruning for Bayesian deep learning [1.0152838128195467]
We propose a new adaptive variational Bayesian algorithm to train neural networks on weight space.
The EM-MCMC algorithm allows us to perform optimization and model pruning in one shot.
Our dense model can reach state-of-the-art performance, and our sparse model performs very well compared to previously proposed pruning schemes.
arXiv Detail & Related papers (2022-10-24T05:18:08Z)
- Data Subsampling for Bayesian Neural Networks [0.0]
Penalty Bayesian Neural Networks (PBNNs) are a new approach that allows the likelihood to be evaluated using subsampled batch data.
We show that PBNN achieves good predictive performance even for small mini-batch sizes of data.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
- Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling [48.94555574632823]
Repriorisation transforms a Bayesian neural network (BNN) posterior to a distribution whose KL divergence to the BNN prior vanishes as layer widths grow.
We develop a Markov chain Monte Carlo (MCMC) posterior sampling algorithm which mixes faster the wider the BNN.
We observe up to 50x higher effective sample size relative to no reparametrisation for both fully-connected and residual networks.
arXiv Detail & Related papers (2022-06-15T17:11:08Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks [77.34726150561087]
We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks.
We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
arXiv Detail & Related papers (2020-08-25T15:48:15Z)
- Bayesian Graph Neural Networks with Adaptive Connection Sampling [62.51689735630133]
We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs).
The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs.
arXiv Detail & Related papers (2020-06-07T07:06:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.