Data Subsampling for Bayesian Neural Networks
- URL: http://arxiv.org/abs/2210.09141v1
- Date: Mon, 17 Oct 2022 14:43:35 GMT
- Title: Data Subsampling for Bayesian Neural Networks
- Authors: Eiji Kawasaki, Markus Holzmann
- Abstract summary: Penalty Bayesian Neural Networks - PBNNs - achieve good predictive performance for a given mini-batch size.
Varying the size of the mini-batches enables a natural calibration of the predictive distribution.
We expect PBNN to be particularly suited for cases when data sets are distributed across multiple decentralized devices.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Markov Chain Monte Carlo (MCMC) algorithms do not scale well for large
datasets, leading to difficulties in Neural Network posterior sampling. In this
paper, we apply a generalization of the Metropolis Hastings algorithm that
allows us to restrict the evaluation of the likelihood to small mini-batches in
a Bayesian inference context. Since it requires the computation of a so-called
"noise penalty" determined by the variance of the training loss function over
the mini-batches, we refer to this data subsampling strategy as Penalty
Bayesian Neural Networks - PBNNs. Its implementation on top of MCMC is
straightforward, as the variance of the loss function merely reduces the
acceptance probability. Compared with other samplers, we empirically show that
PBNN achieves good predictive performance for a given mini-batch size. Varying
the size of the mini-batches enables a natural calibration of the predictive
distribution and provides an inbuilt protection against overfitting. We expect
PBNN to be particularly well suited to cases where data sets are distributed across
multiple decentralized devices, as is typical in federated learning.
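To make the mechanism concrete, here is a minimal, self-contained sketch of such a penalized Metropolis step on a toy problem (inferring the mean of a unit-variance Gaussian under a flat prior). The toy model, batch sizes, function names, and the specific choice of subtracting half the estimated variance from the log acceptance ratio are illustrative assumptions in the spirit of the abstract, not the authors' implementation.

```python
import numpy as np

# Toy data: 1,000 points from a unit-variance Gaussian whose mean we want to infer.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=1.0, size=1_000)
N, batch_size, n_batches = len(data), 100, 10

def loglike(theta, batch):
    # Per-batch log-likelihood of a unit-variance Gaussian with mean theta (up to a constant).
    return -0.5 * np.sum((batch - theta) ** 2)

def penalized_mh_step(theta, step=0.02):
    # Symmetric random-walk proposal.
    theta_new = theta + step * rng.standard_normal()

    # Estimate the log-likelihood ratio on several independent mini-batches.
    deltas = np.empty(n_batches)
    for i in range(n_batches):
        idx = rng.choice(N, size=batch_size, replace=False)
        deltas[i] = loglike(theta_new, data[idx]) - loglike(theta, data[idx])

    scale = N / batch_size                      # rescale the mini-batch estimate to the full data set
    delta_hat = scale * deltas.mean()           # noisy estimate of the full-data log ratio
    sigma2 = scale**2 * deltas.var(ddof=1) / n_batches  # its estimated variance: the "noise penalty"

    # The variance enters only as a penalty that lowers the acceptance probability.
    log_alpha = delta_hat - 0.5 * sigma2
    return theta_new if np.log(rng.uniform()) < min(0.0, log_alpha) else theta

theta = 0.0
for _ in range(2_000):
    theta = penalized_mh_step(theta)
print(f"final state of the chain: {theta:.3f} (data were generated with mean 1.5)")
```

Because the penalty grows with the variance of the mini-batch estimate of the log-likelihood ratio, noisier (smaller or fewer) mini-batches can only lower the acceptance probability, matching the abstract's remark that the variance of the loss merely reduces the acceptance probability.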
Related papers
- Favour: FAst Variance Operator for Uncertainty Rating [0.034530027457862]
Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions.
By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference.
Previous work proposed propagating the first and second moments of the posterior directly through the network.
However, this method is even slower than sampling, so the propagated variance needs to be approximated.
Our contribution is a more principled variance propagation framework.
arXiv Detail & Related papers (2023-11-21T22:53:20Z)
- Collapsed Inference for Bayesian Deep Learning [36.1725075097107]
We introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples.
A collapsed sample represents uncountably many models drawn from the approximate posterior.
Our proposed use of collapsed samples achieves a balance between scalability and accuracy.
arXiv Detail & Related papers (2023-06-16T08:34:42Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
- Approximate blocked Gibbs sampling for Bayesian neural networks [1.7259824817932292]
In this work, it is proposed to sample subgroups of parameters via a blocked Gibbs sampling scheme.
It is also possible to alleviate vanishing acceptance rates for increasing depth by reducing the proposal variance in deeper layers.
An open problem is how to perform minibatch MCMC sampling for feedforward neural networks in the presence of augmented data.
arXiv Detail & Related papers (2022-08-24T09:26:12Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Simpler Certified Radius Maximization by Propagating Covariances [39.851641822878996]
We show an algorithm for maximizing the certified radius on datasets including CIFAR-10, ImageNet, and Places365.
We show how satisfying certain criteria yields such an algorithm for networks of moderate depth, with a small compromise in overall accuracy.
arXiv Detail & Related papers (2021-04-13T01:38:36Z)
- Rapid Risk Minimization with Bayesian Models Through Deep Learning Approximation [9.93116974480156]
We introduce a novel combination of Bayesian Models (BMs) and Neural Networks (NNs) for making predictions with a minimum expected risk.
Our approach combines the data efficiency and interpretability of a BM with the speed of a NN.
We achieve risk minimized predictions significantly faster than standard methods with a negligible loss on the testing dataset.
arXiv Detail & Related papers (2021-03-29T15:08:25Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolutional Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)