Related papers: Large-scale Score-based Variational Posterior Inference for Bayesian Deep Neural Networks

Large-scale Score-based Variational Posterior Inference for Bayesian Deep Neural Networks

URL: http://arxiv.org/abs/2602.05873v1
Date: Thu, 05 Feb 2026 16:51:07 GMT
Title: Large-scale Score-based Variational Posterior Inference for Bayesian Deep Neural Networks
Authors: Minyoung Kim,
Abstract summary: We introduce a score-based alternative for BNN variational inference.<n>We propose a novel scalable VI method where the learning objective combines score matching loss and the proximal penalty term in iterations.<n>This in turn makes our method scalable to large-scale neural networks including Vision Transformers, and allows for richer variational density families.
Score: 7.945670209718052
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Bayesian (deep) neural networks (BNN) are often more attractive than the mainstream point-estimate vanilla deep learning in various aspects including uncertainty quantification, robustness to noise, resistance to overfitting, and more. The variational inference (VI) is one of the most widely adopted approximate inference methods. Whereas the ELBO-based variational free energy method is a dominant choice in the literature, in this paper we introduce a score-based alternative for BNN variational inference. Although there have been quite a few score-based variational inference methods proposed in the community, most are not adequate for large-scale BNNs for various computational and technical reasons. We propose a novel scalable VI method where the learning objective combines the score matching loss and the proximal penalty term in iterations, which helps our method avoid the reparametrized sampling, and allows for noisy unbiased mini-batch scores through stochastic gradients. This in turn makes our method scalable to large-scale neural networks including Vision Transformers, and allows for richer variational density families. On several benchmarks including visual recognition and time-series forecasting with large-scale deep networks, we empirically show the effectiveness of our approach.

Related papers

Quantification of Uncertainties in Probabilistic Deep Neural Network by Implementing Boosting of Variational Inference [0.38366697175402226]
Boosted Bayesian Neural Networks (BBNN) is a novel approach that enhances neural network weight distribution approximations.<n>BBNN achieves 5% higher accuracy compared to conventional neural networks.
arXiv Detail & Related papers (2025-03-18T05:11:21Z)
Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution. We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors. Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs. We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
Variational Inference: Posterior Threshold Improves Network Clustering Accuracy in Sparse Regimes [2.5782420501870296]
This paper proposes a simple way to improve the variational inference method by hard thresholding the posterior of the community assignment after each iteration. We show that the proposed method converges and can accurately recover the true community labels, even when the average node degree of the network is bounded.
arXiv Detail & Related papers (2023-01-12T00:24:54Z)
Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called Variational Neural Network (VNN) VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers. In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer [55.20627066525205]
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models. Our proposed VB approach can obtain good improvements on target devices, and consistently outperforms 13 state-of-the-art knowledge transfer algorithms.
arXiv Detail & Related papers (2021-10-16T15:54:01Z)
Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference. Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs) The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.