Related papers: Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout

Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout

URL: http://arxiv.org/abs/2102.07927v1
Date: Tue, 16 Feb 2021 02:33:43 GMT
Title: Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout
Authors: Son Nguyen and Duong Nguyen and Khai Nguyen and Nhat Ho and Khoat Than and Hung Bui
Abstract summary: We introduce a new variational structured approximation inspired by the interpretation of Dropout training as approximate inference in Bayesian networks. We then propose a novel method called Variational Structured Dropout (VSD) to overcome this limitation. We conduct experiments on standard benchmarks to demonstrate the effectiveness of VSD over state-of-the-art methods on both predictive accuracy and uncertainty estimation.
Score: 19.16094166903702
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Approximate inference in deep Bayesian networks exhibits a dilemma of how to yield high fidelity posterior approximations while maintaining computational efficiency and scalability. We tackle this challenge by introducing a new variational structured approximation inspired by the interpretation of Dropout training as approximate inference in Bayesian probabilistic models. Concretely, we focus on restrictions of the factorized structure of Dropout posterior which is inflexible to capture rich correlations among weight parameters of the true posterior, and we then propose a novel method called Variational Structured Dropout (VSD) to overcome this limitation. VSD employs an orthogonal transformation to learn a structured representation on the variational Dropout noise and consequently induces statistical dependencies in the approximate posterior. We further gain expressive Bayesian modeling for VSD via proposing a hierarchical Dropout procedure that corresponds to the joint inference in a Bayesian network. Moreover, we can scale up VSD to modern deep convolutional networks in a direct way with a low computational cost. Finally, we conduct extensive experiments on standard benchmarks to demonstrate the effectiveness of VSD over state-of-the-art methods on both predictive accuracy and uncertainty estimation.

Related papers

Conditional diffusions for amortized neural posterior estimation [8.37884129644711]
We show the effectiveness of conditional diffusions coupled with high-capacity summary networks for amortized Neural posterior estimation. Our results show improved stability, superior accuracy, and faster training times, even with simpler, shallower models.
arXiv Detail & Related papers (2024-10-24T19:13:13Z)
Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems. This work considers AD in network flows using incomplete measurements. We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective. Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
Amortized Posterior Sampling with Diffusion Prior Distillation [55.03585818289934]
Amortized Posterior Sampling is a novel variational inference approach for efficient posterior sampling in inverse problems.<n>Our method trains a conditional flow model to minimize the divergence between the variational distribution and the posterior distribution implicitly defined by the diffusion model.<n>Unlike existing methods, our approach is unsupervised, requires no paired training data, and is applicable to both Euclidean and non-Euclidean domains.
arXiv Detail & Related papers (2024-07-25T09:53:12Z)
Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution. We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors. Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
Improving Transferability of Adversarial Examples via Bayesian Attacks [84.90830931076901]
We introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters. Our method achieves a new state-of-the-art on transfer-based attacks, improving the average success rate on ImageNet and CIFAR-10 by 19.14% and 2.08%, respectively.
arXiv Detail & Related papers (2023-07-21T03:43:07Z)
Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables. We present a novel approach that addresses the limited posterior expressiveness of fully-factorized Gaussian assumption. We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
Accelerated replica exchange stochastic gradient Langevin diffusion enhanced Bayesian DeepONet for solving noisy parametric PDEs [7.337247167823921]
We propose a training framework for replica-exchange Langevin diffusion that exploits the neural network architecture of DeepONets. We show that the proposed framework's exploration and exploitation capabilities enable improved training convergence for DeepONets in noisy scenarios. We also show that replica-exchange Langeving Diffusion also improves the DeepONet's mean prediction accuracy in noisy scenarios.
arXiv Detail & Related papers (2021-11-03T19:23:59Z)
Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference. We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer [55.20627066525205]
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models. Our proposed VB approach can obtain good improvements on target devices, and consistently outperforms 13 state-of-the-art knowledge transfer algorithms.
arXiv Detail & Related papers (2021-10-16T15:54:01Z)
Calibration and Uncertainty Quantification of Bayesian Convolutional Neural Networks for Geophysical Applications [0.0]
It is common to incorporate the uncertainty of predictions such subsurface models should provide calibrated probabilities and the associated uncertainties in their predictions. It has been shown that popular Deep Learning-based models are often miscalibrated, and due to their deterministic nature, provide no means to interpret the uncertainty of their predictions. We compare three different approaches obtaining probabilistic models based on convolutional neural networks in a Bayesian formalism.
arXiv Detail & Related papers (2021-05-25T17:54:23Z)
Towards Trustworthy Predictions from Deep Neural Networks with Fast Adversarial Calibration [2.8935588665357077]
We propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift. We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions.
arXiv Detail & Related papers (2020-12-20T13:39:29Z)
An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d) This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters. LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.