Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
- URL: http://arxiv.org/abs/2011.07439v1
- Date: Sun, 15 Nov 2020 03:27:54 GMT
- Title: Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
- Authors: Jincheng Bai, Qifan Song, Guang Cheng
- Abstract summary: Sparse deep learning aims to address the challenge of huge storage consumption by deep neural networks.
In this paper, we train sparse deep neural networks with a fully Bayesian treatment under spike-and-slab priors.
We develop a set of computationally efficient variational inferences via continuous relaxation of the Bernoulli distribution.
- Score: 20.294908538266867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse deep learning aims to address the challenge of huge storage
consumption by deep neural networks, and to recover the sparse structure of
target functions. Although tremendous empirical successes have been achieved,
most sparse deep learning algorithms lack theoretical support. On the other
hand, another line of work has proposed theoretical frameworks that are
computationally infeasible. In this paper, we train sparse deep neural networks
with a fully Bayesian treatment under spike-and-slab priors, and develop a set
of computationally efficient variational inferences via continuous relaxation
of the Bernoulli distribution. The variational posterior contraction rate is
provided, which justifies the consistency of the proposed variational Bayes
method. Notably, our empirical results demonstrate that this variational
procedure provides uncertainty quantification in terms of the Bayesian
predictive distribution and is also capable of accomplishing consistent
variable selection by training a sparse multi-layer neural network.
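To make the continuous-relaxation idea concrete, below is a minimal PyTorch sketch of the binary Concrete (Gumbel-Softmax) relaxation of Bernoulli inclusion variables gating Gaussian "slab" weights. This illustrates the general technique rather than the authors' implementation; the layer name, initialization, and temperature value are assumptions.

```python
import torch

def relaxed_bernoulli(logits, temperature=0.5):
    # Binary Concrete / Gumbel-Softmax relaxation: a differentiable
    # surrogate for z ~ Bernoulli(sigmoid(logits)), so inclusion
    # probabilities can be learned by backpropagation. The temperature
    # is an assumed hyperparameter; smaller values give harder samples.
    u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
    logistic_noise = torch.log(u) - torch.log(1 - u)
    return torch.sigmoid((logits + logistic_noise) / temperature)

class SpikeAndSlabLinear(torch.nn.Module):
    # Illustrative variational layer (hypothetical name): a Gaussian
    # "slab" weight sample gated by a relaxed Bernoulli "spike" mask.
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.mu = torch.nn.Parameter(0.01 * torch.randn(out_dim, in_dim))
        self.log_sigma = torch.nn.Parameter(torch.full((out_dim, in_dim), -5.0))
        self.z_logits = torch.nn.Parameter(torch.zeros(out_dim, in_dim))

    def forward(self, x):
        z = relaxed_bernoulli(self.z_logits)          # soft inclusion mask
        w = self.mu + self.log_sigma.exp() * torch.randn_like(self.mu)
        return torch.nn.functional.linear(x, z * w)  # sparse weight sample
```

In a full training loop, the mask logits would be learned jointly with the Gaussian parameters by maximizing an evidence lower bound under the spike-and-slab prior; thresholding sigmoid(z_logits) after training yields a sparse network.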
Related papers
- Implicit Generative Prior for Bayesian Neural Networks [8.013264410621357]
We propose a novel neural adaptive empirical Bayes (NA-EB) framework for complex data structures.
The proposed NA-EB framework combines variational inference with a gradient ascent algorithm.
We demonstrate the practical applications of our framework through extensive evaluations on a variety of tasks.
arXiv Detail & Related papers (2024-04-27T21:00:38Z) - Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z) - Tractable Function-Space Variational Inference in Bayesian Neural
Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z) - Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important for forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Improved uncertainty quantification for neural networks with Bayesian
last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z) - Variational Bayes Deep Operator Network: A data-driven Bayesian solver
for parametric differential equations [0.0]
We propose Variational Bayes DeepONet (VB-DeepONet) for operator learning.
VB-DeepONet uses variational inference to take into account high dimensional posterior distributions.
arXiv Detail & Related papers (2022-06-12T04:20:11Z) - Scalable computation of prediction intervals for neural networks via
matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z) - Multivariate Deep Evidential Regression [77.34726150561087]
A new approach with uncertainty-aware neural networks shows promise over traditional deterministic methods.
We discuss three issues with a previously proposed solution for extracting aleatoric and epistemic uncertainties from regression-based neural networks.
arXiv Detail & Related papers (2021-04-13T12:20:18Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Scalable Bayesian neural networks by layer-wise input augmentation [20.279668821097918]
We introduce implicit Bayesian neural networks, a simple and scalable approach for uncertainty representation in deep learning.
We present appropriate input distributions and demonstrate state-of-the-art performance in terms of calibration, robustness and uncertainty characterisation over large-scale, multi-million parameter image classification tasks.
arXiv Detail & Related papers (2020-10-26T11:45:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.