Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for
Bayesian Deep Learning
- URL: http://arxiv.org/abs/2202.03770v1
- Date: Tue, 8 Feb 2022 10:34:05 GMT
- Authors: Meet P. Vadera, Adam D. Cobb, Brian Jalaian, Benjamin M. Marlin
- Abstract summary: We investigate the potential of sparse network structures to flexibly trade off storage costs and inference run time.
We show that certain classes of randomly selected substructures can perform as well as substructures derived from state-of-the-art iterative pruning methods.
- Score: 15.521736934292354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian methods hold significant promise for improving the uncertainty
quantification ability and robustness of deep neural network models. Recent
research has seen the investigation of a number of approximate Bayesian
inference methods for deep neural networks, building on both the variational
Bayesian and Markov chain Monte Carlo (MCMC) frameworks. A fundamental issue
with MCMC methods is that the improvements they enable are obtained at the
expense of increased computation time and model storage costs. In this paper,
we investigate the potential of sparse network structures to flexibly trade off
model storage costs and inference run time against predictive performance and
uncertainty quantification ability. We use stochastic gradient MCMC methods as
the core Bayesian inference method and consider a variety of approaches for
selecting sparse network structures. Surprisingly, our results show that
certain classes of randomly selected substructures can perform as well as
substructures derived from state-of-the-art iterative pruning methods while
drastically reducing model training times.
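The core idea of sampling over only a fixed sparse substructure can be sketched on a toy model. The following is a minimal, hypothetical illustration using full-batch Langevin dynamics on Bayesian linear regression with an arbitrary fixed mask; it is not the paper's implementation, which uses stochastic-gradient MCMC on deep networks with masks chosen randomly or via iterative pruning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayesian linear regression: y = X @ w + noise, sigma = 0.1.
X = rng.normal(size=(100, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=100)

# A fixed sparse substructure: only half of the parameters are active.
# (Illustrative choice; the paper compares random and pruning-derived masks.)
mask = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0], dtype=bool)

def grad_log_post(w):
    # Gradient of the Gaussian log-likelihood plus a standard normal log-prior.
    return X.T @ (y - X @ w) / 0.1**2 - w

# Langevin dynamics restricted to the masked substructure.
w = np.zeros(10)
step = 1e-4
samples = []
for t in range(2000):
    w = w + 0.5 * step * grad_log_post(w) + np.sqrt(step) * rng.normal(size=10)
    w *= mask  # inactive weights are never updated
    if t >= 1000:  # discard burn-in
        samples.append(w.copy())

posterior_mean = np.mean(samples, axis=0)
```

Only the active entries of each sample need to be stored, which is the source of the storage savings the abstract refers to.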
Related papers
- Amortized Bayesian Multilevel Models [9.831471158899644]
Multilevel models (MLMs) are a central building block of the Bayesian workflow.
MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints.
Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks.
We explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets.
arXiv Detail & Related papers (2024-08-23T17:11:04Z)
- Linear Noise Approximation Assisted Bayesian Inference on Mechanistic Model of Partially Observed Stochastic Reaction Network [2.325005809983534]
This paper develops an efficient Bayesian inference approach for a partially observed enzymatic stochastic reaction network (SRN).
An interpretable linear noise approximation (LNA) metamodel is proposed to approximate the likelihood of observations.
An efficient posterior sampling approach is developed by utilizing the gradients of the derived likelihood to speed up the convergence of Markov Chain Monte Carlo.
arXiv Detail & Related papers (2024-05-05T01:54:21Z)
- Fast Value Tracking for Deep Reinforcement Learning [7.648784748888187]
Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment.
Existing algorithms often view these problems as static, focusing on point estimates for model parameters to maximize expected rewards.
Our research leverages the Kalman paradigm to introduce a novel uncertainty quantification and sampling algorithm called Langevinized Kalman Temporal-Difference (LKTD).
arXiv Detail & Related papers (2024-03-19T22:18:19Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
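As a point of reference for how sequential Monte Carlo operates, here is a minimal bootstrap particle filter on a hypothetical linear-Gaussian state-space model. This is only the classical building block, not VSMC itself, which additionally learns the proposal distribution variationally.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical linear-Gaussian state-space model:
#   x_t = 0.9 x_{t-1} + w_t,   y_t = x_t + v_t,   w_t, v_t ~ N(0, 0.5^2)
T, N = 50, 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.9 * x[t - 1] + rng.normal(0, 0.5)
    y[t] = x[t] + rng.normal(0, 0.5)

particles = rng.normal(0.0, 1.0, size=N)
log_evidence = 0.0  # running (unnormalized) log-likelihood estimate
filter_means = []
for t in range(1, T):
    # Propagate through the transition model (bootstrap proposal).
    particles = 0.9 * particles + rng.normal(0, 0.5, size=N)
    logw = -0.5 * ((y[t] - particles) / 0.5) ** 2  # Gaussian obs log-weight
    w = np.exp(logw - logw.max())
    log_evidence += logw.max() + np.log(w.mean())
    w /= w.sum()
    filter_means.append(np.sum(w * particles))  # weighted filtering mean
    particles = rng.choice(particles, size=N, p=w)  # multinomial resampling
```

The running `log_evidence` is the quantity whose variational lower bound VSMC optimizes to adapt the proposal.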
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Bayesian neural networks via MCMC: a Python-based tutorial [0.196629787330046]
Variational inference and Markov chain Monte Carlo sampling methods are used to implement Bayesian inference.
This tutorial provides code in Python with data and instructions that enable their use and extension.
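For intuition, a Bayesian neural network posterior can be sampled with plain random-walk Metropolis on a tiny model. This is a hedged sketch, not the tutorial's code; the network size, priors, and proposal scale below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data for a one-hidden-layer tanh network (3 hidden units).
X = rng.uniform(-2.0, 2.0, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

def log_post(theta, sigma=0.1):
    # Unnormalized log-posterior: Gaussian likelihood + N(0, 1) weight prior.
    W1, b1 = theta[:3].reshape(1, 3), theta[3:6]
    W2, b2 = theta[6:9], theta[9]
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return -0.5 * np.sum((y - pred) ** 2) / sigma**2 - 0.5 * np.sum(theta**2)

# Random-walk Metropolis over all 10 weights.
theta = rng.normal(size=10)
lp = log_post(theta)
samples, accepts = [], 0
for _ in range(5000):
    prop = theta + 0.05 * rng.normal(size=10)
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:  # Metropolis accept/reject
        theta, lp = prop, lp_prop
        accepts += 1
    samples.append(theta.copy())

accept_rate = accepts / 5000
```

Gradient-based samplers such as Langevin dynamics scale far better to real network sizes; random-walk proposals are shown only because they make the accept/reject mechanics explicit.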
arXiv Detail & Related papers (2023-04-02T02:19:15Z)
- FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation [69.34011200590817]
We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
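The mechanism of one shared backbone whose activations are scaled and shifted per ensemble member can be sketched as follows. This is a minimal hypothetical example with random parameters; the actual method trains the FiLM parameters jointly with the backbone.

```python
import numpy as np

rng = np.random.default_rng(2)

def film(h, gamma, beta):
    # Feature-wise linear modulation: scale and shift each channel.
    return gamma * h + beta

# One shared hidden layer; only (gamma, beta) differ across members.
n_members, d_in, d_hid, n_classes = 3, 4, 8, 2
W = rng.normal(size=(d_in, d_hid))
W_out = rng.normal(size=(d_hid, n_classes))
gammas = rng.normal(1.0, 0.5, size=(n_members, d_hid))
betas = rng.normal(0.0, 0.5, size=(n_members, d_hid))

def member_logits(x, m):
    h = np.maximum(x @ W, 0.0)           # shared ReLU features
    return film(h, gammas[m], betas[m]) @ W_out

x = rng.normal(size=(5, d_in))
logits = np.stack([member_logits(x, m) for m in range(n_members)])
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
ensemble_probs = probs.mean(axis=0)      # average member predictions
```

Because only the per-member `(gamma, beta)` vectors are duplicated, the memory overhead relative to a single network is negligible, which is what makes the ensemble "implicit".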
arXiv Detail & Related papers (2022-05-31T18:33:15Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard and deep ensembles.
We also show that deep distributions are as close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Bayesian graph convolutional neural networks via tempered MCMC [0.41998444721319217]
Deep learning models, such as convolutional neural networks, have long been applied to image and multi-media tasks.
More recently, there has been more attention to unstructured data that can be represented via graphs.
These types of data are often found in health and medicine, social networks, and research data repositories.
arXiv Detail & Related papers (2021-04-17T04:03:25Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
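A bare-bones version of such a message-passing component might look like the following: a single forward sweep of min-sum messages (max-product in negative-log space) with a truncated linear pairwise cost on a chain. This is a sketch under stated assumptions, not the paper's BP-Layer, which is differentiable and runs full sweeps inside a CNN.

```python
import numpy as np

rng = np.random.default_rng(4)

# Chain labeling problem: 4 nodes, 3 labels, random unary (data) costs.
n_nodes, n_labels = 4, 3
unary = rng.uniform(0.0, 1.0, size=(n_nodes, n_labels))

def pairwise(li, lj, lam=0.3, trunc=1):
    # Truncated linear smoothness cost (truncation caps the label-jump penalty).
    return lam * min(abs(li - lj), trunc)

# One forward sweep: msg[i, l] is the message node i-1 sends to node i.
msg = np.zeros((n_nodes, n_labels))
for i in range(1, n_nodes):
    for lj in range(n_labels):
        msg[i, lj] = min(
            unary[i - 1, li] + msg[i - 1, li] + pairwise(li, lj)
            for li in range(n_labels)
        )

# Beliefs combine local costs with incoming messages; argmin gives labels.
belief = unary + msg
labeling = belief.argmin(axis=1)
```

In dense prediction tasks like stereo or optical flow, the same update runs along image rows and columns, with the unary costs produced by a CNN.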
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.