Bayesian neural networks via MCMC: a Python-based tutorial
- URL: http://arxiv.org/abs/2304.02595v3
- Date: Mon, 26 Aug 2024 11:35:52 GMT
- Title: Bayesian neural networks via MCMC: a Python-based tutorial
- Authors: Rohitash Chandra, Joshua Simmons
- Abstract summary: Variational inference and Markov Chain Monte-Carlo sampling methods are used to implement Bayesian inference.
This tutorial provides code in Python with data and instructions that enable their use and extension.
- Score: 0.196629787330046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov Chain Monte-Carlo (MCMC) sampling methods are used to implement Bayesian inference. In the past three decades, MCMC sampling methods have faced some challenges in being adapted to larger models (such as in deep learning) and big data problems. Advanced proposal distributions that incorporate gradients, such as a Langevin proposal distribution, provide a means to address some of the limitations of MCMC sampling for Bayesian neural networks. Furthermore, MCMC methods have typically been constrained to statisticians and are currently not well known among deep learning researchers. We present a tutorial for MCMC methods that covers simple Bayesian linear and logistic models, and Bayesian neural networks. The aim of this tutorial is to bridge the gap between theory and implementation via coding, given the general scarcity of libraries and tutorials to this end. This tutorial provides code in Python with data and instructions that enable their use and extension. We provide results for some benchmark problems showing the strengths and weaknesses of implementing the respective Bayesian models via MCMC. We highlight the challenges in sampling multi-modal posterior distributions for the case of Bayesian neural networks and the need for further improvement of convergence diagnosis methods.
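As a hedged illustration of the kind of sampler the tutorial covers, and not the paper's own code, the following sketches random-walk Metropolis-Hastings for a Bayesian linear model. The synthetic data, noise level, step size, and prior scale are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data (illustrative only).
X = rng.normal(size=(100, 2))
true_w = np.array([1.5, -0.8])
y = X @ true_w + rng.normal(scale=0.5, size=100)

def log_posterior(w, sigma=0.5, prior_scale=10.0):
    """Gaussian likelihood plus a zero-mean Gaussian prior on the weights."""
    log_lik = -0.5 * np.sum((y - X @ w) ** 2) / sigma**2
    log_prior = -0.5 * np.sum(w**2) / prior_scale**2
    return log_lik + log_prior

def metropolis(n_samples=5000, step=0.1):
    """Random-walk Metropolis-Hastings over the weight vector."""
    w = np.zeros(X.shape[1])
    lp = log_posterior(w)
    samples = []
    for _ in range(n_samples):
        w_prop = w + rng.normal(scale=step, size=w.shape)
        lp_prop = log_posterior(w_prop)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            w, lp = w_prop, lp_prop
        samples.append(w.copy())
    return np.array(samples)

samples = metropolis()
posterior_mean = samples[2000:].mean(axis=0)  # discard burn-in, then average
print(posterior_mean)
```

A Langevin-gradient variant, as discussed in the abstract, would replace the symmetric random-walk proposal with one centred on a gradient step of the log-posterior, with a corresponding correction in the acceptance ratio.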
Related papers
- Amortized Bayesian Multilevel Models [9.831471158899644]
Multilevel models (MLMs) are a central building block of the Bayesian workflow.
MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints.
Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks.
We explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets.
arXiv Detail & Related papers (2024-08-23T17:11:04Z)
- A Rate-Distortion View of Uncertainty Quantification [36.85921945174863]
In supervised learning, understanding an input's proximity to the training data can help a model decide whether it has sufficient evidence for reaching a reliable prediction.
We introduce Distance Aware Bottleneck (DAB), a new method for enriching deep neural networks with this property.
arXiv Detail & Related papers (2024-06-16T01:33:22Z)
- Knowledge Removal in Sampling-based Bayesian Inference [86.14397783398711]
When individual data deletion requests arrive, companies may need to delete entire models learned with massive resources.
Existing works propose methods to remove knowledge learned from data for explicitly parameterized models.
In this paper, we propose the first machine unlearning algorithm for MCMC.
arXiv Detail & Related papers (2022-03-24T10:03:01Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to enable deep neural networks with the ability to learn the sample relationships from each mini-batch.
BatchFormer is applied to the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- Bayesian Structure Learning with Generative Flow Networks [85.84396514570373]
In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) from data.
Recently, a class of probabilistic models, called Generative Flow Networks (GFlowNets), have been introduced as a general framework for generative modeling.
We show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs.
arXiv Detail & Related papers (2022-02-28T15:53:10Z)
- Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for Bayesian Deep Learning [15.521736934292354]
We investigate the potential of sparse network structures to flexibly trade-off storage costs and inference run time.
We show that certain classes of randomly selected substructures can perform as well as substructures derived from state-of-the-art iterative pruning methods.
arXiv Detail & Related papers (2022-02-08T10:34:05Z)
- Learning to Learn to Demodulate with Uncertainty Quantification via Bayesian Meta-Learning [59.014197664747165]
We introduce the use of Bayesian meta-learning via variational inference for the purpose of obtaining well-calibrated few-pilot demodulators.
The resulting Bayesian ensembles offer better calibrated soft decisions, at the computational cost of running multiple instances of the neural network for demodulation.
arXiv Detail & Related papers (2021-08-02T11:07:46Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Bayesian graph convolutional neural networks via tempered MCMC [0.41998444721319217]
Deep learning models, such as convolutional neural networks, have long been applied to image and multi-media tasks.
More recently, there has been more attention to unstructured data that can be represented via graphs.
These types of data are often found in health and medicine, social networks, and research data repositories.
arXiv Detail & Related papers (2021-04-17T04:03:25Z)
- Novel Deep neural networks for solving Bayesian statistical inverse [0.0]
We introduce a fractional deep neural network based approach for the forward solves within an MCMC routine.
We discuss some approximation error estimates and illustrate the efficiency of our approach via several numerical examples.
arXiv Detail & Related papers (2021-02-08T02:54:46Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.