Bayesian Computation in Deep Learning
- URL: http://arxiv.org/abs/2502.18300v4
- Date: Sun, 19 Oct 2025 13:17:45 GMT
- Title: Bayesian Computation in Deep Learning
- Authors: Wenlong Chen, Bolian Li, Ruqi Zhang, Yingzhen Li
- Abstract summary: We provide an introduction to approximate inference techniques as Bayesian computation methods applied to deep learning models. We review two of the arguably most popular approximate Bayesian computation methods, stochastic gradient Markov chain Monte Carlo (SG-MCMC) and variational inference (VI).
- Score: 38.08608882596241
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Bayesian methods have shown success in deep learning applications. For example, in predictive tasks, Bayesian neural networks leverage Bayesian reasoning about model uncertainty to improve the reliability and uncertainty awareness of deep neural networks. In the generative modeling domain, many widely used deep generative models, such as deep latent variable models, require approximate Bayesian inference to infer their latent variables during training. In this chapter, we provide an introduction to approximate inference techniques as Bayesian computation methods applied to deep learning models, with a focus on Bayesian neural networks and deep generative models. We review two of the arguably most popular approximate Bayesian computation methods, stochastic gradient Markov chain Monte Carlo (SG-MCMC) and variational inference (VI), and explain their unique challenges in posterior inference as well as the solutions when applied to deep learning models.
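As a concrete illustration of the two methods reviewed in the chapter, the following is a minimal sketch (written for this summary, not taken from the chapter) contrasting SG-MCMC and VI on a toy Bayesian logistic regression model. The dataset, step sizes, prior scale, and iteration counts are illustrative assumptions.

```python
# Minimal SG-MCMC (SGLD) vs. mean-field VI sketch on toy Bayesian
# logistic regression. All hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: N points, D features, binary labels from a known weight vector.
N, D = 1000, 5
X = rng.normal(size=(N, D))
w_true = rng.normal(size=D)
y = (rng.uniform(size=N) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def grad_log_posterior(w, xb, yb, n_total, prior_var=1.0):
    """Stochastic gradient of log p(w | data): minibatch likelihood
    gradient rescaled by n_total / batch_size, plus a N(0, prior_var I) prior."""
    p = 1.0 / (1.0 + np.exp(-xb @ w))
    grad_lik = xb.T @ (yb - p) * (n_total / len(yb))
    return grad_lik - w / prior_var

# --- SG-MCMC (SGLD): w <- w + (eps/2) * grad + N(0, eps I) per step ---
w, eps, batch = np.zeros(D), 1e-3, 100
samples = []
for t in range(5000):
    idx = rng.choice(N, size=batch, replace=False)
    g = grad_log_posterior(w, X[idx], y[idx], N)
    w = w + 0.5 * eps * g + np.sqrt(eps) * rng.normal(size=D)
    if t >= 1000:                           # discard burn-in
        samples.append(w.copy())
print("SGLD posterior mean:", np.round(np.mean(samples, axis=0), 2))

# --- VI: mean-field Gaussian q(w) = N(mu, diag(sigma^2)), trained by
# single-sample reparameterized gradient ascent on the ELBO ---
mu, log_sigma, lr = np.zeros(D), np.full(D, -2.0), 1e-4
for t in range(5000):
    eps_z = rng.normal(size=D)
    sigma = np.exp(log_sigma)
    w_s = mu + sigma * eps_z                # reparameterization trick
    g = grad_log_posterior(w_s, X, y, N)    # full batch, for simplicity
    mu += lr * g                            # pathwise gradient of E_q[log p]
    log_sigma += lr * (g * sigma * eps_z + 1.0)  # +1 from the entropy term
print("VI variational mean: ", np.round(mu, 2))
print("true weights:        ", np.round(w_true, 2))
```

The contrast this sketch is meant to show: SGLD draws correlated approximate posterior samples by injecting Gaussian noise into stochastic gradient steps, while VI fits a single parametric approximation by gradient ascent on the ELBO; both estimates should roughly recover w_true here.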
Related papers
- Amortising Inference and Meta-Learning Priors in Neural Networks [6.79694176040179]
It is not clear how to represent beliefs about a prediction task by prior distributions over model parameters. We introduce a way to perform per-dataset amortised variational inference. This unique model allows us to study the behaviour of Bayesian neural networks under well-specified priors.
arXiv Detail & Related papers (2026-02-09T15:24:07Z) - Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review [59.856222854472605]
This tutorial provides an in-depth guide on inference-time guidance and alignment methods for optimizing downstream reward functions in diffusion models. Practical applications in fields such as biology often require sample generation that maximizes specific metrics. We discuss (1) fine-tuning methods combined with inference-time techniques, (2) inference-time algorithms based on search algorithms such as Monte Carlo tree search, and (3) connections between inference-time algorithms in language models and diffusion models.
arXiv Detail & Related papers (2025-01-16T17:37:35Z) - Flow-based Sampling for Entanglement Entropy and the Machine Learning of Defects [38.18440341418837]
We introduce a novel technique to numerically calculate Rényi entanglement entropies in lattice quantum field theory using generative models.
We describe how flow-based approaches can be combined with the replica trick using a custom neural-network architecture around a lattice defect connecting two replicas.
arXiv Detail & Related papers (2024-10-18T13:51:25Z) - Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z) - Applied Causal Inference Powered by ML and AI [54.88868165814996]
The book presents ideas from classical structural equation models (SEMs) and their modern AI equivalents, directed acyclic graphs (DAGs) and structural causal models (SCMs).
It covers Double/Debiased Machine Learning methods to do inference in such models using modern predictive tools.
arXiv Detail & Related papers (2024-03-04T20:28:28Z) - Bayesian neural networks via MCMC: a Python-based tutorial [0.196629787330046]
Variational inference and Markov chain Monte Carlo (MCMC) sampling methods are used to implement Bayesian inference.
This tutorial provides code in Python with data and instructions that enable their use and extension.
arXiv Detail & Related papers (2023-04-02T02:19:15Z) - Classified as unknown: A novel Bayesian neural network [0.0]
We develop a new efficient Bayesian learning algorithm for fully connected neural networks.
We generalize a previously proposed binary-classification algorithm for a single perceptron to multi-layer perceptrons for multi-class classification.
arXiv Detail & Related papers (2023-01-31T04:27:09Z) - Bayesian Learning for Neural Networks: an algorithmic survey [95.42181254494287]
This self-contained survey engages and introduces readers to the principles and algorithms of Bayesian Learning for Neural Networks.
It provides an introduction to the topic from an accessible, practical-algorithmic perspective.
arXiv Detail & Related papers (2022-11-21T21:36:58Z) - BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks [50.15201777970128]
We propose BayesCap that learns a Bayesian identity mapping for the frozen model, allowing uncertainty estimation.
BayesCap is a memory-efficient method that can be trained on a small fraction of the original dataset.
We show the efficacy of our method on a wide variety of tasks with a diverse set of architectures.
arXiv Detail & Related papers (2022-07-14T12:50:09Z) - Quasi Black-Box Variational Inference with Natural Gradients for Bayesian Learning [84.90242084523565]
We develop an optimization algorithm suitable for Bayesian learning in complex models.
Our approach relies on natural gradient updates within a general black-box framework for efficient training with limited model-specific derivations.
arXiv Detail & Related papers (2022-05-23T18:54:27Z) - Multilevel Bayesian Deep Neural Networks [0.5892638927736115]
We consider inference associated with deep neural networks (DNNs) and in particular, trace-class neural network (TNN) priors.
TNN priors are defined on functions with infinitely many hidden units, and have strongly convergent approximations with finitely many hidden units.
In this paper, we leverage the strong convergence of TNN in order to apply Multilevel Monte Carlo (MLMC) to these models.
arXiv Detail & Related papers (2022-03-24T09:49:27Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Priors in Bayesian Deep Learning: A Review [4.020523898765405]
We highlight the importance of prior choices for Bayesian deep learning.
We present an overview of different priors that have been proposed for (deep) Gaussian processes, variational autoencoders, and Bayesian neural networks.
arXiv Detail & Related papers (2021-05-14T14:53:30Z) - Exploring Bayesian Deep Learning for Urgent Instructor Intervention Need in MOOC Forums [58.221459787471254]
Massive Open Online Courses (MOOCs) have become a popular choice for e-learning thanks to their great flexibility.
Due to large numbers of learners and their diverse backgrounds, it is taxing to offer real-time support.
With the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
This paper is the first to explore Bayesian deep learning on learner text posts, using two methods: Monte Carlo Dropout and Variational Inference.
arXiv Detail & Related papers (2021-04-26T15:12:13Z) - Bayesian graph convolutional neural networks via tempered MCMC [0.41998444721319217]
Deep learning models, such as convolutional neural networks, have long been applied to image and multi-media tasks.
More recently, there has been more attention to unstructured data that can be represented via graphs.
These types of data are often found in health and medicine, social networks, and research data repositories.
arXiv Detail & Related papers (2021-04-17T04:03:25Z) - A Survey on Deep Semi-supervised Learning [51.26862262550445]
We first present a taxonomy for deep semi-supervised learning that categorizes existing methods.
We then offer a detailed comparison of these methods in terms of the type of losses, contributions, and architecture differences.
arXiv Detail & Related papers (2021-02-28T16:22:58Z) - A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z) - On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks [14.540226579203207]
We present two methods to reduce the complexity of Bayesian network (BN) classifiers.
First, we introduce quantization-aware training using the straight-through gradient estimator to quantize the parameters of BNs to a few bits (a minimal straight-through sketch follows this list).
Second, we extend a recently proposed differentiable tree-augmented naive Bayes (TAN) structure learning approach by also considering the model size.
arXiv Detail & Related papers (2020-10-22T14:47:55Z) - Information Theoretic Meta Learning with Gaussian Processes [74.54485310507336]
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck.
By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
arXiv Detail & Related papers (2020-09-07T16:47:30Z) - Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning.
We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly.
The connection to Gaussian processes enables new function-space inference algorithms.
arXiv Detail & Related papers (2020-07-21T17:42:58Z) - Amortized Bayesian Inference for Models of Cognition [0.1529342790344802]
Recent advances in simulation-based inference using specialized neural network architectures circumvent many previous problems of approximate Bayesian computation.
We provide a general introduction to amortized Bayesian parameter estimation and model comparison.
arXiv Detail & Related papers (2020-05-08T08:12:15Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable, and provides parameter-efficient and robust solutions in stereo, optical flow, and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
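For the quantization-aware training mentioned in the resource-efficient Bayesian network classifier entry above, here is a minimal straight-through estimator (STE) sketch. The model (scalar linear regression), bit width, and learning rate are illustrative assumptions, not that paper's implementation: the forward pass uses quantized weights, while the backward pass updates a full-precision "shadow" copy as if quantization were the identity.

```python
# Minimal straight-through estimator (STE) sketch for quantization-aware
# training. Model, bit width, and learning rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

def quantize(w, bits=4, w_max=1.0):
    """Uniformly quantize w onto 2**bits - 1 levels in [-w_max, w_max]."""
    scale = 2 * w_max / (2 ** bits - 1)
    return np.clip(np.round(w / scale) * scale, -w_max, w_max)

# Toy regression data: y = 0.7 * x + noise.
x = rng.normal(size=256)
y = 0.7 * x + 0.05 * rng.normal(size=256)

w, lr = 0.0, 0.1                          # full-precision shadow weight
for step in range(200):
    w_q = quantize(w)                     # forward pass sees quantized weight
    grad = np.mean((w_q * x - y) * x)     # dL/dw_q for 0.5 * squared loss
    w -= lr * grad                        # straight-through: treat dw_q/dw as 1

print("shadow weight:", round(w, 3), "quantized:", round(quantize(w), 3))
```

The shadow weight stays full-precision so that small gradient updates can accumulate across steps; the quantized output settles near the representable level closest to the true slope, possibly alternating between adjacent levels.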
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.