Multilevel Bayesian Deep Neural Networks
- URL: http://arxiv.org/abs/2203.12961v1
- Date: Thu, 24 Mar 2022 09:49:27 GMT
- Title: Multilevel Bayesian Deep Neural Networks
- Authors: Neil K. Chada, Ajay Jasra, Kody J. H. Law, Sumeetpal S. Singh
- Abstract summary: We consider Bayesian inference associated with deep neural networks (DNNs) and, in particular, trace-class neural network (TNN) priors.
For this work we develop multilevel Monte Carlo (MLMC) methods for such models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this article we consider Bayesian inference associated with deep neural
networks (DNNs) and, in particular, trace-class neural network (TNN) priors,
which were proposed by Sell et al. [39]. Such priors were developed as more
robust alternatives to classical architectures in the context of inference
problems. For this work we develop multilevel Monte Carlo (MLMC) methods for
such models. MLMC is a popular variance reduction technique, with particular
applications in Bayesian statistics and uncertainty quantification. We show how
a particular advanced MLMC method introduced in [4] can be applied to Bayesian
inference for DNNs, and we establish mathematically that the computational cost
to achieve a given mean square error in posterior expectation computation can
be reduced by several orders of magnitude relative to more conventional
techniques. To verify these results we provide numerous numerical experiments
on model problems arising in machine learning, including Bayesian regression,
Bayesian classification, and reinforcement learning.
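For context, MLMC rests on a telescoping decomposition of the target expectation over a hierarchy of approximations of increasing accuracy and cost. The identity and estimator below are the generic versions of this idea, not the specific coupled method of [4]; here \(\varphi\) is the quantity of interest and \(X_l\) its level-\(l\) approximation.

```latex
% Generic MLMC telescoping identity over levels l = 0, ..., L:
\mathbb{E}[\varphi(X_L)]
  = \mathbb{E}[\varphi(X_0)]
  + \sum_{l=1}^{L} \mathbb{E}\bigl[\varphi(X_l) - \varphi(X_{l-1})\bigr]

% Estimator with N_l independent samples per level, where each pair
% (X_l, X_{l-1}) at level l is drawn from a coupling:
\widehat{\varphi}_{\mathrm{ML}}
  = \frac{1}{N_0}\sum_{i=1}^{N_0}\varphi\bigl(X_0^{(0,i)}\bigr)
  + \sum_{l=1}^{L}\frac{1}{N_l}\sum_{i=1}^{N_l}
      \Bigl(\varphi\bigl(X_l^{(l,i)}\bigr) - \varphi\bigl(X_{l-1}^{(l,i)}\bigr)\Bigr)
```

Because a good coupling gives the level differences small variance at fine levels, most samples can be drawn at the cheap coarse levels, which is the source of the cost reduction claimed in the abstract.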
Related papers
- Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z)
- Bayesian neural networks via MCMC: a Python-based tutorial [0.17999333451993949]
Variational inference and Markov chain Monte Carlo (MCMC) sampling methods are used to implement Bayesian inference.
This tutorial provides code in Python with data and instructions that enable their use and extension.
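To give a flavor of what such tutorials implement, here is a minimal, self-contained random-walk Metropolis sampler for a tiny one-hidden-layer Bayesian network on toy data. This is an illustrative sketch, not the tutorial's own code; the architecture, priors, and step sizes are all assumptions chosen for brevity.

```python
# Minimal random-walk Metropolis sampler for a toy Bayesian neural
# network (illustrative sketch only; all hyperparameters are assumed).
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data
X = np.linspace(-1.0, 1.0, 40)[:, None]
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(40)

H = 8                        # hidden units
dim = H + H + H + 1          # W1 (1xH), b1 (H), w2 (H), b2 (scalar)

def predict(theta, X):
    W1 = theta[:H].reshape(1, H)
    b1, w2, b2 = theta[H:2 * H], theta[2 * H:3 * H], theta[3 * H]
    return np.tanh(X @ W1 + b1) @ w2 + b2

def log_post(theta, sigma=0.1, tau=1.0):
    # Gaussian likelihood plus isotropic Gaussian prior on all weights
    resid = y - predict(theta, X)
    return (-0.5 * np.sum(resid ** 2) / sigma ** 2
            - 0.5 * np.sum(theta ** 2) / tau ** 2)

theta = rng.standard_normal(dim)
lp = log_post(theta)
samples = []
for it in range(20000):
    prop = theta + 0.05 * rng.standard_normal(dim)   # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:          # Metropolis accept step
        theta, lp = prop, lp_prop
    if it % 20 == 0:                                 # thin the chain
        samples.append(theta.copy())

post_mean = np.mean([predict(t, X) for t in samples[500:]], axis=0)
print("posterior-mean train RMSE:", np.sqrt(np.mean((post_mean - y) ** 2)))
```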
arXiv Detail & Related papers (2023-04-02T02:19:15Z)
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM)
Experiments on two tasks suggest the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full precision baseline RNNLMs.
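To illustrate the alternating structure that ADMM-based quantization relies on, here is a hedged sketch on a toy least-squares model rather than an RNN language model: the float weights, their quantized copy, and a scaled dual variable are updated in turn. The quantization grid, penalty, and step size are assumptions for illustration only.

```python
# Toy ADMM loop for weight quantization (illustrative sketch; the
# paper applies this scheme to RNN language models, not least squares).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = X @ rng.standard_normal(10) + 0.01 * rng.standard_normal(200)

levels = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # assumed low-bit grid

def project(v):
    # Elementwise projection onto the nearest quantization level
    return levels[np.argmin(np.abs(v[:, None] - levels[None, :]), axis=1)]

w = np.zeros(10)     # full-precision weights
q = project(w)       # quantized copy
u = np.zeros(10)     # scaled dual variable
rho, lr = 1.0, 1e-3

for _ in range(2000):
    # 1) w-step: gradient step on data loss + augmented Lagrangian term
    grad = X.T @ (X @ w - y) / len(y) + rho * (w - q + u)
    w -= lr * grad
    # 2) q-step: project the shifted weights onto the grid
    q = project(w + u)
    # 3) dual ascent on the consensus constraint w = q
    u += w - q

print("quantized weights:", q)
```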
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo (HMC) can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Novel Deep neural networks for solving Bayesian statistical inverse problems [0.0]
We introduce a fractional deep neural network based approach for the forward solves within an MCMC routine.
We discuss some approximation error estimates and illustrate the efficiency of our approach via several numerical examples.
arXiv Detail & Related papers (2021-02-08T02:54:46Z)
- Variational Bayes Neural Network: Posterior Consistency, Classification Accuracy and Computational Challenges [0.3867363075280544]
This paper develops a variational Bayesian neural network estimation methodology and related statistical theory.
The development is motivated by an important biomedical engineering application, namely building predictive tools for the transition from mild cognitive impairment to Alzheimer's disease.
arXiv Detail & Related papers (2020-11-19T00:11:27Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN)
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
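Concretely, the linearization described above replaces the network \(f(x, \theta)\) by its first-order expansion around a point estimate \(\theta_*\) (e.g. the MAP), and prediction then integrates this linear-in-\(\theta\) model against the approximate posterior. The notation below is generic rather than the paper's exact formulation.

```latex
% First-order expansion of the network around \theta_*:
f_{\mathrm{lin}}(x, \theta)
  = f(x, \theta_*) + J_{\theta_*}(x)\,(\theta - \theta_*),
\qquad
J_{\theta_*}(x) = \nabla_\theta f(x, \theta)\big|_{\theta = \theta_*}

% GLM predictive: predict with the linearized model, not f itself,
% under the posterior approximation q(\theta) (e.g. a Laplace fit):
p(y \mid x, \mathcal{D})
  \approx \int p\bigl(y \mid f_{\mathrm{lin}}(x, \theta)\bigr)\, q(\theta)\, d\theta
```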
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Amortized Bayesian Inference for Models of Cognition [0.1529342790344802]
Recent advances in simulation-based inference using specialized neural network architectures circumvent many previous problems of approximate Bayesian computation.
We provide a general introduction to amortized Bayesian parameter estimation and model comparison.
arXiv Detail & Related papers (2020-05-08T08:12:15Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
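To make the building block concrete, the sketch below performs one forward max-sum sweep (max-product in log space) along a 1-D chain of labels, the kind of truncated message-passing update a BP-Layer wraps and backpropagates through. The unary and pairwise scores are assumed toy values, not the paper's learned ones.

```python
# One forward max-sum (log-domain max-product) sweep on a label chain.
# Illustrative sketch; a BP-Layer also runs backward sweeps and is
# truncated to a fixed iteration count on loopy 2-D graphs.
import numpy as np

L, K = 6, 4                                # chain length, label count
rng = np.random.default_rng(0)
unary = rng.standard_normal((L, K))        # per-node label scores
pairwise = -np.abs(np.arange(K)[:, None]   # smoothness: penalize jumps
                   - np.arange(K)[None, :]).astype(float)

msg = np.zeros((L, K))                     # messages m_{i-1 -> i}
for i in range(1, L):
    # Max over the previous node's label of (accumulated score + pairwise)
    scores = unary[i - 1] + msg[i - 1]
    msg[i] = np.max(scores[:, None] + pairwise, axis=0)

labels = np.argmax(unary + msg, axis=1)    # forward-pass labeling
print("labels:", labels)
```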
arXiv Detail & Related papers (2020-03-13T13:11:35Z)