Multilevel Bayesian Deep Neural Networks
- URL: http://arxiv.org/abs/2203.12961v4
- Date: Tue, 29 Oct 2024 08:31:24 GMT
- Title: Multilevel Bayesian Deep Neural Networks
- Authors: Neil K. Chada, Ajay Jasra, Kody J. H. Law, Sumeetpal S. Singh
- Abstract summary: We consider inference associated with deep neural networks (DNNs) and, in particular, trace-class neural network (TNN) priors.
TNN priors are defined on functions with infinitely many hidden units, and have strongly convergent approximations with finitely many hidden units.
In this paper, we leverage the strong convergence of TNN in order to apply Multilevel Monte Carlo (MLMC) to these models.
- Abstract: In this article we consider Bayesian inference associated with deep neural networks (DNNs) and, in particular, trace-class neural network (TNN) priors, which can be preferable to traditional DNN priors because (a) they are identifiable and (b) they possess desirable convergence properties. TNN priors are defined on functions with infinitely many hidden units and have strongly convergent Karhunen-Loève-type approximations with finitely many hidden units. A practical hurdle is that the Bayesian solution is computationally demanding and requires simulation methods, so approaches that drive down the complexity are needed. In this paper, we leverage the strong convergence of TNNs in order to apply Multilevel Monte Carlo (MLMC) to these models. In particular, a previously introduced MLMC method is used to approximate posterior expectations of Bayesian TNN models at optimal computational complexity, and this optimality is proved mathematically. The results are verified with several numerical experiments on model problems arising in machine learning, including regression, classification, and reinforcement learning.
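To make the multilevel idea concrete, the minimal sketch below estimates the expectation of a functional of a truncated random series, where level l keeps 2^l terms with decaying coefficients, loosely mimicking the truncated expansions used for TNN priors. It illustrates only the MLMC telescoping estimator over coupled levels, not the paper's posterior-inference algorithm; the decay rate, the functional g, and the per-level sample sizes are placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def coupled_series(level, n_samples, decay=1.6):
    """Draw coupled fine/coarse truncations of a random series
    S_level = sum_{j=1..2^level} xi_j * j^(-decay), xi_j ~ N(0, 1).
    The coarse truncation reuses the first 2^(level-1) terms, which is
    what makes the level differences small (strong convergence)."""
    n_fine = 2 ** level
    xi = rng.standard_normal((n_samples, n_fine))
    coeff = np.arange(1, n_fine + 1, dtype=float) ** (-decay)
    s_fine = (xi * coeff).sum(axis=1)
    if level == 0:
        return s_fine, None
    s_coarse = (xi[:, : n_fine // 2] * coeff[: n_fine // 2]).sum(axis=1)
    return s_fine, s_coarse

def g(s):
    """Quantity of interest; any smooth functional works for the sketch."""
    return np.tanh(s)

def mlmc_estimate(max_level=6, base_samples=200_000):
    """Telescoping MLMC estimator:
    E[g(S_L)] ~= E[g(S_0)] + sum_{l=1..L} E[g(S_l) - g(S_{l-1})],
    with geometrically fewer samples spent on the finer levels."""
    total = 0.0
    for level in range(max_level + 1):
        n = max(base_samples // 4 ** level, 100)
        s_fine, s_coarse = coupled_series(level, n)
        if level == 0:
            total += g(s_fine).mean()
        else:
            total += (g(s_fine) - g(s_coarse)).mean()
    return total

if __name__ == "__main__":
    print("MLMC estimate of E[g(S_L)]:", mlmc_estimate())
```

The key point is that the coupled differences g(S_l) - g(S_{l-1}) have small variance, so most of the sampling budget can be spent on the cheap coarse levels while the fine levels only need a few correction samples.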
Related papers
- On Feynman--Kac training of partial Bayesian neural networks [1.6474447977095783]
Partial Bayesian neural networks (pBNNs) were shown to perform competitively with full Bayesian neural networks.
We propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman--Kac model.
We show that our proposed training scheme outperforms the state of the art in terms of predictive performance.
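As a rough illustration of the Feynman--Kac viewpoint (and not the training scheme proposed in that paper), the sketch below runs sequential importance resampling over a single Bayesian parameter of a toy regression model: data arrive in batches, particle weights are multiplied by each batch likelihood, and particles are resampled when the effective sample size drops. The model, prior, and jitter step are placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: y = 2.0 * x + noise; the single Bayesian parameter is the slope w.
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x + 0.3 * rng.standard_normal(200)

n_particles = 2000
particles = rng.standard_normal(n_particles)   # prior w ~ N(0, 1)
log_w = np.zeros(n_particles)                  # log importance weights

def log_likelihood(w, xb, yb, sigma=0.3):
    """Gaussian log-likelihood of one data batch under each particle's slope."""
    resid = yb[None, :] - w[:, None] * xb[None, :]
    return -0.5 * (resid ** 2).sum(axis=1) / sigma ** 2

for start in range(0, len(x), 20):             # stream the data in batches
    xb, yb = x[start:start + 20], y[start:start + 20]
    log_w += log_likelihood(particles, xb, yb) # Feynman-Kac potential update
    w_norm = np.exp(log_w - log_w.max())
    w_norm /= w_norm.sum()
    ess = 1.0 / (w_norm ** 2).sum()            # effective sample size
    if ess < n_particles / 2:                  # resample and jitter particles
        idx = rng.choice(n_particles, size=n_particles, p=w_norm)
        particles = particles[idx] + 0.01 * rng.standard_normal(n_particles)
        log_w[:] = 0.0

w_norm = np.exp(log_w - log_w.max())
w_norm /= w_norm.sum()
print("posterior mean of slope:", float(np.sum(w_norm * particles)))
```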
arXiv Detail & Related papers (2023-10-30T15:03:15Z)
- An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks [13.271286153792058]
Quantized neural networks (QNNs) have been developed, with binarized neural networks (BNNs) restricted to binary values as a special case.
This paper presents an automata-theoretic approach to synthesizing BNNs that meet designated properties.
arXiv Detail & Related papers (2023-07-29T06:27:28Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Satellite Anomaly Detection Using Variance Based Genetic Ensemble of Neural Networks [7.848121055546167]
We use an efficient ensemble of the predictions from multiple Recurrent Neural Networks (RNNs).
For prediction, each RNN is guided by a Genetic Algorithm (GA) which constructs the optimal structure for each RNN model.
This paper uses Monte Carlo (MC) dropout as an approximation to BNNs.
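For context, this is the generic MC-dropout recipe the summary refers to: keep dropout active at prediction time and aggregate several stochastic forward passes, using their spread as a crude uncertainty estimate. The tiny fixed-weight network below is only a stand-in, not the paper's RNN ensemble.

```python
import numpy as np

rng = np.random.default_rng(2)

# A tiny fixed-weight MLP standing in for a trained network.
W1, b1 = rng.standard_normal((16, 4)), np.zeros(16)
W2, b2 = rng.standard_normal((1, 16)), np.zeros(1)

def forward(x, drop_p=0.2, dropout_on=True):
    """One stochastic forward pass; dropout stays on at prediction time."""
    h = np.maximum(W1 @ x + b1, 0.0)            # ReLU hidden layer
    if dropout_on:
        mask = rng.random(h.shape) > drop_p     # Bernoulli dropout mask
        h = h * mask / (1.0 - drop_p)           # inverted-dropout scaling
    return (W2 @ h + b2)[0]

def mc_dropout_predict(x, n_passes=100):
    """Monte Carlo dropout: mean prediction plus spread as uncertainty."""
    samples = np.array([forward(x) for _ in range(n_passes)])
    return samples.mean(), samples.std()

mean, std = mc_dropout_predict(np.array([0.5, -1.0, 0.3, 2.0]))
print(f"prediction: {mean:.3f} +/- {std:.3f}")
```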
arXiv Detail & Related papers (2023-02-10T22:09:00Z)
- Neural Additive Models for Location Scale and Shape: A Framework for Interpretable Neural Regression Beyond the Mean [1.0923877073891446]
Deep neural networks (DNNs) have proven to be highly effective in a variety of tasks.
Despite this success, the inner workings of DNNs are often not transparent.
This lack of interpretability has led to increased research on inherently interpretable neural networks.
arXiv Detail & Related papers (2023-01-27T17:06:13Z)
- Physics-Informed Machine Learning of Dynamical Systems for Efficient Bayesian Inference [0.0]
The No-U-Turn Sampler (NUTS) is a widely adopted method for performing Bayesian inference.
Hamiltonian neural networks (HNNs) are a noteworthy architecture.
We propose the use of HNNs for performing Bayesian inference efficiently without requiring numerous posterior gradients.
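For reference, the sketch below shows a plain Hamiltonian Monte Carlo step with leapfrog integration on a toy Gaussian target; the repeated gradient evaluations inside the leapfrog loop are exactly what an HNN surrogate is meant to replace. This is not the paper's method, and the target, step size, and trajectory length are placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def neg_log_post(q):
    """Toy target: standard 2-D Gaussian posterior (up to a constant)."""
    return 0.5 * np.dot(q, q)

def grad_neg_log_post(q):
    """Gradient of the negative log posterior; an HNN would learn a
    surrogate Hamiltonian so these calls become unnecessary."""
    return q

def hmc_step(q, step=0.1, n_leapfrog=20):
    p = rng.standard_normal(q.shape)               # resample momentum
    q_new, p_new = q.copy(), p.copy()
    p_new -= 0.5 * step * grad_neg_log_post(q_new) # half step in momentum
    for _ in range(n_leapfrog - 1):
        q_new += step * p_new                      # full step in position
        p_new -= step * grad_neg_log_post(q_new)
    q_new += step * p_new
    p_new -= 0.5 * step * grad_neg_log_post(q_new)
    h_old = neg_log_post(q) + 0.5 * np.dot(p, p)
    h_new = neg_log_post(q_new) + 0.5 * np.dot(p_new, p_new)
    if np.log(rng.random()) < h_old - h_new:       # Metropolis correction
        return q_new
    return q

samples, q = [], np.zeros(2)
for _ in range(1000):
    q = hmc_step(q)
    samples.append(q)
print("sample mean:", np.mean(samples, axis=0))
```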
arXiv Detail & Related papers (2022-09-19T21:17:23Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
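To ground the comparison, the sketch below shows textbook interval bound propagation through an affine layer followed by a ReLU, i.e. the kind of bound such reachability analyses are compared against. Layer sizes and weights are arbitrary, and this is not the paper's reachability algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

def affine_interval(low, high, W, b):
    """Propagate an axis-aligned box through y = W x + b exactly."""
    center = 0.5 * (low + high)
    radius = 0.5 * (high - low)
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius
    return new_center - new_radius, new_center + new_radius

def relu_interval(low, high):
    """ReLU is monotone, so the interval maps elementwise."""
    return np.maximum(low, 0.0), np.maximum(high, 0.0)

# A toy two-layer feedforward network with a box input perturbation.
W1, b1 = rng.standard_normal((8, 3)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((2, 8)), rng.standard_normal(2)

x = np.array([0.2, -0.5, 1.0])
eps = 0.1
low, high = x - eps, x + eps

low, high = relu_interval(*affine_interval(low, high, W1, b1))
low, high = affine_interval(low, high, W2, b2)
print("output lower bounds:", low)
print("output upper bounds:", high)
```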
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
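As a toy illustration of the adversarial (min-max) formulation, with scalar linear "players" standing in for the neural networks and not reproducing the paper's estimator, the sketch below recovers a structural slope from an instrument-style moment condition by running simultaneous gradient descent-ascent on a regularized Lagrangian.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy structural model with an endogenous regressor:
#   Z is an instrument, e is a confounder, X = Z + e, Y = 2.0 * X + e.
n = 5000
Z = rng.standard_normal(n)
e = rng.standard_normal(n)
X = Z + e
Y = 2.0 * X + e

def moment(theta):
    """Instrument moment m(theta) = E[Z (Y - theta X)], zero at the truth."""
    return np.mean(Z * (Y - theta * X))

# Min-max Lagrangian: min_theta max_w  w * m(theta) - 0.5 * w**2.
theta, w, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    grad_theta = -w * np.mean(Z * X)   # d/d theta of w * m(theta)
    grad_w = moment(theta) - w         # d/d w of the Lagrangian
    theta -= lr * grad_theta           # descent step for the estimator
    w += lr * grad_w                   # ascent step for the adversary

print("estimated structural slope:", theta)  # plain OLS on (X, Y) would be biased
```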
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
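A minimal sketch, not the paper's BP-Layer, of max-product belief propagation in the negative-log (min-sum) domain on a chain of label variables: one forward and one backward sweep of messages, the kind of fixed, truncated message-passing schedule such a layer unrolls inside a network. The costs and chain length below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)

n_nodes, n_labels = 6, 4
unary = rng.random((n_nodes, n_labels))      # per-node label costs
pairwise = 0.5 * (1.0 - np.eye(n_labels))    # Potts smoothness cost

def min_sum_chain(unary, pairwise):
    """Max-product BP in the negative-log (min-sum) domain on a chain.
    One forward and one backward sweep is exact on a chain; on loopy
    graphs a BP-Layer would truncate to a fixed number of sweeps."""
    n, k = unary.shape
    fwd = np.zeros((n, k))                   # messages arriving from the left
    bwd = np.zeros((n, k))                   # messages arriving from the right
    for i in range(1, n):
        msg = (unary[i - 1] + fwd[i - 1])[:, None] + pairwise
        fwd[i] = msg.min(axis=0)
    for i in range(n - 2, -1, -1):
        msg = (unary[i + 1] + bwd[i + 1])[None, :] + pairwise
        bwd[i] = msg.min(axis=1)
    beliefs = unary + fwd + bwd              # min-marginal costs per node
    return beliefs, beliefs.argmin(axis=1)

beliefs, labels = min_sum_chain(unary, pairwise)
print("MAP labels along the chain:", labels)
```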
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.