Masked Bayesian Neural Networks : Theoretical Guarantee and its
Posterior Inference
- URL: http://arxiv.org/abs/2305.14765v1
- Date: Wed, 24 May 2023 06:16:11 GMT
- Title: Masked Bayesian Neural Networks : Theoretical Guarantee and its
Posterior Inference
- Authors: Insung Kong, Dongyoon Yang, Jongjin Lee, Ilsang Ohn, Gyuseung Baek,
Yongdai Kim
- Abstract summary: We propose a new node-sparse BNN model which has good theoretical properties and is computationally feasible.
We prove that the posterior concentration rate to the true model is near minimax optimal and adaptive to the smoothness of the true model.
In addition, we develop a novel MCMC algorithm which makes the Bayesian inference of the node-sparse BNN model feasible in practice.
- Score: 1.2722697496405464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian approaches for learning deep neural networks (BNNs) have
received much attention and have been successfully applied to various problems.
In particular, BNNs have the merit of better generalization ability as well as
better uncertainty quantification. For a BNN to succeed, searching for an
appropriate architecture of the neural network is an important task, and
various algorithms for finding good sparse neural networks have been proposed.
In this paper, we propose a new node-sparse BNN model which has good theoretical
properties and is computationally feasible. We prove that the posterior
concentration rate to the true model is near minimax optimal and adaptive to
the smoothness of the true model; in particular, the adaptiveness is the first
of its kind for node-sparse BNNs. In addition, we develop a novel MCMC
algorithm which makes Bayesian inference of the node-sparse BNN model
feasible in practice.
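For context (not stated in the paper's abstract): for a beta-Hölder smooth true function on [0,1]^d with n observations, the minimax estimation rate is of order n^{-beta/(2beta+d)}, and "near minimax optimal" posterior concentration results for sparse BNNs typically match this rate up to logarithmic factors. The display below is a hedged illustration of that standard form; the symbols epsilon_n, kappa, M, and D_n are notational assumptions, not taken from the paper.

```latex
% Hedged illustration of a typical near-minimax posterior concentration statement
% (standard form from the sparse-BNN literature, not quoted from this paper):
% f_0 is the beta-Hoelder smooth true function on [0,1]^d, D_n the n observations,
% and \Pi(\cdot \mid D_n) the posterior over network functions f.
\[
  \varepsilon_n \asymp n^{-\beta/(2\beta+d)} (\log n)^{\kappa},
  \qquad
  \Pi\!\left( \|f - f_0\|_2 > M\,\varepsilon_n \,\middle|\, D_n \right)
  \;\longrightarrow\; 0 \quad \text{in probability},
\]
% for some constants \kappa > 0 and M > 0. "Adaptive to the smoothness" means
% the prior achieves this rate without knowing \beta in advance.
```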
Related papers
- NAS-BNN: Neural Architecture Search for Binary Neural Networks [55.058512316210056]
We propose a novel neural architecture search scheme for binary neural networks, named NAS-BNN.
Our discovered binary model family outperforms previous BNNs for a wide range of operations (OPs) from 20M to 200M.
In addition, we validate the transferability of these searched BNNs on the object detection task, and our binary detectors with the searched BNNs achieve a new state-of-the-art result, e.g., 31.6% mAP with 370M OPs, on the MS COCO dataset.
arXiv Detail & Related papers (2024-08-28T02:17:58Z) - Bayesian Neural Networks with Domain Knowledge Priors [52.80929437592308]
We propose a framework for integrating general forms of domain knowledge into a BNN prior.
We show that BNNs using our proposed domain knowledge priors outperform those with standard priors.
arXiv Detail & Related papers (2024-02-20T22:34:53Z) - Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty
from Pre-trained Models [40.38541033389344]
Deep Neural Networks (DNNs) are powerful tools for various computer vision tasks, yet they often struggle with reliable uncertainty quantification.
We introduce the Adaptable Bayesian Neural Network (ABNN), a simple and scalable strategy to seamlessly transform DNNs into BNNs.
We conduct extensive experiments across multiple datasets for image classification and semantic segmentation tasks, and our results demonstrate that ABNN achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-12-23T16:39:24Z) - An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks [13.271286153792058]
Quantized neural networks (QNNs) have been developed, with binarized neural networks (BNNs) restricted to binary values as a special case.
This paper presents an automata-theoretic approach to synthesizing BNNs that meet designated properties.
arXiv Detail & Related papers (2023-07-29T06:27:28Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNN optimization methods neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Masked Bayesian Neural Networks : Computation and Optimality [1.3649494534428745]
We propose a novel sparse Bayesian neural network (BNN) which searches for a good deep neural network with appropriate complexity.
We employ masking variables at each node, which can turn off some nodes according to the posterior distribution, yielding a node-wise sparse DNN (a minimal sketch of this masking idea appears after this list).
By analyzing several benchmark datasets, we illustrate that the proposed BNN performs well compared to other existing methods.
arXiv Detail & Related papers (2022-06-02T02:59:55Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs); a minimal sketch of the underlying interval propagation idea appears after this list.
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - A Mixed Integer Programming Approach for Verifying Properties of
Binarized Neural Networks [44.44006029119672]
We propose a mixed integer programming formulation for BNN verification.
We demonstrate our approach by verifying properties of BNNs trained on the MNIST dataset and an aircraft collision avoidance controller.
arXiv Detail & Related papers (2022-03-11T01:11:29Z) - Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z) - S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural
Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks based on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.5 to 15% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z) - Encoding the latent posterior of Bayesian Neural Networks for
uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
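To make the node-masking idea in "Masked Bayesian Neural Networks: Computation and Optimality" (and in the main paper above) concrete, here is a minimal, hedged Python sketch of a dense layer with per-node Bernoulli masking variables. It is not the authors' implementation; the class name MaskedDense, the keep_prob parameter, and the prior-sampling shortcut are illustrative assumptions (in a full Bayesian treatment the masks would be sampled from their posterior, e.g. within the paper's MCMC algorithm).

```python
# Hedged sketch (not the authors' code): a dense layer with per-node Bernoulli
# masking variables, illustrating how node-wise sparsity can be expressed.
import numpy as np

rng = np.random.default_rng(0)

class MaskedDense:
    def __init__(self, in_dim, out_dim, keep_prob=0.5):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, out_dim))
        self.b = np.zeros(out_dim)
        # One binary masking variable per output node. In a Bayesian treatment
        # these would be drawn from their posterior; here we simply sample them
        # from a Bernoulli prior for illustration.
        self.mask = rng.binomial(1, keep_prob, size=out_dim)

    def __call__(self, x):
        h = np.maximum(x @ self.W + self.b, 0.0)  # ReLU activation
        return h * self.mask                      # switched-off nodes output 0

# Usage: a two-layer node-sparse network on a toy input batch.
layer1 = MaskedDense(4, 8)
layer2 = MaskedDense(8, 2)
x = rng.normal(size=(3, 4))
print(layer2(layer1(x)).shape)  # (3, 2); columns of masked nodes are zero
```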
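Similarly, for the interval reachability entry above, the sketch below shows the generic interval bound propagation step through a single affine + ReLU layer. This is the standard technique that the entry compares against, not the paper's reachability analysis for implicit networks; the function name affine_relu_bounds is a hypothetical choice.

```python
# Hedged sketch: naive interval bound propagation through an affine + ReLU layer.
import numpy as np

def affine_relu_bounds(W, b, lower, upper):
    """Propagate elementwise input bounds [lower, upper] through ReLU(W x + b)."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    out_lower = W_pos @ lower + W_neg @ upper + b  # smallest possible pre-activation
    out_upper = W_pos @ upper + W_neg @ lower + b  # largest possible pre-activation
    # ReLU is monotone, so applying it to the bounds preserves soundness.
    return np.maximum(out_lower, 0.0), np.maximum(out_upper, 0.0)

W = np.array([[1.0, -2.0], [0.5, 0.3]])
b = np.array([0.1, -0.2])
lo, hi = affine_relu_bounds(W, b, np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
print(lo, hi)  # guaranteed output bounds for every input in the box
```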
This list is automatically generated from the titles and abstracts of the papers on this site.