Related papers: Efficient Computation Reduction in Bayesian Neural Networks Through Feature Decomposition and Memorization

URL: http://arxiv.org/abs/2005.03857v1
Date: Fri, 8 May 2020 05:03:04 GMT
Title: Efficient Computation Reduction in Bayesian Neural Networks Through Feature Decomposition and Memorization
Authors: Xiaotao Jia, Jianlei Yang, Runze Liu, Xueyan Wang, Sorin Dan Cotofana, Weisheng Zhao
Abstract summary: In this paper, an efficient BNN inference flow is proposed to reduce the computation cost. About half of the computations could be eliminated compared to the traditional approach. We implement our approach in Verilog and synthesise it with 45 nm FreePDK technology.
Score: 10.182119276564643
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Bayesian methods can capture real-world uncertainty and incompleteness and properly address the over-fitting issue faced by deep neural networks. In recent years, Bayesian Neural Networks (BNNs) have drawn tremendous attention from AI researchers and have proved successful in many applications. However, their high computational complexity makes BNNs difficult to deploy in computing systems with a limited power budget. In this paper, an efficient BNN inference flow is proposed to reduce the computation cost and is then evaluated by means of both software and hardware implementations. A feature decomposition and memorization (DM) strategy is utilized to reformulate the BNN inference flow in a reduced manner. About half of the computations can be eliminated compared to the traditional approach, as shown by theoretical analysis and software validation. Subsequently, to address hardware resource limitations, a memory-friendly computing framework is deployed to reduce the memory overhead introduced by the DM strategy. Finally, we implement our approach in Verilog and synthesise it with 45 nm FreePDK technology. Hardware simulation results on multi-layer BNNs demonstrate that, compared with the traditional BNN inference method, it provides an energy consumption reduction of 73% and a 4× speedup at the expense of 14% area overhead.
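The abstract does not spell out the DM strategy, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes Gaussian weights sampled as w = mu + sigma * eps (an assumption not stated above), and the names `naive_layer` and `dm_layer` are hypothetical. Under that assumption, the input-dependent terms `mu @ x` and `sigma * x` can be computed once and reused for every weight sample, which leaves roughly one multiplication per weight per sample instead of two.

```python
import numpy as np

def naive_layer(x, mu, sigma, eps):
    """Baseline: build each weight sample, then do a full matrix-vector product.

    x: (n_in,), mu and sigma: (n_out, n_in), eps: (n_samples, n_out, n_in).
    Cost per sample: about 2 * n_out * n_in multiplications.
    """
    return np.stack([(mu + sigma * e) @ x for e in eps])

def dm_layer(x, mu, sigma, eps):
    """Decomposed variant (assumed reading of DM): precompute and reuse features.

    (mu + sigma * eps) @ x == mu @ x + sum_j(sigma[:, j] * x[j] * eps[:, j])
    """
    mean_term = mu @ x      # computed once, shape (n_out,)
    scaled = sigma * x      # computed once, entries sigma[i, j] * x[j]
    # Per sample: one elementwise product with eps, summed over the inputs.
    return mean_term + np.einsum("soi,oi->so", eps, scaled)

# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
mu = rng.normal(size=(4, 8))
sigma = rng.uniform(0.1, 0.5, size=(4, 8))
eps = rng.normal(size=(16, 4, 8))
outputs = dm_layer(x, mu, sigma, eps)  # one output row per weight sample
```

Both functions return the same values for the same `eps`; the difference is only in how much work is repeated per sample. The reuse applies directly only where the layer input is shared across samples, which is the case for the first layer in this sketch.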

Related papers

GhostRNN: Reducing State Redundancy in RNN with Cheap Operations [66.14054138609355]
We propose an efficient RNN architecture, GhostRNN, which reduces hidden state redundancy with cheap operations. Experiments on KWS and SE tasks demonstrate that the proposed GhostRNN significantly reduces the memory usage (40%) and computation cost while keeping performance similar.
arXiv Detail & Related papers (2024-11-20T11:37:14Z)
Optimization Proxies using Limited Labeled Data and Training Time -- A Semi-Supervised Bayesian Neural Network Approach [3.26805553822503]
Constrained optimization problems arise in various engineering systems such as inventory management and power grids. Standard deep neural network (DNN) based machine learning proxies are ineffective in practical settings where labeled data is scarce and training times are limited.
arXiv Detail & Related papers (2024-10-04T02:10:20Z)
ZOBNN: Zero-Overhead Dependable Design of Binary Neural Networks with Deliberately Quantized Parameters [0.0]
In this paper, we introduce a third advantage of very low-precision neural networks: improved fault-tolerance. We investigate the impact of memory faults on state-of-the-art binary neural networks (BNNs) through comprehensive analysis. We propose a technique to improve BNN dependability by restricting the range of float parameters through a novel deliberately uniform quantization.
arXiv Detail & Related papers (2024-07-06T05:31:11Z)
An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks [13.271286153792058]
Quantized neural networks (QNNs) have been developed, with binarized neural networks (BNNs) restricted to binary values as a special case. This paper presents an automata-theoretic approach to synthesizing BNNs that meet designated properties.
arXiv Detail & Related papers (2023-07-29T06:27:28Z)
QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks [14.766917269393865]
Quantization has emerged as a promising technique to reduce the size of neural networks with comparable accuracy as their floating-point numbered counterparts. We propose a novel and efficient formal verification approach for QNNs. In particular, we are the first to propose an encoding that reduces the verification problem of QNNs into the solving of integer linear constraints.
arXiv Detail & Related papers (2022-12-10T03:00:29Z)
Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs). Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization. We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors. Our work is the first attempt to optimize BNNs from the bilinear perspective. We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
Low-bit Shift Network for End-to-End Spoken Language Understanding [7.851607739211987]
We propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity by removing expensive multiplication operations and with the use of low-bit weights.
arXiv Detail & Related papers (2022-07-15T14:34:22Z)
Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs). INNs are a class of implicit learning models that use implicit equations as layers. We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks. We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
Improved Branch and Bound for Neural Network Verification via Lagrangian Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks. We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB). BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z)
Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.