Efficient Computation Reduction in Bayesian Neural Networks Through
Feature Decomposition and Memorization
- URL: http://arxiv.org/abs/2005.03857v1
- Date: Fri, 8 May 2020 05:03:04 GMT
- Title: Efficient Computation Reduction in Bayesian Neural Networks Through
Feature Decomposition and Memorization
- Authors: Xiaotao Jia, Jianlei Yang, Runze Liu, Xueyan Wang, Sorin Dan Cotofana,
Weisheng Zhao
- Abstract summary: In this paper, an efficient BNN inference flow is proposed to reduce the computation cost.
About half of the computations could be eliminated compared to the traditional approach.
We implement our approach in Verilog and synthesise it with 45 $nm$ FreePDK technology.
- Score: 10.182119276564643
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Bayesian method is capable of capturing real world
uncertainties/incompleteness and properly addressing the over-fitting issue
faced by deep neural networks. In recent years, Bayesian Neural Networks (BNNs)
have drawn tremendous attentions of AI researchers and proved to be successful
in many applications. However, the required high computation complexity makes
BNNs difficult to be deployed in computing systems with limited power budget.
In this paper, an efficient BNN inference flow is proposed to reduce the
computation cost then is evaluated by means of both software and hardware
implementations. A feature decomposition and memorization (\texttt{DM})
strategy is utilized to reform the BNN inference flow in a reduced manner.
About half of the computations could be eliminated compared to the traditional
approach that has been proved by theoretical analysis and software validations.
Subsequently, in order to resolve the hardware resource limitations, a
memory-friendly computing framework is further deployed to reduce the memory
overhead introduced by \texttt{DM} strategy. Finally, we implement our approach
in Verilog and synthesise it with 45 $nm$ FreePDK technology. Hardware
simulation results on multi-layer BNNs demonstrate that, when compared with the
traditional BNN inference method, it provides an energy consumption reduction
of 73\% and a 4$\times$ speedup at the expense of 14\% area overhead.
Related papers
- GhostRNN: Reducing State Redundancy in RNN with Cheap Operations [66.14054138609355]
We propose an efficient RNN architecture, GhostRNN, which reduces hidden state redundancy with cheap operations.
Experiments on KWS and SE tasks demonstrate that the proposed GhostRNN significantly reduces the memory usage (40%) and computation cost while keeping performance similar.
arXiv Detail & Related papers (2024-11-20T11:37:14Z) - ZOBNN: Zero-Overhead Dependable Design of Binary Neural Networks with Deliberately Quantized Parameters [0.0]
In this paper, we introduce a third advantage of very low-precision neural networks: improved fault-tolerance.
We investigate the impact of memory faults on state-of-the-art binary neural networks (BNNs) through comprehensive analysis.
We propose a technique to improve BNN dependability by restricting the range of float parameters through a novel deliberately uniform quantization.
arXiv Detail & Related papers (2024-07-06T05:31:11Z) - An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks [13.271286153792058]
Quantized neural networks (QNNs) have been developed, with binarized neural networks (BNNs) restricted to binary values as a special case.
This paper presents an automata-theoretic approach to synthesizing BNNs that meet designated properties.
arXiv Detail & Related papers (2023-07-29T06:27:28Z) - QVIP: An ILP-based Formal Verification Approach for Quantized Neural
Networks [14.766917269393865]
Quantization has emerged as a promising technique to reduce the size of neural networks with comparable accuracy as their floating-point numbered counterparts.
We propose a novel and efficient formal verification approach for QNNs.
In particular, we are the first to propose an encoding that reduces the verification problem of QNNs into the solving of integer linear constraints.
arXiv Detail & Related papers (2022-12-10T03:00:29Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Low-bit Shift Network for End-to-End Spoken Language Understanding [7.851607739211987]
We propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values.
This reduces computational complexity by removing expensive multiplication operations and with the use of low-bit weights.
arXiv Detail & Related papers (2022-07-15T14:34:22Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs)
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - Improved Branch and Bound for Neural Network Verification via Lagrangian
Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks.
We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB)
BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z) - Encoding the latent posterior of Bayesian Neural Networks for
uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.