Sparsifying Binary Networks
- URL: http://arxiv.org/abs/2207.04974v1
- Date: Mon, 11 Jul 2022 15:54:41 GMT
- Title: Sparsifying Binary Networks
- Authors: Riccardo Schiavone and Maria A. Zuluaga
- Abstract summary: Binary neural networks (BNNs) have demonstrated their ability to solve complex tasks with accuracy comparable to full-precision deep neural networks (DNNs).
Despite recent improvements, they suffer from a fixed and limited compression factor that may prove insufficient for certain devices with very limited resources.
We propose sparse binary neural networks (SBNNs), a novel model and training scheme which introduces sparsity in BNNs and a new quantization function for binarizing the network's weights.
- Score: 3.8350038566047426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary neural networks (BNNs) have demonstrated their ability to solve
complex tasks with accuracy comparable to full-precision deep neural networks
(DNNs), while also reducing computational power and storage requirements and
increasing the processing speed. These properties make them an attractive
alternative for the development and deployment of DNN-based applications in
Internet-of-Things (IoT) devices. Despite the recent improvements, they suffer
from a fixed and limited compression factor that may prove insufficient for
certain devices with very limited resources. In this work, we propose sparse
binary neural networks (SBNNs), a novel model and training scheme which
introduces sparsity in BNNs and a new quantization function for binarizing the
network's weights. The proposed SBNN is able to achieve high compression
factors and it reduces the number of operations and parameters at inference
time. We also provide tools to assist the SBNN design, while respecting
hardware resource constraints. We study the generalization properties of our
method for different compression factors through a set of experiments on linear
and convolutional networks on three datasets. Our experiments confirm that
SBNNs can achieve high compression rates, without compromising generalization,
while further reducing the operations of BNNs, making SBNNs a viable option for
deploying DNNs in low-cost, resource-limited IoT devices and sensors.
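
To make the "fixed and limited compression factor" concrete, the following is a minimal sketch of the storage arithmetic, not the authors' method. It assumes 32-bit full-precision weights, one bit per weight for a dense BNN, and a hypothetical sparse encoding that stores only the positions of the nonzero weights. All function names are illustrative.

```python
import math


def dense_bnn_bits(n_weights):
    """Bits needed by a dense binary layer: one bit per weight."""
    return n_weights


def sparse_index_bits(n_weights, n_nonzero):
    """Bits needed by a hypothetical sparse encoding that stores only the
    position of each nonzero weight (an assumption, not the paper's format)."""
    bits_per_index = max(1, math.ceil(math.log2(n_weights)))
    return n_nonzero * bits_per_index


def compression_factor(n_weights, stored_bits, full_precision_bits=32):
    """Ratio of full-precision storage to the storage actually used."""
    return (n_weights * full_precision_bits) / stored_bits


n = 1024  # weights in one layer (illustrative size)

# A dense BNN always lands on the same factor, whatever the layer size.
print(compression_factor(n, dense_bnn_bits(n)))

# Keeping only 16 nonzero weights pushes the factor well past that ceiling.
print(compression_factor(n, sparse_index_bits(n, 16)))
```

Under these assumptions the index-based encoding only pays off when the number of nonzero weights times the bits per index is smaller than the number of weights, so the benefit depends on reaching high sparsity.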
Related papers
- NAS-BNN: Neural Architecture Search for Binary Neural Networks [55.058512316210056]
We propose a novel neural architecture search scheme for binary neural networks, named NAS-BNN.
Our discovered binary model family outperforms previous BNNs for a wide range of operations (OPs) from 20M to 200M.
In addition, we validate the transferability of these searched BNNs on the object detection task, and our binary detectors with the searched BNNs achieve a novel state-of-the-art result, e.g., 31.6% mAP with 370M OPs, on the MS-COCO dataset.
arXiv Detail & Related papers (2024-08-28T02:17:58Z)
- Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization.
Our framework achieves a 39% average energy reduction across datasets for a 1.7% average accuracy loss and significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-23T18:50:13Z)
- An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks [13.271286153792058]
Quantized neural networks (QNNs) have been developed, with binarized neural networks (BNNs) restricted to binary values as a special case.
This paper presents an automata-theoretic approach to synthesizing BNNs that meet designated properties.
arXiv Detail & Related papers (2023-07-29T06:27:28Z)
- Binary domain generalization for sparsifying binary neural networks [3.2462411268263964]
Binary neural networks (BNNs) are an attractive solution for developing and deploying deep neural network (DNN)-based applications in resource constrained devices.
Weight pruning of BNNs leads to performance degradation, which suggests that the standard binarization domain of BNNs is not well adapted for the task.
This work proposes a novel, more general binary domain that extends the standard binary one and is more robust to pruning techniques.
arXiv Detail & Related papers (2023-06-23T14:32:16Z)
- Basic Binary Convolution Unit for Binarized Image Restoration Network [146.0988597062618]
In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for image restoration tasks.
Based on our findings and analyses, we design a simple yet efficient basic binary convolution unit (BBCU).
Our BBCU significantly outperforms other BNNs and lightweight models, which shows that BBCU can serve as a basic unit for binarized IR networks.
arXiv Detail & Related papers (2022-10-02T01:54:40Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Block-term Tensor Neural Networks [29.442026567710435]
We show that block-term tensor layers (BT-layers) can be easily adapted to neural network models, such as CNNs and RNNs.
BT-layers in CNNs and RNNs can achieve a very large compression ratio on the number of parameters while preserving or improving the representation power of the original DNNs.
arXiv Detail & Related papers (2020-10-10T09:58:43Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.