Distribution-sensitive Information Retention for Accurate Binary Neural
Network
- URL: http://arxiv.org/abs/2109.12338v1
- Date: Sat, 25 Sep 2021 10:59:39 GMT
- Title: Distribution-sensitive Information Retention for Accurate Binary Neural
Network
- Authors: Haotong Qin, Xiangguo Zhang, Ruihao Gong, Yifu Ding, Yi Xu,
Xianglong Liu
- Abstract summary: We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy our DIR-Net on real-world resource-limited devices, achieving 11.1 times storage saving and a 5.4 times speedup.
- Score: 49.971345958676196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model binarization is an effective method of compressing neural networks and
accelerating their inference process, which enables state-of-the-art models to
run on resource-limited devices. However, a significant performance gap still
exists between the 1-bit model and the 32-bit one. The empirical study shows
that binarization causes a great loss of information in the forward and
backward propagation which harms the performance of binary neural networks
(BNNs), and the limited information representation ability of binarized
parameters is one of the bottlenecks of BNN performance. We present a novel
Distribution-sensitive Information Retention Network (DIR-Net) to retain the
information of the forward activations and backward gradients, which improves
BNNs by distribution-sensitive optimization without increasing the overhead in
the inference process. The DIR-Net mainly relies on two technical
contributions: (1) Information Maximized Binarization (IMB): minimizing the
information loss and the quantization error of weights/activations
simultaneously by balancing and standardizing the weight distribution in the
forward propagation; (2) Distribution-sensitive Two-stage Estimator (DTE):
minimizing the information loss of gradients by gradual distribution-sensitive
approximation of the sign function in the backward propagation, jointly
considering the updating capability and gradient accuracy. The DIR-Net
investigates both the forward and backward processes of BNNs from a unified
information perspective, thereby providing new insight into the mechanism of
network binarization. Comprehensive experiments on CIFAR-10 and ImageNet
datasets show our DIR-Net consistently outperforms the SOTA binarization
approaches under mainstream and compact architectures. Additionally, we deploy
our DIR-Net on real-world resource-limited devices, where it achieves 11.1 times
storage saving and a 5.4 times speedup.
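The two contributions map onto the forward and backward passes of a binarized layer. For the forward pass, note that for a binary code B = sign(w) with P(B = +1) = p, the entropy H(B) = -p log p - (1-p) log(1-p) is maximized at p = 0.5, which a balanced (zero-mean) weight distribution attains under sign(); this is the intuition behind IMB. The sketch below is a minimal PyTorch rendering of that intuition, not the authors' released implementation: the class and layer names, the tanh-based backward estimator, and the sharpness parameter k are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DistributionSensitiveSign(torch.autograd.Function):
    """IMB-style forward / DTE-style backward binarization (sketch)."""

    @staticmethod
    def forward(ctx, w, k):
        # IMB idea: balance (zero-mean) and standardize (unit-variance)
        # the weights before sign(), pushing P(sign = +1) toward 0.5 and
        # thus maximizing the entropy of the binary codes.
        w_std = (w - w.mean()) / (w.std() + 1e-8)
        ctx.save_for_backward(w_std)
        ctx.k = k
        return torch.sign(w_std)

    @staticmethod
    def backward(ctx, grad_out):
        # DTE idea (rendered here with an assumed tanh form): estimate the
        # gradient of sign() with d/dx tanh(k x) = k * (1 - tanh(k x)^2),
        # where the sharpness k grows over training so the estimator moves
        # gradually from a soft, update-friendly shape toward the true
        # near-impulse derivative of sign.
        (w_std,) = ctx.saved_tensors
        soft = torch.tanh(ctx.k * w_std)
        return grad_out * ctx.k * (1.0 - soft * soft), None


class BinaryConv2d(nn.Conv2d):
    """Conv layer whose weights pass through the binarizer above."""

    def forward(self, x, k=1.0):
        bw = DistributionSensitiveSign.apply(self.weight, k)
        return F.conv2d(x, bw, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

In training, k would be annealed from a small value toward a large one, so that early epochs keep strong update signals while later epochs approximate the near-impulse derivative of sign ever more closely, matching the gradual two-stage approximation the abstract describes.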
Related papers
- BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering deployment on edge devices.
We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers.
Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z) - IB-AdCSCNet: Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck [4.523653503622693]
We introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory.
IB-AdCSCNet seamlessly integrates the information bottleneck trade-off strategy into deep networks.
Experimental results on CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data.
arXiv Detail & Related papers (2024-05-23T05:35:57Z) - BiHRNet: A Binary high-resolution network for Human Pose Estimation [11.250422970707415]
We propose a binary human pose estimator named BiHRNet, whose weights and activations are expressed as $\pm$1.
BiHRNet retains the keypoint extraction ability of HRNet while using fewer computing resources by adapting binary neural network (BNN) techniques.
We show BiHRNet achieves a PCKh of 87.9 on the MPII dataset, which outperforms all binary pose estimation networks.
arXiv Detail & Related papers (2023-11-17T03:01:37Z) - Accelerating Scalable Graph Neural Network Inference with Node-Adaptive
Propagation [80.227864832092]
Graph neural networks (GNNs) have exhibited exceptional efficacy in a diverse array of applications.
The sheer size of large-scale graphs presents a significant challenge to real-time inference with GNNs.
We propose an online propagation framework and two novel node-adaptive propagation methods.
arXiv Detail & Related papers (2023-10-17T05:03:00Z) - BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to
Real-Network Performance [54.214426436283134]
Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it to the real-network accuracy performance.
We highlight that benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
arXiv Detail & Related papers (2022-11-13T18:31:45Z) - IR2Net: Information Restriction and Information Recovery for Accurate
Binary Neural Networks [24.42067007684169]
Weight and activation binarization can efficiently compress deep neural networks and accelerate model inference, but cause severe accuracy degradation.
We propose IR$^2$Net to stimulate the potential of BNNs and improve network accuracy by restricting the input information and recovering the feature information.
Experimental results demonstrate that our approach still achieves comparable accuracy even with a $\sim$10x reduction in floating-point operations (FLOPs) for ResNet-18.
arXiv Detail & Related papers (2022-10-06T02:03:26Z) - Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization (see the sketch after this list).
arXiv Detail & Related papers (2022-04-05T06:07:05Z) - BiFSMN: Binary Neural Network for Keyword Spotting [47.46397208920726]
BiFSMN is an accurate and extremely efficient binary neural network for KWS.
We show that BiFSMN can achieve an impressive 22.3x speedup and 15.5x storage-saving on real-world edge hardware.
arXiv Detail & Related papers (2022-02-14T05:16:53Z) - ReActNet: Towards Precise Binary Neural Network with Generalized
Activation Functions [76.05981545084738]
We propose several ideas for enhancing a binary network to close its accuracy gap from real-valued networks without incurring any additional computational cost.
We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts.
We show that the proposed ReActNet outperforms all state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-07T02:12:02Z)
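For the kurtosis-regularization idea in the Bimodal Distributed BNN entry above, the following is a minimal sketch of the kind of penalty involved. The squared-error form and the default target are illustrative assumptions, not the paper's exact formulation; the motivating fact is that a perfectly bi-modal two-point distribution at $\pm a$ has kurtosis exactly 1, since $E[x^4]/E[x^2]^2 = a^4/a^4$.

```python
import torch


def kurtosis_regularizer(w: torch.Tensor, target: float = 1.0) -> torch.Tensor:
    """Penalty pulling the sample kurtosis of the weights toward `target`.

    A two-point distribution at +/-a has kurtosis exactly 1, so driving
    the kurtosis toward 1 nudges the weights toward the bi-modal shape
    that binarization prefers. The squared-error form and the target are
    illustrative assumptions, not the paper's exact formulation.
    """
    centered = w.flatten() - w.mean()
    m2 = (centered ** 2).mean()  # second central moment
    m4 = (centered ** 4).mean()  # fourth central moment
    kurtosis = m4 / (m2 ** 2 + 1e-8)
    return (kurtosis - target) ** 2


# Typical use: add the penalty for each binarized layer to the task loss.
# loss = task_loss + lam * sum(kurtosis_regularizer(m.weight)
#                              for m in model.modules()
#                              if isinstance(m, torch.nn.Conv2d))
```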