Resilient Binary Neural Network
- URL: http://arxiv.org/abs/2302.00956v2
- Date: Sun, 5 Feb 2023 04:52:14 GMT
- Title: Resilient Binary Neural Network
- Authors: Sheng Xu, Yanjing Li, Teli Ma, Mingbao Lin, Hao Dong, Baochang Zhang,
Peng Gao, Jinhu Lv
- Abstract summary: We introduce a Resilient Binary Neural Network (ReBNN) to mitigate the frequent oscillation for better BNNs' training.
Our ReBNN achieves 66.9% Top-1 accuracy with ResNet-18 backbone on the ImageNet dataset.
- Score: 26.63280603795981
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary neural networks (BNNs) have received ever-increasing popularity for
their great capability of reducing storage burden as well as quickening
inference time. However, there is a severe performance drop compared with
real-valued networks, due to their intrinsic frequent weight oscillation during
training. In this paper, we introduce a Resilient Binary Neural Network (ReBNN)
to mitigate the frequent oscillation for better BNNs' training. We identify
that the weight oscillation mainly stems from the non-parametric scaling
factor. To address this issue, we propose to parameterize the scaling factor
and introduce a weighted reconstruction loss to build an adaptive training
objective. For the first time, we show that the weight oscillation is
controlled by the balanced parameter attached to the reconstruction loss, which
provides a theoretical foundation to parameterize it in back propagation. Based
on this, we learn our ReBNN by calculating the balanced parameter based on its
maximum magnitude, which can effectively mitigate the weight oscillation with a
resilient training process. Extensive experiments are conducted on various
network models, such as ResNet and Faster-RCNN for computer vision, as well as
BERT for natural language processing. The results demonstrate the superior
performance of our ReBNN over prior art. For example, our ReBNN achieves 66.9%
Top-1 accuracy with a ResNet-18 backbone on the ImageNet dataset, surpassing the
existing state of the art by a significant margin. Our code is open-sourced at
https://github.com/SteveTsui/ReBNN.
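As a rough illustration of the ideas above (not the authors' implementation, which lives in the linked repository), the following sketch shows a binary layer with a parameterized scaling factor and a weighted reconstruction loss whose balanced parameter is recomputed from a maximum magnitude. The class name, shapes, and the exact rule for the balanced parameter are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryLinear(nn.Module):
    """Minimal sketch of a binary layer with a learnable (parameterized)
    scaling factor and a weighted reconstruction loss, loosely following
    the ReBNN abstract; this is not the authors' implementation."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Parameterized scaling factor (one per output channel), learned by
        # backpropagation instead of being computed analytically.
        self.alpha = nn.Parameter(torch.full((out_features, 1), 0.01))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sign binarization with a straight-through estimator for the gradient.
        w_bin = torch.sign(self.weight).detach() + self.weight - self.weight.detach()
        return F.linear(x, self.alpha * w_bin)

    def reconstruction_loss(self) -> torch.Tensor:
        # Weighted reconstruction loss; gamma plays the role of the balanced
        # parameter. ASSUMPTION: gamma is taken as the per-channel maximum
        # magnitude of the latent weights, one possible reading of "its
        # maximum magnitude" in the abstract; the paper defines the exact rule.
        gamma = self.weight.detach().abs().max(dim=1, keepdim=True).values
        diff = self.weight - self.alpha * torch.sign(self.weight)
        return (gamma * diff.pow(2)).mean()
```

During training, the per-layer reconstruction losses would be added to the task loss, so the balanced parameter directly scales the penalty on the gap between the latent weights and their binarized form, which is the mechanism the abstract credits with damping weight oscillation.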
Related papers
- TT-SNN: Tensor Train Decomposition for Efficient Spiking Neural Network
Training [27.565726483503838]
We introduce Tensor Train Decomposition for Spiking Neural Networks (TT-SNN).
TT-SNN reduces model size through trainable weight decomposition, resulting in reduced storage, FLOPs, and latency.
We also propose a parallel computation as an alternative to the typical sequential tensor computation.
arXiv Detail & Related papers (2024-01-15T23:08:19Z)
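To make the trainable weight decomposition mentioned in the TT-SNN summary above a little more concrete, here is a generic tensor-train factorization of a single weight matrix; the two-core layout, shapes, and rank are illustrative assumptions and are not taken from the paper.

```python
import torch

# Generic tensor-train (TT) factorization of a (m1*m2) x (n1*n2) weight
# matrix into two small trainable cores: an illustration of trainable
# weight decomposition, not the TT-SNN implementation.
m1, m2, n1, n2, rank = 16, 16, 32, 32, 4

core1 = torch.randn(m1, n1, rank, requires_grad=True)
core2 = torch.randn(rank, m2, n2, requires_grad=True)

# W[(i1,i2),(j1,j2)] = sum_r core1[i1,j1,r] * core2[r,i2,j2]
W = torch.einsum('abr,rcd->acbd', core1, core2).reshape(m1 * m2, n1 * n2)

full_params = (m1 * m2) * (n1 * n2)        # 262,144 values if stored densely
tt_params = core1.numel() + core2.numel()  # 2,048 + 2,048 = 4,096 values
print(W.shape, full_params, tt_params)
```

In a TT layer the dense matrix W would normally never be materialized; the input is contracted against the small cores directly, which is where the storage, FLOP, and latency savings mentioned in the summary come from.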
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
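The bilinear relationship referred to in the RBONN summary above can be seen in the standard binarization reconstruction objective, where the error ||w - alpha * b||^2 is bilinear in the scale factors alpha and the binary codes b. The snippet below shows the classical alternating closed-form solution for one layer (XNOR-Net style); it only illustrates the coupling and is not the recurrent bilinear optimization proposed in the paper.

```python
import torch

# Bilinear coupling between real-valued weights w, scale factors alpha,
# and binary codes b: minimize ||w - alpha * b||^2 with b in {-1, +1}.
w = torch.randn(64, 128)   # latent real-valued weights of one layer

b = torch.sign(w)          # with alpha fixed, the optimal codes are sign(w)
# With b fixed, the optimal per-channel scale is a least-squares solution...
alpha = (w * b).sum(dim=1, keepdim=True) / b.pow(2).sum(dim=1, keepdim=True)
# ...which, for b = sign(w), reduces to the per-channel mean of |w|.
assert torch.allclose(alpha, w.abs().mean(dim=1, keepdim=True))

recon_error = (w - alpha * b).pow(2).mean()
print(float(recon_error))
```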
- Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs (here, Bayesian neural networks) to large models.
Compared to vanilla BNNs, our approach greatly reduces the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
- How Do Adam and Training Strategies Help BNNs Optimization? [50.22482900678071]
We show that Adam is better equipped to handle the rugged loss surface of BNNs and reaches a better optimum with higher generalization ability.
We derive a simple training scheme, building on existing Adam-based optimization, which achieves 70.5% top-1 accuracy on the ImageNet dataset.
arXiv Detail & Related papers (2021-06-21T17:59:51Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that the binarized convolution process exhibits increasing linearity towards the target of minimizing the quantization error, which in turn hampers BNNs' discriminative ability.
We re-investigate and tune appropriate non-linear modules to resolve that contradiction, leading to a strong baseline which achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely low computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
- RPR: Random Partition Relaxation for Training; Binary and Ternary Weight Neural Networks [23.45606380793965]
We present Random Partition Relaxation (RPR), a method for strong quantization of neural network weights to binary (+1/-1) and ternary (+1/0/-1) values.
We demonstrate binary- and ternary-weight networks with accuracies beyond the state-of-the-art for GoogLeNet and competitive performance for ResNet-18 and ResNet-50.
arXiv Detail & Related papers (2020-01-04T15:56:10Z)
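Based only on the method name and the summary above, the sketch below shows what a single random-partition relaxation step could look like: a randomly chosen fraction of the weights is constrained to ternary values while the remainder stays real-valued and continues to train. The ternary threshold, the constrained fraction, and the overall schedule are assumptions for the example, not the paper's settings.

```python
import torch

def ternarize(w: torch.Tensor, threshold: float = 0.05) -> torch.Tensor:
    # Map weights to {-1, 0, +1}; small-magnitude weights go to 0.
    # For binary (+1/-1) weights, torch.sign(w) would be used instead.
    return torch.sign(w) * (w.abs() > threshold).float()

def partition_and_constrain(w: torch.Tensor, constrained_frac: float) -> torch.Tensor:
    # Randomly partition the weights: constrain one part to quantized values,
    # leave the other part relaxed (real-valued).
    mask = torch.rand_like(w) < constrained_frac
    return torch.where(mask, ternarize(w), w)

w = torch.randn(256, 256) * 0.1
w_step = partition_and_constrain(w, constrained_frac=0.5)
print((w_step == 0).float().mean())  # fraction of weights snapped to zero
```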
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.