Recurrent Bilinear Optimization for Binary Neural Networks
- URL: http://arxiv.org/abs/2209.01542v1
- Date: Sun, 4 Sep 2022 06:45:33 GMT
- Title: Recurrent Bilinear Optimization for Binary Neural Networks
- Authors: Sheng Xu, Yanjing Li, Tiancheng Wang, Teli Ma, Baochang Zhang, Peng
Gao, Yu Qiao, Jinhu Lv and Guodong Guo
- Abstract summary: BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
- Score: 58.972212365275595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary Neural Networks (BNNs) show great promise for real-world embedded
devices. As one of the critical steps to achieve a powerful BNN, the scale
factor calculation plays an essential role in reducing the performance gap to
their real-valued counterparts. However, existing BNNs neglect the intrinsic
bilinear relationship of real-valued weights and scale factors, resulting in a
sub-optimal model caused by an insufficient training process. To address this
issue, Recurrent Bilinear Optimization is proposed to improve the learning
process of BNNs (RBONNs) by associating the intrinsic bilinear variables in the
back propagation process. Our work is the first attempt to optimize BNNs from
the bilinear perspective. Specifically, we employ a recurrent optimization and
Density-ReLU to sequentially backtrack the sparse real-valued weight filters,
which will be sufficiently trained and reach their performance limits based on
a controllable learning process. We obtain robust RBONNs, which show impressive
performance over state-of-the-art BNNs on various models and datasets.
Particularly, on the task of object detection, RBONNs have great generalization
performance. Our code is open-sourced on https://github.com/SteveTsui/RBONN .
Related papers
- Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
arXiv Detail & Related papers (2023-03-25T13:53:02Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Elastic-Link for Binarized Neural Network [9.83865304744923]
"Elastic-Link" (EL) module enrich information flow within a BNN by adaptively adding real-valued input features to the subsequent convolutional output features.
EL produces a significant improvement on the challenging large-scale ImageNet dataset.
With the integration of ReActNet, it yields a new state-of-the-art result of 71.9% top-1 accuracy.
arXiv Detail & Related papers (2021-12-19T13:49:29Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary
Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and the hardware deployment on FPGA validate the great potentials of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - How Do Adam and Training Strategies Help BNNs Optimization? [50.22482900678071]
We show that Adam is better equipped to handle the rugged loss surface of BNNs and reaches a better optimum with higher generalization ability.
We derive a simple training scheme, building on existing Adam-based optimization, which achieves 70.5% top-1 accuracy on the ImageNet dataset.
arXiv Detail & Related papers (2021-06-21T17:59:51Z) - A Bop and Beyond: A Second Order Optimizer for Binarized Neural Networks [0.0]
optimization of Binary Neural Networks (BNNs) relies on approximating the real-valued weights with their binarized representations.
In this paper, we take an approach parallel to Adam which also uses the second raw moment estimate to normalize the first raw moment before doing the comparison with the threshold.
We present two versions of the proposed: a biased one and a bias-corrected one, each with its own applications.
arXiv Detail & Related papers (2021-04-11T22:20:09Z) - S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural
Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm from real-valued to distill binary networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z) - BiSNN: Training Spiking Neural Networks with Binary Weights via Bayesian
Learning [37.376989855065545]
Spiking Neural Networks (SNNs) are biologically inspired, dynamic, event-driven models that enhance energy efficiency.
An SNN model is introduced that combines the benefits of temporally sparse binary activations and of binary weights.
Experiments validate the performance loss with respect to full-precision implementations.
arXiv Detail & Related papers (2020-12-15T14:06:36Z) - FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that binarized convolution process owns an increasing linearity towards the target of minimizing such error, which in turn hampers BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to fix that contradiction, leading to a strong baseline which achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.