Training Binary Neural Networks with Real-to-Binary Convolutions
- URL: http://arxiv.org/abs/2003.11535v1
- Date: Wed, 25 Mar 2020 17:54:38 GMT
- Title: Training Binary Neural Networks with Real-to-Binary Convolutions
- Authors: Brais Martinez, Jing Yang, Adrian Bulat and Georgios Tzimiropoulos
- Abstract summary: We show how to train binary networks to within a few percentage points of their full-precision counterparts.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
- Score: 52.91164959767517
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper shows how to train binary networks to within a few
percentage points ($\sim 3$-$5\%$) of the full-precision counterpart. We first show how to build a
strong baseline, which already achieves state-of-the-art accuracy, by combining
recently proposed advances and carefully adjusting the optimization procedure.
Secondly, we show that by attempting to minimize the discrepancy between the
output of the binary and the corresponding real-valued convolution, additional
significant accuracy gains can be obtained. We materialize this idea in two
complementary ways: (1) with a loss function, during training, by matching the
spatial attention maps computed at the output of the binary and real-valued
convolutions, and (2) in a data-driven manner, by using the real-valued
activations, available during inference prior to the binarization process, for
re-scaling the activations right after the binary convolution. Finally, we show
that, when putting all of our improvements together, the proposed model beats
the current state of the art by more than 5% top-1 accuracy on ImageNet and
reduces the gap to its real-valued counterpart to less than 3% and 5% top-1
accuracy on CIFAR-100 and ImageNet respectively when using a ResNet-18
architecture. Code available at https://github.com/brais-martinez/real2binary.
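For concreteness, the two ideas lend themselves to a short sketch. The following is a minimal PyTorch illustration under stated assumptions, not the authors' implementation (that is in the repository linked above): module names, the squeeze-and-excitation-style gate, and the `reduction` factor are ours. It shows (1) an attention-matching loss between L2-normalized spatial attention maps of binary and real-valued blocks, and (2) a binary convolution whose output is re-scaled by a gate computed from the real-valued pre-binarization activations.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def attention_map(feat):
    """Spatial attention map: sum of squared activations over channels,
    L2-normalized over the spatial positions."""
    att = feat.pow(2).sum(dim=1)                # (N, H, W)
    return F.normalize(att.flatten(1), dim=1)   # (N, H*W)


def attention_matching_loss(binary_feat, real_feat):
    """Penalize the discrepancy between the attention maps of a binary block
    and the corresponding real-valued block (the real branch is detached)."""
    return (attention_map(binary_feat)
            - attention_map(real_feat.detach())).pow(2).sum(dim=1).mean()


class BinarySign(torch.autograd.Function):
    """Sign binarization with a clipped straight-through gradient estimator."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        x, = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # pass gradients only in [-1, 1]


class RealToBinaryConv(nn.Module):
    """Binary convolution whose output is re-scaled by a gate computed from the
    real-valued input activations, squeeze-and-excitation style."""
    def __init__(self, in_ch, out_ch, reduction=8):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, in_ch // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch // reduction, out_ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        scale = self.gate(x)                  # real-valued activations, pre-binarization
        out = self.conv(BinarySign.apply(x))  # weights would also be binarized in full
        return out * scale                    # data-driven re-scaling
```
In training, the attention loss would be accumulated over several stages of the network and added to the task loss; the gate adds only a small amount of real-valued computation.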
Related papers
- Input Layer Binarization with Bit-Plane Encoding [4.872439392746007]
We present a new method that binarizes the first layer by directly using the 8-bit representation of the input data.
The resulting model is fully binarized, and our first-layer binarization approach is model-independent.
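A minimal sketch of what such a bit-plane decomposition could look like, assuming a PyTorch integer input; this is our illustrative reading of the abstract, not the paper's exact formulation:
```python
import torch

def bit_planes(x):
    """Decompose an 8-bit image tensor (N, C, H, W) into 8 binary bit-planes,
    returning an (N, 8*C, H, W) tensor with values in {0, 1} (LSB first)."""
    planes = [((x >> k) & 1).float() for k in range(8)]
    return torch.cat(planes, dim=1)

# Sanity check: recombining the planes with weights 2^k recovers the input.
x = torch.randint(0, 256, (1, 3, 4, 4), dtype=torch.uint8)
p = bit_planes(x.to(torch.int64))
recon = sum((2.0 ** k) * p[:, 3 * k: 3 * (k + 1)] for k in range(8))
assert torch.equal(recon.to(torch.uint8), x)
```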
arXiv Detail & Related papers (2023-05-04T14:49:07Z)
- Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket [10.552465253379134]
We focus on a problem: how can a binary neural network achieve the crucial accuracy level (e.g., 80%) on ILSVRC-2012 ImageNet?
We design a novel binary architecture BNext based on a comprehensive study of binary architectures and their optimization process.
We propose a novel knowledge-distillation technique to alleviate the counter-intuitive overfitting problem observed when attempting to train extremely accurate binary models.
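The abstract does not spell out the distillation scheme, so for reference here is a minimal sketch of the standard soft-label distillation objective such techniques build on; the temperature and blending values are assumptions, and BNext's actual scheme is more elaborate:
```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Generic soft-label knowledge distillation: KL divergence between
    temperature-softened teacher and student distributions, blended with the
    usual cross-entropy on the ground-truth labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```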
arXiv Detail & Related papers (2022-11-23T13:08:58Z)
- BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance [54.214426436283134]
Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it toward real-network accuracy.
We highlight that, benefiting from its compact architecture and optimized hardware kernel, BiFSMNv2 achieves an impressive 25.1x speedup and 20.2x storage saving on edge hardware.
arXiv Detail & Related papers (2022-11-13T18:31:45Z)
- AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets [27.022212653067367]
This paper studies Binary Neural Networks (BNNs), in which weights and activations are both binarized to 1-bit values.
We present a simple yet effective approach called AdaBin to adaptively obtain the optimal binary sets.
Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin is able to achieve state-of-the-art performance.
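A minimal sketch of the adaptive-binary-set idea: instead of the fixed set {-1, +1}, values are binarized to a learnable pair {beta - alpha, beta + alpha}. The parameter names and the straight-through estimator are our assumptions, not AdaBin's code:
```python
import torch
import torch.nn as nn

class AdaptiveBinarizer(nn.Module):
    """Binarize to an adaptive set {beta - alpha, beta + alpha} instead of the
    fixed {-1, +1}; the center beta and half-distance alpha are learnable."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))    # half-distance of the set
        self.beta = nn.Parameter(torch.zeros(1))    # center of the set

    def forward(self, x):
        centered = x - self.beta
        sign = torch.sign(centered)
        sign = (sign - centered).detach() + centered  # straight-through estimator
        return self.alpha * sign + self.beta          # values in {beta +/- alpha}
```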
arXiv Detail & Related papers (2022-08-17T05:43:33Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy DIR-Net on real-world resource-limited devices, achieving an 11.1x storage saving and a 5.4x speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
- Towards Lossless Binary Convolutional Neural Networks Using Piecewise Approximation [4.023728681102073]
Binary CNNs can significantly reduce the number of arithmetic operations and the size of memory storage.
However, the accuracy degradation of single and multiple binary CNNs is unacceptable for modern architectures.
We propose a Piecewise Approximation scheme for multiple binary CNNs that lessens accuracy loss by approximating full-precision weights and activations.
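One common baseline for approximating full-precision tensors with multiple binary bases is greedy residual binarization, sketched below; the paper's piecewise scheme partitions the value range more carefully than this:
```python
import torch

def residual_binarize(w, num_bases=4):
    """Approximate a full-precision tensor as w ~= sum_i alpha_i * b_i with
    b_i in {-1, +1}, fitting each scaled binary basis greedily on the residual."""
    residual = w.clone()
    alphas, bases = [], []
    for _ in range(num_bases):
        b = torch.sign(residual)
        b[b == 0] = 1.0               # sign(0) = 0; force a valid binary value
        a = residual.abs().mean()     # least-squares-optimal scale for sign(r)
        alphas.append(a)
        bases.append(b)
        residual = residual - a * b
    return alphas, bases

w = torch.randn(64, 64, 3, 3)
alphas, bases = residual_binarize(w)
approx = sum(a * b for a, b in zip(alphas, bases))
print((w - approx).abs().mean())      # approximation error shrinks with more bases
```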
arXiv Detail & Related papers (2020-08-08T13:32:33Z)
- Distillation Guided Residual Learning for Binary Convolutional Neural Networks [83.6169936912264]
It is challenging to bridge the performance gap between a binary CNN (BCNN) and a floating-point CNN (FCNN).
We observe that this performance gap leads to substantial residuals between the intermediate feature maps of the BCNN and FCNN.
To minimize the performance gap, we encourage the BCNN to produce intermediate feature maps similar to those of the FCNN.
This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from the FCNN, leads to more effective optimization of the BCNN.
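A minimal reading of such a block-wise distillation loss, assuming the two networks expose matching lists of intermediate feature maps; the paper derives its block-wise loss more carefully:
```python
import torch.nn.functional as F

def blockwise_distillation_loss(bcnn_feats, fcnn_feats):
    """Sum of per-block MSE losses between the intermediate feature maps of
    the binary network and the (detached) full-precision network."""
    assert len(bcnn_feats) == len(fcnn_feats)
    return sum(F.mse_loss(b, f.detach()) for b, f in zip(bcnn_feats, fcnn_feats))
```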
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose using evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we leverage the idea of group convolution to design efficient 1-bit Convolutional Neural Networks (CNNs).
Our objective is to arrive at a tiny yet efficient binary neural architecture by exploring the best group-convolution candidates.
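The building block being searched over is a grouped 1-bit convolution; a minimal sketch follows, with the evolutionary search itself omitted (the per-layer group count is exactly the kind of choice such a search would make). The binarization details here are generic assumptions, not the paper's exact recipe:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryGroupConv(nn.Module):
    """Grouped convolution with binarized weights. The group count is the kind
    of per-layer hyper-parameter an evolutionary search would select."""
    def __init__(self, in_ch, out_ch, groups, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=kernel_size // 2, groups=groups, bias=False)

    def forward(self, x):
        w = self.conv.weight
        wb = w.abs().mean() * torch.sign(w)   # scaled binary weights
        wb = (wb - w).detach() + w            # straight-through estimator
        return F.conv2d(x, wb, padding=self.conv.padding,
                        groups=self.conv.groups)
```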
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions [76.05981545084738]
We propose several ideas for enhancing a binary network to close its accuracy gap from real-valued networks without incurring any additional computational cost.
We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts.
We show that the proposed ReActNet outperforms the state of the art by a large margin.
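A sketch of the generalized activations the paper introduces, RSign (sign with a learnable per-channel threshold) and RPReLU (PReLU with learnable per-channel shifts); the initialization values below are assumptions:
```python
import torch
import torch.nn as nn

class RSign(nn.Module):
    """Sign with a learnable per-channel threshold."""
    def __init__(self, channels):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        shifted = x - self.alpha
        out = torch.sign(shifted)
        return (out - shifted).detach() + shifted  # straight-through estimator


class RPReLU(nn.Module):
    """PReLU with learnable per-channel input and output shifts:
    f(x) = (x - gamma) + zeta if x > gamma, else beta * (x - gamma) + zeta."""
    def __init__(self, channels):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.zeta = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.full((1, channels, 1, 1), 0.25))

    def forward(self, x):
        shifted = x - self.gamma
        return torch.where(shifted > 0, shifted, self.beta * shifted) + self.zeta
```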
arXiv Detail & Related papers (2020-03-07T02:12:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.