FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond
- URL: http://arxiv.org/abs/2010.09294v4
- Date: Wed, 30 Dec 2020 09:48:00 GMT
- Title: FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond
- Authors: Zhuo Su, Linpu Fang, Deke Guo, Dewen Hu, Matti Pietikäinen, Li Liu
- Abstract summary: We show that the binarized convolution process exhibits increasing linearity as it is driven toward the target of minimizing the quantization error, which in turn hampers the BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to resolve that contradiction, leading to a strong baseline that achieves state-of-the-art performance.
- Score: 23.5996182207431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary neural networks (BNNs), in which both weights and
activations are binarized to 1 bit, have been widely studied in recent years
due to their great benefits of highly accelerated computation and a
substantially reduced memory footprint, which appeal to the development of
resource-constrained devices. In contrast to previous methods, which tend to
reduce the quantization error when training BNN structures, we argue that the
binarized convolution process exhibits increasing linearity as it approaches
the target of minimizing such error, which in turn hampers the BNN's
discriminative ability. In this paper, we re-investigate and tune proper
non-linear modules to resolve that contradiction, leading to a strong baseline
that achieves state-of-the-art performance on the large-scale ImageNet dataset
in terms of accuracy and training efficiency. Going further, we find that the
proposed BNN model still has much potential to be compressed by making better
use of efficient binary operations, without losing accuracy. In addition, the
limited capacity of the BNN model can be increased with the help of group
execution. Based on these insights, we are able to improve the baseline with
an additional 4-5% top-1 accuracy gain, even at a lower computational cost.
Our code will be made public at https://github.com/zhuogege1943/ftbnn.
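To make the abstract's argument concrete, below is a minimal PyTorch-style sketch of a 1-bit convolution block in the spirit described above: weights and activations are binarized with the sign function (using a straight-through estimator for gradients), and an explicit non-linear module (PReLU here) is placed after the binary convolution so the block does not collapse into a nearly linear mapping. The class and parameter names are illustrative assumptions, not taken from the released FTBNN code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator (STE) gradient."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (standard STE clipping).
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)


class BinaryConvBlock(nn.Module):
    """Illustrative 1-bit conv block: binarize activations and weights, then
    apply an explicit non-linearity (PReLU) after the binary convolution."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.01)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.PReLU(out_ch)  # tuned non-linear module
        self.stride = stride

    def forward(self, x):
        xb = BinarizeSTE.apply(x)            # 1-bit activations
        wb = BinarizeSTE.apply(self.weight)  # 1-bit weights
        out = F.conv2d(xb, wb, stride=self.stride, padding=1)
        return self.act(self.bn(out))
```

At inference time the floating-point convolution over ±1 tensors can be replaced by XNOR and popcount operations, and the convolution can be split into channel groups, which is the kind of binary-operation budget and capacity trade-off the abstract refers to.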
Related papers
- Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients [51.82488018573326]
We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
arXiv Detail & Related papers (2023-10-23T17:32:38Z)
- Binary domain generalization for sparsifying binary neural networks [3.2462411268263964]
Binary neural networks (BNNs) are an attractive solution for developing and deploying deep neural network (DNN)-based applications on resource-constrained devices.
Weight pruning of BNNs leads to performance degradation, which suggests that the standard binarization domain of BNNs is not well adapted for the task.
This work proposes a novel, more general binary domain that extends the standard binary one and is more robust to pruning techniques.
arXiv Detail & Related papers (2023-06-23T14:32:16Z)
- Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
arXiv Detail & Related papers (2023-03-25T13:53:02Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
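For context, the bilinear coupling referred to above arises because binary convolutions approximate real-valued weights as a channel-wise scale times a sign tensor, so the scale factors and the latent weights multiply each other inside the loss. The sketch below only illustrates that scaled parameterization, using the classical closed-form scale as a reference point; it is not RBONN's recurrent optimizer.

```python
import torch


def scaled_binarize(w):
    """Approximate real-valued weights as alpha * sign(w), per output channel.
    The loss depends on the product of alpha and sign(w), which is the
    bilinear coupling between scale factors and latent weights."""
    # w: (out_ch, in_ch, kH, kW)
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)  # closed-form scale (XNOR-Net style)
    return alpha * torch.sign(w)
```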
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Elastic-Link for Binarized Neural Network [9.83865304744923]
"Elastic-Link" (EL) module enrich information flow within a BNN by adaptively adding real-valued input features to the subsequent convolutional output features.
EL produces a significant improvement on the challenging large-scale ImageNet dataset.
With the integration of ReActNet, it yields a new state-of-the-art result of 71.9% top-1 accuracy.
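Read literally, the summary above suggests a gated real-valued shortcut into the binary branch. The sketch below is one illustrative reading (assuming matching spatial resolutions; the module and parameter names are hypothetical), not the paper's actual EL design.

```python
import torch
import torch.nn as nn


class ElasticLinkSketch(nn.Module):
    """Illustrative sketch: adaptively add real-valued input features to the
    output of a binary convolution (channel mismatch handled by a 1x1 conv)."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.gate = nn.Parameter(torch.zeros(1, out_ch, 1, 1))  # learnable mixing weight

    def forward(self, real_input, binary_conv_out):
        # Blend the real-valued shortcut into the binary branch's output.
        return binary_conv_out + torch.sigmoid(self.gate) * self.proj(real_input)
```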
arXiv Detail & Related papers (2021-12-19T13:49:29Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills knowledge from real-valued networks into binary networks through the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
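At a high level, the guided calibration described above amounts to matching the binary network's prediction distribution to that of a real-valued reference network. The function below is a generic KL-based distillation sketch of that idea (names and the temperature parameter are assumptions), not S2-BNN's exact formulation.

```python
import torch.nn.functional as F


def distribution_calibration_loss(student_logits, teacher_logits, temperature=1.0):
    """Generic KL distillation on final prediction distributions: the binary
    (student) network is guided toward the real-valued (teacher) network."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # batchmean is the correct reduction for KL divergence in PyTorch.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)
```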
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations [20.218382369944152]
Binary neural networks (BNNs) have 1-bit weights and activations.
BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet.
This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs.
arXiv Detail & Related papers (2020-12-22T17:49:30Z)
- Distillation Guided Residual Learning for Binary Convolutional Neural Networks [83.6169936912264]
It is challenging to bridge the performance gap between a Binary CNN (BCNN) and a Floating-point CNN (FCNN).
We observe that this performance gap leads to substantial residuals between the intermediate feature maps of the BCNN and the FCNN.
To minimize the performance gap, we enforce BCNN to produce similar intermediate feature maps with the ones of FCNN.
This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from the FCNN, leads to more effective optimization of the BCNN.
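A generic form of such a block-wise distillation objective is a sum of per-block losses between corresponding BCNN and FCNN feature maps. The sketch below assumes matching feature-map shapes and a plain MSE per block; it is not the paper's exact loss.

```python
import torch.nn.functional as F


def blockwise_distillation_loss(bcnn_feats, fcnn_feats):
    """Sum of per-block MSE losses pushing each binary block's feature map
    toward the corresponding floating-point block's feature map."""
    assert len(bcnn_feats) == len(fcnn_feats)
    loss = 0.0
    for fb, ff in zip(bcnn_feats, fcnn_feats):
        # The FCNN acts as a fixed teacher, so its features are detached.
        loss = loss + F.mse_loss(fb, ff.detach())
    return loss
```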
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.