Binary Neural Networks as a general-purpose compute paradigm for
on-device computer vision
- URL: http://arxiv.org/abs/2202.03716v1
- Date: Tue, 8 Feb 2022 08:38:22 GMT
- Title: Binary Neural Networks as a general-purpose compute paradigm for
on-device computer vision
- Authors: Guhong Nie (1), Lirui Xiao (1), Menglong Zhu (1), Dongliang Chu (1),
Yue Shen (1), Peng Li (1), Kang Yang (1), Li Du (2) and Bo Chen (1) ((1) DJI
Innovations Inc, (2) School of Electronic Science and Engineering, Nanjing
University)
- Abstract summary: We propose a BNN framework comprising 1) a minimalistic inference scheme for hardware-friendliness, 2) an over-parameterized training scheme for high accuracy, and 3) a simple procedure to adapt to different vision tasks.
The resultant framework overtakes 8-bit quantization in the speed-vs-accuracy tradeoff for classification, detection, segmentation, super-resolution and matching.
Our BNNs promise 2.8-7$\times$ fewer execution cycles than 8-bit and 2.1-2.7$\times$ fewer cycles than alternative BNN designs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: For binary neural networks (BNNs) to become the mainstream on-device computer
vision algorithm, they must achieve a superior speed-vs-accuracy tradeoff than
8-bit quantization and establish a similar degree of general applicability in
vision tasks. To this end, we propose a BNN framework comprising 1) a
minimalistic inference scheme for hardware-friendliness, 2) an
over-parameterized training scheme for high accuracy, and 3) a simple procedure
to adapt to different vision tasks. The resultant framework overtakes 8-bit
quantization in the speed-vs-accuracy tradeoff for classification, detection,
segmentation, super-resolution and matching: our BNNs not only retain the
accuracy levels of their 8-bit baselines but also showcase 1.3-2.4$\times$
faster FPS on mobile CPUs. Similar conclusions can be drawn for prototypical
systolic-array-based AI accelerators, where our BNNs promise 2.8-7$\times$
fewer execution cycles than 8-bit and 2.1-2.7$\times$ fewer cycles than
alternative BNN designs. These results suggest that the time for large-scale
BNN adoption could be upon us.
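The speed advantage over 8-bit stems from the fact that a dot product between two ±1 vectors collapses to an XNOR followed by a popcount. Below is a minimal Python sketch of that standard XNOR-popcount identity over bit-packed ±1 vectors; it illustrates the generic BNN compute pattern, not the paper's specific minimalistic inference scheme.

```python
# Minimal sketch of the standard XNOR-popcount binary dot product.
# Assumes +/-1 vectors encoded as bitmasks (1 -> bit set, -1 -> bit clear).

def pack_bits(vec):
    """Pack a list of +/-1 values into a single Python int bitmask."""
    bits = 0
    for i, v in enumerate(vec):
        if v > 0:
            bits |= 1 << i
    return bits

def binary_dot(x_bits, w_bits, n):
    """Dot product of two +/-1 vectors of length n from their bitmasks.

    XNOR marks positions where the signs agree; each agreement contributes +1
    and each disagreement -1, hence the 2*popcount - n identity.
    """
    xnor = ~(x_bits ^ w_bits) & ((1 << n) - 1)   # 1 where signs agree
    return 2 * bin(xnor).count("1") - n

# Example: matches the floating-point dot product of the +/-1 vectors.
x = [1, -1, 1, 1, -1, 1, -1, -1]
w = [1, 1, -1, 1, -1, -1, -1, 1]
assert binary_dot(pack_bits(x), pack_bits(w), len(x)) == sum(a * b for a, b in zip(x, w))
```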
Related papers
- Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by the previously observed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE), which not only optimizes the selection process end-to-end but also maintains the non-repetitive occupancy of the selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
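For context, PSTE extends the vanilla straight-through estimator (STE) that BNN training commonly relies on. The sketch below shows only that vanilla STE in PyTorch (sign in the forward pass, clipped-identity gradient in the backward pass); the permutation-based codeword selection of PSTE itself is not reproduced.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Vanilla straight-through estimator: sign() forward, clipped-identity backward."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass the gradient through unchanged inside [-1, 1], zero it outside.
        return grad_out * (w.abs() <= 1).to(grad_out.dtype)

# Usage: binarize latent real-valued weights in the forward pass while still
# updating them with (approximate) gradients during training.
w = torch.randn(4, 4, requires_grad=True)
w_bin = BinarizeSTE.apply(w)
loss = (w_bin ** 2).sum()
loss.backward()
```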
arXiv Detail & Related papers (2023-03-25T13:53:02Z) - An Optical XNOR-Bitcount Based Accelerator for Efficient Inference of
Binary Neural Networks [0.0]
We invent a single-MRR-based optical XNOR gate (OXG).
We present a novel design of a bitcount circuit, which we refer to as the Photo-Charge Accumulator (PCA).
Our evaluation of the inference of four modern BNNs indicates that OXBNN provides improvements of up to 62x in frames-per-second (FPS) and 7.6x in FPS/W (energy efficiency).
arXiv Detail & Related papers (2023-02-03T20:56:01Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which consistently outperform state-of-the-art BNNs across various models and datasets.
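The bilinear coupling referred to here arises because a real-valued weight tensor is approximated as a scale factor times a sign tensor. A minimal NumPy sketch of that XNOR-Net-style decomposition, with the closed-form per-channel scale alpha = mean(|W|), is shown below; it is a baseline illustration, not RBONN's recurrent bilinear optimization.

```python
import numpy as np

def scaled_binarize(w):
    """XNOR-Net-style binarization: W ~ alpha * sign(W).

    The per-output-channel scale alpha is the mean absolute value, which
    minimizes the L2 reconstruction error for a fixed sign pattern.
    """
    flat = w.reshape(w.shape[0], -1)             # (out_channels, rest)
    alpha = np.abs(flat).mean(axis=1)            # one scale per output channel
    signs = np.sign(flat)
    w_hat = (alpha[:, None] * signs).reshape(w.shape)
    return w_hat, alpha, signs.reshape(w.shape)

w = np.random.randn(8, 3, 3, 3)
w_hat, alpha, _ = scaled_binarize(w)
print("reconstruction error:", np.linalg.norm(w - w_hat))
```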
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary
Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on an FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural
Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks over the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
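A minimal sketch of distilling a binary student from a real-valued teacher over the final prediction distribution is given below, assuming a standard temperature-scaled KL-divergence objective; the paper's self-supervised contrastive setup is omitted.

```python
import torch
import torch.nn.functional as F

def distribution_distill_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between teacher and student prediction distributions."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)

# Usage: the student is the 1-bit network, the teacher is a frozen real-valued network.
student_logits = torch.randn(32, 1000, requires_grad=True)
teacher_logits = torch.randn(32, 1000)
loss = distribution_distill_loss(student_logits, teacher_logits)
loss.backward()
```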
arXiv Detail & Related papers (2021-02-17T18:59:28Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with
Fractional Activations [20.218382369944152]
Binary neural networks (BNNs) have 1-bit weights and activations.
BNNs tend to produce much lower accuracy than their full-precision counterparts on realistic datasets such as ImageNet.
This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs.
arXiv Detail & Related papers (2020-12-22T17:49:30Z) - FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that the binarized convolution process exhibits increasing linearity toward the target of minimizing the quantization error, which in turn hampers the BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to resolve this contradiction, leading to a strong baseline that achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z) - FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2.
We carefully design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
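For context on the factor-of-2 claim, the sketch below implements a baseline ternary dot product using the common two-bitplane encoding of {-1, 0, +1}, which takes four AND-plus-popcount passes; FATNN's implementation-dependent optimization that halves this is not reproduced.

```python
def pack_ternary(vec):
    """Encode a {-1, 0, +1} vector as two bitplanes (positive mask, negative mask)."""
    pos = neg = 0
    for i, v in enumerate(vec):
        if v > 0:
            pos |= 1 << i
        elif v < 0:
            neg |= 1 << i
    return pos, neg

def ternary_dot(x_planes, w_planes):
    """Baseline ternary dot product: four AND + popcount passes over the bitplanes."""
    xp, xn = x_planes
    wp, wn = w_planes
    popcount = lambda b: bin(b).count("1")
    return (popcount(xp & wp) + popcount(xn & wn)
            - popcount(xp & wn) - popcount(xn & wp))

x = [1, 0, -1, 1, -1, 0, 1, -1]
w = [0, 1, -1, -1, -1, 1, 1, 0]
assert ternary_dot(pack_ternary(x), pack_ternary(w)) == sum(a * b for a, b in zip(x, w))
```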
arXiv Detail & Related papers (2020-08-12T04:26:18Z)