FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with
Fractional Activations
- URL: http://arxiv.org/abs/2012.12206v1
- Date: Tue, 22 Dec 2020 17:49:30 GMT
- Title: FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with
Fractional Activations
- Authors: Yichi Zhang and Junhao Pan and Xinheng Liu and Hongzheng Chen and
Deming Chen and Zhiru Zhang
- Abstract summary: Binary neural networks (BNNs) have 1-bit weights and activations.
BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet.
This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs.
- Score: 20.218382369944152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary neural networks (BNNs) have 1-bit weights and activations. Such
networks are well suited for FPGAs, as their dominant computations are bitwise
arithmetic and the memory requirement is also significantly reduced. However,
compared to state-of-the-art compact convolutional neural network (CNN) models,
BNNs tend to produce a much lower accuracy on realistic datasets such as
ImageNet. In addition, the input layer of BNNs has gradually become a major
compute bottleneck, because it is conventionally excluded from binarization to
avoid a large accuracy loss. This work proposes FracBNN, which exploits
fractional activations to substantially improve the accuracy of BNNs.
Specifically, our approach employs a dual-precision activation scheme to
compute features with up to two bits, using an additional sparse binary
convolution. We further binarize the input layer using a novel thermometer
encoding. Overall, FracBNN preserves the key benefits of conventional BNNs,
where all convolutional layers are computed in pure binary MAC operations
(BMACs). We design an efficient FPGA-based accelerator for our novel BNN model
that supports the fractional activations. To evaluate the performance of
FracBNN under a resource-constrained scenario, we implement the entire
optimized network architecture on an embedded FPGA (Xilinx Ultra96v2). Our
experiments on ImageNet show that FracBNN achieves an accuracy comparable to
MobileNetV2, surpassing the best-known BNN design on FPGAs with an increase of
28.9% in top-1 accuracy and a 2.5x reduction in model size. FracBNN also
outperforms a recently introduced BNN model with an increase of 2.4% in top-1
accuracy while using the same model size. On the embedded FPGA device, FracBNN
demonstrates real-time image classification.
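The thermometer encoding of the input layer admits a compact illustration. Below is a minimal NumPy sketch of a uniform-threshold thermometer encoder; the number of levels and the threshold placement are assumptions for illustration and may differ from FracBNN's exact scheme. Each 8-bit pixel becomes a vector of binary features, so the first convolution can also run in pure BMACs.

```python
import numpy as np

def thermometer_encode(pixels: np.ndarray, levels: int = 8) -> np.ndarray:
    """Encode uint8 pixel intensities as binary thermometer vectors.

    A pixel p in [0, 255] turns on roughly the first p/255*levels bits of
    a `levels`-bit vector, so the input layer sees only binary features.
    """
    thresholds = np.arange(1, levels + 1) / levels * 255.0  # uniform grid
    # Broadcast-compare: output shape = pixels.shape + (levels,)
    return (pixels[..., None] >= thresholds).astype(np.uint8)

# Intensity 0 sets no bits, 128 about half of them, 255 all of them.
print(thermometer_encode(np.array([0, 128, 255], dtype=np.uint8)))
```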
Related papers
- NAS-BNN: Neural Architecture Search for Binary Neural Networks [55.058512316210056]
We propose a novel neural architecture search scheme for binary neural networks, named NAS-BNN.
Our discovered binary model family outperforms previous BNNs for a wide range of operations (OPs) from 20M to 200M.
In addition, we validate the transferability of these searched BNNs on the object detection task; our binary detectors with the searched BNNs achieve a new state-of-the-art result, e.g., 31.6% mAP with 370M OPs, on the MS COCO dataset.
arXiv Detail & Related papers (2024-08-28T02:17:58Z)
- Binary domain generalization for sparsifying binary neural networks [3.2462411268263964]
Binary neural networks (BNNs) are an attractive solution for developing and deploying deep neural network (DNN)-based applications on resource-constrained devices.
Weight pruning of BNNs leads to performance degradation, which suggests that the standard binarization domain of BNNs is not well suited to the task.
This work proposes a more general binary domain that extends the standard one and is more robust to pruning techniques.
arXiv Detail & Related papers (2023-06-23T14:32:16Z)
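For context, the "standard binary domain" that the paper above generalizes is the {-1, +1} sign binarization used by conventional BNNs, trained through a straight-through estimator (STE). A minimal PyTorch sketch of that baseline follows; it shows only the standard domain, not the paper's generalized one.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Standard {-1, +1} binarization with a straight-through estimator."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        # Map every weight into the standard binary domain {-1, +1}.
        return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Clipped STE: let gradients pass only where |w| <= 1.
        return grad_out * (w.abs() <= 1).to(grad_out.dtype)

w = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(w).sum().backward()  # gradient flows through the hard sign
print(w.grad)
```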
- An Optical XNOR-Bitcount Based Accelerator for Efficient Inference of Binary Neural Networks [0.0]
We invent a single-MRR-based optical XNOR gate (OXG).
We present a novel bitcount circuit design, which we refer to as a Photo-Charge Accumulator (PCA).
Our evaluation on the inference of four modern BNNs indicates that OXBNN provides improvements of up to 62x in frames-per-second (FPS) and 7.6x in FPS/W (energy efficiency).
arXiv Detail & Related papers (2023-02-03T20:56:01Z)
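The XNOR-bitcount arithmetic that the accelerator above realizes optically is the standard BNN inner product: with {-1, +1} vectors packed as bits (1 encoding +1, 0 encoding -1), a dot product reduces to an XNOR followed by a popcount. A plain-Python sketch:

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n {-1,+1} vectors packed into integers.

    Bit value 1 encodes +1 and bit value 0 encodes -1, so
    dot = 2 * popcount(XNOR(a, w)) - n.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask      # 1 wherever the signs agree
    matches = bin(xnor).count("1")        # popcount
    return 2 * matches - n

# a = [+1, -1, +1, -1] -> 0b0101 ; w = [+1, +1, -1, -1] -> 0b0011
print(binary_dot(0b0101, 0b0011, 4))  # -> 0, matching the real-valued dot
```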
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from this bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
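The bilinear relationship in question arises from the widely used XNOR-Net-style approximation W ≈ α · sign(W), in which the real-valued scale factors and the binary weights multiply each other in the objective. A sketch of that baseline (the closed-form per-filter scale, not RBONN's recurrent optimization):

```python
import torch

def scaled_binarize(w: torch.Tensor) -> torch.Tensor:
    """XNOR-Net-style approximation W ~ alpha * sign(W).

    alpha is the per-filter mean absolute value, the least-squares optimal
    scale for a fixed sign(W); jointly refining alpha and the binary
    weights is the bilinear problem that RBONN targets.
    """
    b = torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)  # one scale per filter
    return alpha * b

w = torch.randn(16, 3, 3, 3)  # (out_channels, in_channels, kH, kW)
print((scaled_binarize(w) - w).pow(2).mean())  # reconstruction error
```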
"Elastic-Link" (EL) module enrich information flow within a BNN by adaptively adding real-valued input features to the subsequent convolutional output features.
EL produces a significant improvement on the challenging large-scale ImageNet dataset.
With the integration of ReActNet, it yields a new state-of-the-art result of 71.9% top-1 accuracy.
arXiv Detail & Related papers (2021-12-19T13:49:29Z)
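A rough illustration of the Elastic-Link idea, as a hypothetical simplification rather than the paper's actual module: a learnable channel-wise gate mixes the real-valued input features back into the binary branch's output.

```python
import torch
import torch.nn as nn

class ElasticLinkSketch(nn.Module):
    """Illustrative gate that adds real-valued inputs to binary conv outputs.

    Hypothetical simplification of Elastic-Link: a channel-wise gate
    g in (0, 1) decides how much of the real-valued input feature map is
    mixed back into the information-poor binarized branch output.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x_real: torch.Tensor, y_binary: torch.Tensor):
        g = torch.sigmoid(self.gate)
        return y_binary + g * x_real

link = ElasticLinkSketch(64)
x = torch.randn(1, 64, 8, 8)  # real-valued input features
y = torch.randn(1, 64, 8, 8)  # output of a binary convolution branch
print(link(x, y).shape)       # torch.Size([1, 64, 8, 8])
```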
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
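The kernel-space view can be made concrete with a little arithmetic: a 3x3 binary kernel has 2^9 = 512 possible patterns, so restricting a layer to a learned subset of, say, 64 patterns stores each kernel as a 6-bit index, i.e., 6/9 ≈ 0.67 bits per weight. A toy sketch of such codebook indexing (hypothetical; it omits the paper's kernel-aware training):

```python
import torch

# Codebook of 64 binary 3x3 kernels (random here; SNNs learn which to keep).
codebook = (torch.rand(64, 3, 3) >= 0.5).float() * 2 - 1  # entries in {-1,+1}

# Every kernel of a 128x16 layer is stored as a 6-bit codebook index.
indices = torch.randint(0, 64, (128, 16))  # (out_channels, in_channels)
kernels = codebook[indices]                # (128, 16, 3, 3) binary kernels

print(f"{6 / 9:.2f} bits per weight vs. 1.0 for a plain BNN")
```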
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
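The decomposition can be checked in a few lines: writing each quantized weight as a power-of-two combination of {-1, +1} bit planes makes the quantized dot product a weighted sum of binary dot products, each of which maps onto an XNOR-popcount kernel. The encoding below is one illustrative choice, not necessarily the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 2                                   # bits per quantized weight
b = rng.choice([-1, 1], size=(k, 8))    # binary branches b_i in {-1,+1}
a = rng.choice([-1, 1], size=8)         # binary activations

# Quantized weights w = sum_i 2^i * b_i take values in {-3, -1, +1, +3}.
w = sum((2 ** i) * b[i] for i in range(k))

# The quantized dot product splits into one binary dot product per branch.
lhs = w @ a
rhs = sum((2 ** i) * (b[i] @ a) for i in range(k))
assert lhs == rhs
print(lhs, rhs)
```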
- FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that the binarized convolution process becomes increasingly linear as it minimizes binarization error, which in turn hampers the BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to resolve that contradiction, leading to a strong baseline that achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention for being potentially orders of magnitude faster in inference, as well as more power-efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
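As background for the factor-2 claim, a ternary value in {-1, 0, +1} can be encoded as a (mask, sign) bit pair, after which the inner product of two ternary vectors reduces to bitwise AND/XNOR plus popcounts. The sketch below uses one such generic encoding; it is not necessarily FATNN's implementation-dependent algorithm.

```python
def ternary_dot(m1: int, s1: int, m2: int, s2: int, n: int) -> int:
    """Inner product of two length-n ternary {-1,0,+1} vectors.

    Each vector is encoded as a mask word (bit=1 where the value is
    nonzero) and a sign word (bit=1 where the value is +1).
    """
    full = (1 << n) - 1
    both = m1 & m2                        # both values nonzero
    agree = ~(s1 ^ s2) & full             # signs match
    pos = bin(both & agree).count("1")    # products equal to +1
    return 2 * pos - bin(both).count("1")

# v1 = [+1, 0, -1, +1], v2 = [-1, +1, -1, +1]  (bit 0 = first element)
m1, s1 = 0b1101, 0b1001
m2, s2 = 0b1111, 0b1010
print(ternary_dot(m1, s1, m2, s2, 4))  # -> 1, matching the real dot product
```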
- Distillation Guided Residual Learning for Binary Convolutional Neural Networks [83.6169936912264]
It is challenging to bridge the performance gap between a binary CNN (BCNN) and a floating-point CNN (FCNN).
We observe that this performance gap leads to substantial residuals between the intermediate feature maps of the BCNN and FCNN.
To minimize the performance gap, we enforce the BCNN to produce intermediate feature maps similar to those of the FCNN.
This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from the FCNN, leads to more effective optimization of the BCNN.
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
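A minimal sketch of a block-wise distillation loss of this kind, assuming paired intermediate feature maps from the binary student and the full-precision teacher (illustrative; the paper's exact loss formulation may differ):

```python
import torch
import torch.nn.functional as F

def blockwise_distillation_loss(student_feats, teacher_feats):
    """Sum of per-block MSE between BCNN and FCNN intermediate features."""
    return sum(
        F.mse_loss(s, t.detach())        # teacher features are fixed targets
        for s, t in zip(student_feats, teacher_feats)
    )

# Feature maps from three corresponding blocks of the two networks.
student = [torch.randn(1, 64, 8, 8, requires_grad=True) for _ in range(3)]
teacher = [torch.randn(1, 64, 8, 8) for _ in range(3)]
loss = blockwise_distillation_loss(student, teacher)
loss.backward()                          # pulls each binary block toward FCNN
print(loss.item())
```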