Dynamic Binary Neural Network by learning channel-wise thresholds
- URL: http://arxiv.org/abs/2110.05185v1
- Date: Fri, 8 Oct 2021 17:41:36 GMT
- Title: Dynamic Binary Neural Network by learning channel-wise thresholds
- Authors: Jiehua Zhang, Zhuo Su, Yanghe Feng, Xin Lu, Matti Pietikäinen, Li Liu
- Abstract summary: We propose a dynamic BNN (DyBNN) incorporating dynamic learnable channel-wise thresholds of Sign function and shift parameters of PReLU.
DyBNN built on two ReActNet backbones (MobileNetV1 and ResNet18) achieves 71.2% and 67.4% top-1 accuracy on the ImageNet dataset.
- Score: 9.432747511001246
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary neural networks (BNNs) constrain weights and activations to +1 or -1,
which limits storage and computational cost and makes them hardware-friendly for
portable devices. Recently, BNNs have achieved remarkable progress and have been
adopted in various fields. However, the performance of BNNs is sensitive to the
activation distribution. Existing BNNs use the Sign function with
predefined or learned static thresholds to binarize activations. This
limits the representation capacity of BNNs, since different samples may suit
different thresholds. To address this problem, we propose a dynamic BNN (DyBNN)
incorporating dynamic learnable channel-wise thresholds for the Sign function and
shift parameters for PReLU. The method aggregates global information into
a hyper function and effectively increases the feature expression ability.
The experimental results show that our method is an effective and
straightforward way to reduce information loss and enhance the performance of BNNs.
DyBNN built on two ReActNet backbones (MobileNetV1 and ResNet18) achieves
71.2% and 67.4% top-1 accuracy on the ImageNet dataset, outperforming the baselines by a
large margin (1.8% and 1.5%, respectively).
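The abstract describes the mechanism only at a high level: global information is aggregated into a hyper function that produces channel-wise thresholds for the Sign function. The pure-Python sketch below shows one way this could look. It is not the authors' implementation. It assumes global average pooling as the aggregation step, a two-layer ReLU network as the hyper function, and the convention sign(0) = +1; none of these details are stated in the abstract. All names (`global_average_pool`, `dynamic_thresholds`, `dynamic_sign`, `w1`, `b1`, `w2`, `b2`) are hypothetical.

```python
def global_average_pool(x):
    """Average each channel of x (a C x H x W nested list) to one float."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def linear(v, weight, bias):
    """Dense layer: weight is an (out x in) nested list, bias has length out."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weight, bias)]


def dynamic_thresholds(x, w1, b1, w2, b2):
    """Assumed hyper function: pooled statistics -> ReLU MLP -> one threshold per channel."""
    pooled = global_average_pool(x)
    hidden = [max(0.0, h) for h in linear(pooled, w1, b1)]  # ReLU
    return linear(hidden, w2, b2)


def dynamic_sign(x, thresholds):
    """Binarize each channel against its own threshold (values >= threshold map to +1)."""
    return [
        [[1 if v >= t else -1 for v in row] for row in ch]
        for ch, t in zip(x, thresholds)
    ]


if __name__ == "__main__":
    # Toy input: 2 channels of size 2 x 2.
    x = [[[0.0, 2.0], [4.0, 6.0]], [[-1.0, -1.0], [1.0, 1.0]]]
    # Identity weights and zero biases make the thresholds equal the channel means.
    eye = [[1.0, 0.0], [0.0, 1.0]]
    zero = [0.0, 0.0]
    thresholds = dynamic_thresholds(x, eye, zero, eye, zero)
    print(thresholds)
    print(dynamic_sign(x, thresholds))
```

With these toy weights the thresholds are simply the per-channel means, so adding a constant offset to every activation shifts the thresholds by the same amount and leaves the binary pattern unchanged. A static threshold cannot adapt to the input in this way, which is the limitation the abstract attributes to existing BNNs.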
Related papers
- NAS-BNN: Neural Architecture Search for Binary Neural Networks [55.058512316210056]
We propose a novel neural architecture search scheme for binary neural networks, named NAS-BNN.
Our discovered binary model family outperforms previous BNNs for a wide range of operations (OPs) from 20M to 200M.
In addition, we validate the transferability of these searched BNNs on the object detection task, and our binary detectors with the searched BNNs achieve a novel state-of-the-art result, e.g., 31.6% mAP with 370M OPs, on the MS COCO dataset.
arXiv Detail & Related papers (2024-08-28T02:17:58Z)
- Boosting Binary Neural Networks via Dynamic Thresholds Learning [21.835748440099586]
We introduce DySign to reduce information loss and boost representative capacity of BNNs.
For DCNNs, DyBCNNs based on two backbones achieve 71.2% and 67.4% top-1 accuracy on the ImageNet dataset.
For ViTs, DyCCT demonstrates the superiority of the convolutional embedding layer in fully binarized ViTs and achieves 56.1% on the ImageNet dataset.
arXiv Detail & Related papers (2022-11-04T07:18:21Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- INSTA-BNN: Binary Neural Network with INSTAnce-aware Threshold [16.890849856271185]
We propose a novel BNN design called Binary Neural Network with INSTAnce-aware threshold (INSTA-BNN).
INSTA-BNN controls the quantization threshold dynamically in an input-dependent or instance-aware manner.
Our study shows that INSTA-BNN outperforms the baseline by 3.0% and 2.8% on the ImageNet classification task with comparable computing cost.
arXiv Detail & Related papers (2022-04-15T12:30:02Z)
- Elastic-Link for Binarized Neural Network [9.83865304744923]
The "Elastic-Link" (EL) module enriches information flow within a BNN by adaptively adding real-valued input features to the subsequent convolutional output features.
EL produces a significant improvement on the challenging large-scale ImageNet dataset.
With the integration of ReActNet, it yields a new state-of-the-art result of 71.9% top-1 accuracy.
arXiv Detail & Related papers (2021-12-19T13:49:29Z)
- Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and the hardware deployment on FPGA validate the great potentials of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- "BNN - BN = ?": Training Binary Neural Networks without Batch Normalization [92.23297927690149]
Batch normalization (BN) is a key facilitator and considered essential for state-of-the-art binary neural networks (BNNs).
We extend their framework to training BNNs, and for the first time demonstrate that BNs can be completely removed from BNN training and inference regimes.
arXiv Detail & Related papers (2021-04-16T16:46:57Z)
- Self-Distribution Binary Neural Networks [18.69165083747967]
We study binary neural networks (BNNs) in which both the weights and activations are binary (i.e., 1-bit representation).
We propose the Self-Distribution Binary Neural Network (SD-BNN).
Experiments on CIFAR-10 and ImageNet datasets show that the proposed SD-BNN consistently outperforms the state-of-the-art (SOTA) BNNs.
arXiv Detail & Related papers (2021-03-03T13:39:52Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued ones on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that the binarized convolution process exhibits increasing linearity toward the target of minimizing such error, which in turn hampers the BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to fix that contradiction, leading to a strong baseline which achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z)