Network Binarization via Contrastive Learning
- URL: http://arxiv.org/abs/2207.02970v1
- Date: Wed, 6 Jul 2022 21:04:53 GMT
- Title: Network Binarization via Contrastive Learning
- Authors: Yuzhang Shang, Dan Xu, Ziliang Zong, Yan Yan
- Abstract summary: We establish a novel contrastive learning framework for training Binary Neural Networks (BNNs).
MI is introduced as the metric to measure the information shared between binary and FP activations.
Results show that our method can be implemented as a pile-up module on existing state-of-the-art binarization methods.
- Score: 16.274341164897827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network binarization accelerates deep models by quantizing their
weights and activations into 1-bit. However, there is still a huge performance
gap between Binary Neural Networks (BNNs) and their full-precision (FP)
counterparts. Since the quantization error caused by weight binarization has been
reduced by earlier works, activation binarization has become the major obstacle to
further accuracy improvements. BNNs exhibit a unique and interesting structure, in
which the binary and latent FP activations coexist in the same forward pass
(\textit{i.e.} $\text{Binarize}(\mathbf{a}_F) = \mathbf{a}_B$). To mitigate the
information degradation caused by the
binarization operation from FP to binary activations, we establish a novel
contrastive learning framework while training BNNs through the lens of Mutual
Information (MI) maximization. MI is introduced as the metric to measure the
information shared between binary and FP activations, and maximizing it with
contrastive learning assists binarization. Specifically, the representation
ability of BNNs is greatly strengthened by pulling together positive pairs of
binary and FP activations from the same input sample and pushing apart negative
pairs drawn from different samples (the number of negative pairs can be
exponentially large). This benefits downstream tasks, not only classification
but also segmentation, depth estimation,~\textit{etc}. The
experimental results show that our method can be implemented as a pile-up
module on existing state-of-the-art binarization methods and can remarkably
improve the performance over them on CIFAR-10/100 and ImageNet, in addition to
the great generalization ability on NYUD-v2.
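The abstract's forward-pass relation $\text{Binarize}(\mathbf{a}_F) = \mathbf{a}_B$ is the standard BNN activation-binarization step. Below is a minimal PyTorch-style sketch of that step using a sign function with a clipped straight-through estimator (STE); this reflects the common BNN formulation rather than the authors' released code, and names such as `BinarizeSTE` and `binarize` are illustrative.

```python
import torch


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a clipped straight-through estimator (STE)."""

    @staticmethod
    def forward(ctx, a_fp):
        ctx.save_for_backward(a_fp)
        # Map latent FP activations to {-1, +1}: Binarize(a_F) = a_B.
        return torch.where(a_fp >= 0, torch.ones_like(a_fp), -torch.ones_like(a_fp))

    @staticmethod
    def backward(ctx, grad_out):
        (a_fp,) = ctx.saved_tensors
        # Clipped STE: pass gradients through only where |a_F| <= 1.
        return grad_out * (a_fp.abs() <= 1).to(grad_out.dtype)


def binarize(a_fp: torch.Tensor) -> torch.Tensor:
    """Binarize latent full-precision activations."""
    return BinarizeSTE.apply(a_fp)
```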
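The contrastive objective described in the abstract (positive pairs of binary and FP activations from the same sample, negatives from different samples) can be approximated with an InfoNCE-style loss over a batch. The sketch below is a generic InfoNCE implementation under that assumption, not the paper's exact objective; the function name `contrastive_bnn_loss`, the temperature value, and the flattening of activations are all illustrative choices.

```python
import torch
import torch.nn.functional as F


def contrastive_bnn_loss(a_fp: torch.Tensor, a_bin: torch.Tensor,
                         temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss between FP and binary activations of a batch.

    a_fp  : (N, ...) latent full-precision activations, one row per sample.
    a_bin : (N, ...) corresponding binary activations, Binarize(a_F) = a_B.

    The i-th FP/binary pair is the positive; the other N - 1 binary rows in
    the batch act as negatives, so minimizing this loss maximizes a lower
    bound on the mutual information between the two views.
    """
    z_fp = F.normalize(a_fp.flatten(1), dim=1)    # (N, D)
    z_bin = F.normalize(a_bin.flatten(1), dim=1)  # (N, D)

    # Cosine-similarity logits between every FP row and every binary row.
    logits = z_fp @ z_bin.t() / temperature       # (N, N)
    targets = torch.arange(logits.size(0), device=logits.device)

    # Diagonal entries (same input sample) are the positive pairs.
    return F.cross_entropy(logits, targets)
```

In practice such a term would be added to the task loss with a weighting coefficient, and the activations would typically be passed through a small projection head first; those details are assumptions here rather than the paper's documented setup.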
Related papers
- BiPer: Binary Neural Networks using a Periodic Function [17.461853355858022]
Quantized neural networks employ reduced precision representations for both weights and activations.
Binary Neural Networks (BNNs) are the extreme quantization case, representing values with just one bit.
In contrast to current BNN approaches, we propose to employ a binary periodic (BiPer) function during binarization.
arXiv Detail & Related papers (2024-04-01T17:52:17Z)
- Basic Binary Convolution Unit for Binarized Image Restoration Network [146.0988597062618]
In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for image restoration tasks.
Based on our findings and analyses, we design a simple yet efficient basic binary convolution unit (BBCU)
Our BBCU significantly outperforms other BNNs and lightweight models, which shows that BBCU can serve as a basic unit for binarized IR networks.
arXiv Detail & Related papers (2022-10-02T01:54:40Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies [52.691032025163175]
Existing Binary Neural Networks (BNNs) operate mainly on local convolutions with a binarization function.
We present new designs of binary neural modules, which outperform existing binary neural modules by a large margin.
arXiv Detail & Related papers (2022-09-03T11:51:04Z)
- Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization.
arXiv Detail & Related papers (2022-04-05T06:07:05Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy DIR-Net on real-world resource-limited devices, where it achieves 11.1 times storage saving and 5.4 times speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
- Self-Distribution Binary Neural Networks [18.69165083747967]
We study binary neural networks (BNNs), in which both the weights and activations are binary (i.e., 1-bit representations).
We propose the Self-Distribution Binary Neural Network (SD-BNN).
Experiments on CIFAR-10 and ImageNet datasets show that the proposed SD-BNN consistently outperforms the state-of-the-art (SOTA) BNNs.
arXiv Detail & Related papers (2021-03-03T13:39:52Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that the binarized convolution process becomes increasingly linear as it is driven to minimize the binarization error, which in turn hampers the BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to fix that contradiction, leading to a strong baseline which achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z)
- Training Binary Neural Networks through Learning with Noisy Supervision [76.26677550127656]
This paper formalizes the binarization operations over neural networks from a learning perspective.
Experimental results on benchmark datasets indicate that the proposed binarization technique attains consistent improvements over baselines.
arXiv Detail & Related papers (2020-10-10T01:59:39Z)