S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural
Networks via Guided Distribution Calibration
- URL: http://arxiv.org/abs/2102.08946v1
- Date: Wed, 17 Feb 2021 18:59:28 GMT
- Title: S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural
Networks via Guided Distribution Calibration
- Authors: Zhiqiang Shen and Zechun Liu and Jie Qin and Lei Huang and Kwang-Ting
Cheng and Marios Savvides
- Abstract summary: We present a novel guided learning paradigm that distills real-valued networks into binary networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.5~15% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous studies have predominantly targeted self-supervised learning
on real-valued networks and achieved many promising results. However, on the
more challenging binary neural networks (BNNs), this task has not yet been
fully explored in the community. In this paper, we focus on this more difficult
scenario: learning networks in which both weights and activations are binary,
without any human-annotated labels. We observe that the commonly used
contrastive objective does not yield competitive accuracy on BNNs, since the
backbone network has relatively limited capacity and representation ability.
Hence, instead of directly applying existing self-supervised methods, which
causes a severe decline in performance, we present a novel guided learning
paradigm that distills real-valued networks into binary networks on the final
prediction distribution, minimizing the loss and obtaining desirable accuracy.
Our proposed method can boost the simple contrastive learning baseline by an
absolute gain of 5.5~15% on BNNs. We further reveal that it is difficult for
BNNs to recover predictive distributions similar to those of real-valued models
when trained without labels; thus, how to calibrate them is key to addressing
the performance degradation. Extensive experiments are conducted on the
large-scale ImageNet and downstream datasets. Our method achieves a substantial
improvement over the simple contrastive learning baseline and is even
comparable to many mainstream supervised BNN methods. Code will be made
available.
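
As a concrete illustration of distilling on the final prediction distribution, here is a minimal PyTorch sketch. The KL objective, the temperature `tau`, and the tensor shapes are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def distribution_calibration_loss(student_logits, teacher_logits, tau=0.2):
    """KL divergence from the real-valued teacher's prediction distribution
    to the 1-bit student's. A sketch of distilling on the final prediction
    distribution; `tau` (a softmax temperature) is an assumed hyperparameter."""
    p_teacher = F.softmax(teacher_logits / tau, dim=-1)
    log_p_student = F.log_softmax(student_logits / tau, dim=-1)
    # The teacher is frozen, so only the binary student receives gradients.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")

# Usage sketch with stand-in tensors (teacher: a pretrained self-supervised
# real-valued network; student: the 1-bit network being trained).
teacher_logits = torch.randn(8, 128)
student_logits = torch.randn(8, 128, requires_grad=True)
loss = distribution_calibration_loss(student_logits, teacher_logits.detach())
loss.backward()
```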
Related papers
- Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models (2023-12-23)
Deep Neural Networks (DNNs) are powerful tools for various computer vision tasks, yet they often struggle with reliable uncertainty quantification.
We introduce the Adaptable Bayesian Neural Network (ABNN), a simple and scalable strategy to seamlessly transform DNNs into BNNs.
We conduct extensive experiments across multiple datasets for image classification and semantic segmentation tasks, and our results demonstrate that ABNN achieves state-of-the-art performance.
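
The snippet does not spell out ABNN's transformation, so the sketch below shows a generic way to Bayesianize a pretrained deterministic model: draw weight samples from a Gaussian centered on the pretrained weights. The recipe and the scale `sigma` are assumptions for illustration, not ABNN's actual procedure.

```python
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def sample_bayesian_copy(model: nn.Module, sigma: float = 0.01) -> nn.Module:
    """Generic illustration (not ABNN's actual recipe): treat the pretrained
    weights as a posterior mean and draw one Gaussian weight sample."""
    sampled = copy.deepcopy(model)
    for p in sampled.parameters():
        p.add_(torch.randn_like(p) * sigma)
    return sampled

# Predictive uncertainty from several sampled forward passes.
model = nn.Linear(4, 2)
x = torch.randn(1, 4)
preds = torch.stack([sample_bayesian_copy(model)(x) for _ in range(10)])
print(preds.mean(0), preds.std(0))  # predictive mean and spread
```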
- Recurrent Bilinear Optimization for Binary Neural Networks (2022-09-04)
Existing BNN methods neglect the intrinsic bilinear relationship between real-valued weights and their scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
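
To make the bilinear coupling concrete: a scaled BNN approximates its weights as alpha * sign(w). The sketch below computes the classic XNOR-Net closed-form scales (per-filter scaling is an assumption); RBONN's point is to optimize w and alpha jointly rather than fixing one while solving for the other.

```python
import torch

def binarize_with_scales(w):
    """Per-filter binarization w ~ alpha * sign(w). The closed-form
    alpha = mean(|w|) is the classic XNOR-Net choice; RBONN instead
    optimizes w and alpha jointly, since fixing one while solving for
    the other ignores their bilinear coupling."""
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)  # one scale per filter
    return alpha * torch.sign(w)

w = torch.randn(16, 8, 3, 3)        # a real-valued conv kernel
w_bin = binarize_with_scales(w)
print((w - w_bin).pow(2).mean())    # residual binarization error
```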
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks (2022-04-01)
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs), a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
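
For context, the sketch below shows standard interval bound propagation through one affine layer, the kind of baseline the paper compares against; the shapes and the perturbation radius are illustrative assumptions.

```python
import torch

def ibp_linear(x_lo, x_hi, weight, bias):
    """Propagate an input interval [x_lo, x_hi] through y = x W^T + b.
    Standard interval arithmetic: split W into positive and negative parts
    so each output bound pairs with the worst-case input bound."""
    w_pos, w_neg = weight.clamp(min=0), weight.clamp(max=0)
    y_lo = x_lo @ w_pos.T + x_hi @ w_neg.T + bias
    y_hi = x_hi @ w_pos.T + x_lo @ w_neg.T + bias
    return y_lo, y_hi

x, eps = torch.randn(1, 4), 0.1       # eps: L-inf perturbation radius
W, b = torch.randn(3, 4), torch.randn(3)
lo, hi = ibp_linear(x - eps, x + eps, W, b)
lo, hi = lo.relu(), hi.relu()         # ReLU is monotone: apply elementwise
assert (lo <= hi).all()
```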
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation (2021-12-15)
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
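
Below is a minimal sketch of the self-ensembling (mean-teacher style) update such a teacher network typically uses; the decay value is an assumption, and SE-GAN's exact update rule may differ.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, decay: float = 0.999):
    """Self-ensembling: the teacher is an exponential moving average of the
    student's weights, so its outputs act as a temporal ensemble."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)

teacher, student = nn.Linear(4, 4), nn.Linear(4, 4)
ema_update(teacher, student)  # called once per training step
```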
- Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer (2021-12-12)
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach greatly reduces the training time and the number of parameters, which helps scale BNNs efficiently.
- How Do Adam and Training Strategies Help BNNs Optimization? (2021-06-21)
We show that Adam is better equipped to handle the rugged loss surface of BNNs and reaches a better optimum with higher generalization ability.
We derive a simple training scheme, building on existing Adam-based optimization, which achieves 70.5% top-1 accuracy on the ImageNet dataset.
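
For background on why the loss surface is rugged: 1-bit networks are usually trained with a sign() forward pass and a straight-through estimator backward, as in the sketch below (a generic recipe, not this paper's exact scheme).

```python
import torch

class BinarySign(torch.autograd.Function):
    """sign() forward with a straight-through estimator backward, the
    standard trick that makes 1-bit networks trainable by gradient descent."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # clipped-identity gradient

w = torch.randn(10, requires_grad=True)
opt = torch.optim.Adam([w], lr=1e-3)  # Adam's per-parameter scaling suits the
                                      # rugged, piecewise-flat BNN loss surface
loss = (BinarySign.apply(w) * torch.randn(10)).sum()
loss.backward()
opt.step()
```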
- Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification (2020-12-04)
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
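
A compact sketch of the underlying idea, compressing a layer's weights into a low-dimensional latent posterior with a VAE, follows; the dimensions and architecture are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LayerWeightVAE(nn.Module):
    """Sketch of the LP-BNN idea: encode a layer's weight rows into a
    low-dimensional latent posterior via a VAE (assumed dimensions)."""
    def __init__(self, w_dim=64, z_dim=8):
        super().__init__()
        self.enc_mu = nn.Linear(w_dim, z_dim)
        self.enc_logvar = nn.Linear(w_dim, z_dim)
        self.dec = nn.Linear(z_dim, w_dim)

    def forward(self, w):
        mu, logvar = self.enc_mu(w), self.enc_logvar(w)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        w_hat = self.dec(z)
        rec = (w_hat - w).pow(2).mean()                        # reconstruction
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()  # KL to N(0, I)
        return w_hat, rec + kl

w = torch.randn(16, 64)          # e.g. 16 weight rows of one layer
w_hat, elbo_loss = LayerWeightVAE()(w)
```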
- FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond (2020-10-19)
We show that the binarized convolution process becomes increasingly linear as it minimizes the binarization error, which in turn hampers BNNs' discriminative ability.
We re-investigate and tune appropriate non-linear modules to resolve this contradiction, leading to a strong baseline that achieves state-of-the-art performance, as in the sketch below.
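
The sketch is an illustrative 1-bit block with an explicit learnable non-linearity; PReLU stands in for the modules FTBNN actually tunes, and the layer layout is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConvBlock(nn.Module):
    """Illustrative 1-bit conv block: an explicit learnable non-linearity
    (PReLU, standing in for the modules FTBNN tunes) counteracts the
    creeping linearity of the binarized convolution."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.PReLU(c_out)            # the non-linear module under study

    def forward(self, x):
        w_bin = torch.sign(self.conv.weight)  # 1-bit weights (no STE; sketch only)
        a_bin = torch.sign(x)                 # 1-bit activations
        return self.act(self.bn(F.conv2d(a_bin, w_bin, padding=1)))

y = BinaryConvBlock(8, 16)(torch.randn(2, 8, 32, 32))
```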
- Efficient Exact Verification of Binarized Neural Networks (2020-05-07)
Binarized Neural Networks (BNNs) provide comparable robustness and allow exact and significantly more efficient verification.
We present a new system, EEV, for efficient and exact verification of BNNs.
We demonstrate the effectiveness of EEV by presenting the first exact verification results for L-inf-bounded adversarial robustness of nontrivial convolutional BNNs.
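
To illustrate what "exact" means here, the sketch below brute-forces a robustness check on a toy network with +/-1 inputs. EEV itself achieves this guarantee at scale via SAT solving; the enumeration, the flip-based perturbation model, and all names are assumptions for illustration only.

```python
import itertools
import torch
import torch.nn as nn

def exact_robustness_check(model, x, num_flips=1):
    """Brute-force exact verification for a tiny BNN with +/-1 inputs:
    enumerate every input within `num_flips` sign flips of x and confirm
    the prediction never changes. EEV reaches the same guarantee at scale
    via SAT solving; enumeration here only illustrates what 'exact' means."""
    base = model(x).argmax()
    for idx in itertools.combinations(range(x.numel()), num_flips):
        x_adv = x.clone()
        x_adv[list(idx)] *= -1
        if model(x_adv).argmax() != base:
            return False                      # concrete counterexample found
    return True                               # verified exactly for this ball

model = nn.Sequential(nn.Linear(8, 2))
x = torch.sign(torch.randn(8))
print(exact_robustness_check(model, x))
```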
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences arising from its use.