Understanding Learning Dynamics of Binary Neural Networks via
Information Bottleneck
- URL: http://arxiv.org/abs/2006.07522v1
- Date: Sat, 13 Jun 2020 00:39:25 GMT
- Title: Understanding Learning Dynamics of Binary Neural Networks via
Information Bottleneck
- Authors: Vishnu Raj, Nancy Nayak and Sheetal Kalyani
- Abstract summary: Binary Neural Networks (BNNs) take compactification to the extreme by constraining both weights and activations to two levels, $\{+1, -1\}$.
We analyze BNN training through the Information Bottleneck principle and observe that the training dynamics of BNNs are considerably different from those of Deep Neural Networks (DNNs).
Since BNNs have less expressive capacity, they tend to find efficient hidden representations concurrently with label fitting.
- Score: 11.17667928756077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compact neural networks are essential for affordable and power efficient deep
learning solutions. Binary Neural Networks (BNNs) take compactification to the
extreme by constraining both weights and activations to two levels, $\{+1,
-1\}$. However, training BNNs is not easy due to the discontinuity in
activation functions, and the training dynamics of BNNs are not well understood.
In this paper, we present an information-theoretic perspective of BNN training.
We analyze BNNs through the Information Bottleneck principle and observe that
the training dynamics of BNNs are considerably different from those of Deep
Neural Networks (DNNs). While DNNs exhibit separate empirical risk minimization
and representation compression phases, our numerical experiments show that in
BNNs both phases occur simultaneously. Since BNNs have less expressive
capacity, they tend to find efficient hidden representations concurrently with
label fitting. Experiments on multiple datasets support these observations, and
we observe consistent behavior across different activation functions in BNNs.
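
To make the information-plane analysis concrete, below is a minimal sketch (not the authors' code) of the binning-based mutual information estimator commonly used for analyses of this kind, following Shwartz-Ziv and Tishby; the bin count, layer shapes, and example data are assumptions.

```python
# Hypothetical helpers for information-plane analysis; the bin count is an
# assumption. For a BNN layer, activations are already in {-1, +1}, so the
# discretization step is exact rather than an approximation.
import numpy as np

def discretize(activations, n_bins=30):
    """Quantize real-valued activations into equal-width bins."""
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    return np.digitize(activations, edges[1:-1])

def mutual_information(t, y):
    """Empirical I(T; Y): treat each distinct row of t as one symbol."""
    n = len(y)
    p_t, p_y, p_ty = {}, {}, {}
    for row, label in zip(t, y):
        s = row.tobytes()
        p_t[s] = p_t.get(s, 0.0) + 1.0 / n
        p_y[label] = p_y.get(label, 0.0) + 1.0 / n
        p_ty[(s, label)] = p_ty.get((s, label), 0.0) + 1.0 / n
    return sum(p * np.log2(p / (p_t[s] * p_y[label]))
               for (s, label), p in p_ty.items())

# Example: I(T; Y) for random binary "activations" and binary labels.
rng = np.random.default_rng(0)
t = np.sign(rng.standard_normal((1000, 4)))
y = rng.integers(0, 2, size=1000)
print(mutual_information(discretize(t), y))
```

Logging I(X;T) and I(T;Y) for each layer at every epoch and plotting one against the other traces the information-plane trajectory (for deterministic networks, I(X;T) is typically estimated as the entropy of the binned representations); per the paper's observation, a BNN's fitting and compression movements coincide rather than forming two distinct phases.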
Related papers
- NAS-BNN: Neural Architecture Search for Binary Neural Networks [55.058512316210056]
We propose a novel neural architecture search scheme for binary neural networks, named NAS-BNN.
Our discovered binary model family outperforms previous BNNs for a wide range of operations (OPs) from 20M to 200M.
In addition, we validate the transferability of these searched BNNs on the object detection task, and our binary detectors with the searched BNNs achieve a new state-of-the-art result, e.g., 31.6% mAP with 370M OPs, on the MS COCO dataset.
arXiv Detail & Related papers (2024-08-28T02:17:58Z)
- Spiking Convolutional Neural Networks for Text Classification [15.10637945787922]
Spiking neural networks (SNNs) offer a promising pathway to implement deep neural networks (DNNs) in a more energy-efficient manner.
This work presents a "conversion + fine-tuning" two-step method for training SNNs for text classification and proposes a simple but effective way to encode pre-trained word embeddings as spike trains.
arXiv Detail & Related papers (2024-06-27T14:54:27Z)
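
As a concrete illustration of the encoding step in the spiking-CNN summary above, here is a hedged sketch of simple Bernoulli rate coding; the paper's exact scheme may differ, and the embedding size and timestep count are assumptions.

```python
# Hypothetical rate coding of a word embedding; not the paper's exact scheme.
import numpy as np

def embedding_to_spike_train(embedding, n_steps=50, rng=None):
    """Encode a real-valued word embedding as a {0, 1} spike train."""
    rng = rng or np.random.default_rng(0)
    # Normalize each value into [0, 1] so it can act as a firing rate.
    lo, hi = embedding.min(), embedding.max()
    rates = (embedding - lo) / (hi - lo + 1e-12)
    # One Bernoulli draw per (timestep, dimension).
    return (rng.random((n_steps, embedding.size)) < rates).astype(np.uint8)

embedding = np.random.default_rng(1).normal(size=300)  # e.g. a 300-d vector
spikes = embedding_to_spike_train(embedding)
print(spikes.shape)  # (50, 300): one spike pattern per timestep
```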
- Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models [40.38541033389344]
Deep Neural Networks (DNNs) are powerful tools for various computer vision tasks, yet they often struggle with reliable uncertainty quantification.
We introduce the Adaptable Bayesian Neural Network (ABNN), a simple and scalable strategy to seamlessly transform DNNs into BNNs.
We conduct extensive experiments across multiple datasets for image classification and semantic segmentation tasks, and our results demonstrate that ABNN achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-12-23T16:39:24Z)
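
The ABNN summary above does not spell out how the transformation works, so the following is only a generic sketch of the deterministic-to-Bayesian idea (a Gaussian posterior wrapped around pre-trained weights, sampled at inference); it should not be read as ABNN's actual procedure.

```python
# Generic sketch, NOT ABNN's method: wrap a pre-trained layer's weights in
# a Gaussian posterior and average predictions over a few weight samples.
import math
import torch
import torch.nn as nn

class BayesianLinear(nn.Module):
    """Wrap a pre-trained nn.Linear in a Gaussian weight posterior."""
    def __init__(self, pretrained: nn.Linear, sigma: float = 0.05):
        super().__init__()
        self.mu = nn.Parameter(pretrained.weight.detach().clone())
        self.log_sigma = nn.Parameter(
            torch.full_like(self.mu, math.log(sigma)))
        self.bias = nn.Parameter(pretrained.bias.detach().clone())

    def forward(self, x):
        # Reparameterized sample: w = mu + sigma * eps.
        w = self.mu + self.log_sigma.exp() * torch.randn_like(self.mu)
        return x @ w.t() + self.bias

layer = BayesianLinear(nn.Linear(16, 4))
x = torch.randn(2, 16)
# Uncertainty estimate: average predictions over several weight samples.
mean_pred = torch.stack([layer(x) for _ in range(8)]).mean(0)
```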
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
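
The bilinear relationship in the RBONN summary above refers to pairing each real-valued filter with a scale factor. Below is a minimal sketch of the standard XNOR-Net-style decomposition w ≈ α · sign(w) that this line of work builds on; RBONN's recurrent bilinear optimization itself is not reproduced here.

```python
# Standard scaled binarization: each real-valued filter w is approximated
# as alpha * sign(w), where the channel-wise scale alpha minimizes
# ||w - alpha * sign(w)||^2 in closed form (mean absolute value).
import torch

def binarize_with_scales(weight: torch.Tensor):
    """weight: (out_channels, in_channels, kH, kW) real-valued filters."""
    signs = weight.sign()
    # Closed-form optimal scale per output channel: mean absolute value.
    alpha = weight.abs().mean(dim=(1, 2, 3), keepdim=True)
    return alpha * signs, alpha

w = torch.randn(16, 8, 3, 3)
w_bin, alpha = binarize_with_scales(w)
print(torch.norm(w - w_bin) <= torch.norm(w - w.sign()))  # scaling reduces error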
- Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
- Robustness of Bayesian Neural Networks to White-Box Adversarial Attacks [55.531896312724555]
Bayesian Neural Networks (BNNs) are robust and adept at handling adversarial attacks by incorporating randomness.
We create our BNN model, called BNN-DenseNet, by fusing Bayesian inference (i.e., variational Bayes) into the DenseNet architecture.
An adversarially-trained BNN outperforms its non-Bayesian, adversarially-trained counterpart in most experiments.
arXiv Detail & Related papers (2021-11-16T16:14:44Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
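
The fine-grained kernel-space quantization in the Sub-bit Neural Networks summary above can be pictured as restricting each 3x3 binary kernel to a small codebook, so it is stored as a short index instead of 9 bits. The sketch below uses a random codebook purely for illustration; in the paper the codebook is learned with a kernel-aware optimization framework.

```python
# Illustrative only: the codebook here is random, not learned as in the paper.
import torch

def assign_to_codebook(bin_kernels, codebook):
    """Map each {-1, +1} kernel to the index of its nearest codeword."""
    flat = bin_kernels.flatten(1).float()          # (N, 9)
    codes = codebook.flatten(1).float()            # (K, 9)
    # Agreement count = 9 - Hamming distance; pick the closest codeword.
    agreement = flat @ codes.t()                   # (N, K)
    return agreement.argmax(dim=1)

kernels = torch.randint(0, 2, (64, 3, 3)) * 2 - 1   # 64 binary kernels
codebook = torch.randint(0, 2, (16, 3, 3)) * 2 - 1  # 16 codewords -> 4 bits
idx = assign_to_codebook(kernels, codebook)
print(idx.shape)  # each kernel now stored as a 4-bit index
```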
- Dynamic Binary Neural Network by learning channel-wise thresholds [9.432747511001246]
We propose a dynamic BNN (DyBNN) that learns dynamic channel-wise thresholds for the Sign function and shift parameters for the PReLU.
DyBNN, built on two ReActNet backbones (MobileNetV1 and ResNet18), achieves 71.2% and 67.4% top-1 accuracy on the ImageNet dataset, respectively.
arXiv Detail & Related papers (2021-10-08T17:41:36Z)
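
A channel-wise threshold for the Sign function, as in the DyBNN summary above, can be sketched as follows; the dynamic (input-conditioned) part of DyBNN is omitted, and a straight-through estimator is assumed so the threshold stays trainable.

```python
# Sketch of a Sign activation with learnable channel-wise thresholds and a
# straight-through estimator (STE); DyBNN's dynamic mechanism is omitted.
import torch
import torch.nn as nn

class ThresholdSign(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.threshold = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        shifted = x - self.threshold          # per-channel learnable shift
        binary = torch.where(shifted >= 0,
                             torch.ones_like(shifted),
                             -torch.ones_like(shifted))
        # STE: forward pass uses the binary value, backward pass treats
        # the step as the identity on `shifted`.
        return (binary - shifted).detach() + shifted

act = ThresholdSign(channels=8)
y = act(torch.randn(2, 8, 4, 4))
print(y.unique())  # tensor([-1., 1.])
```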
- BCNN: Binary Complex Neural Network [16.82755328827758]
Binarized neural networks, or BNNs, show great promise in edge-side applications with resource-limited hardware.
We introduce complex representations into BNNs and propose the Binary Complex Neural Network (BCNN).
BCNN improves BNN by strengthening its learning capability through complex representation and extending its applicability to complex-valued input data.
arXiv Detail & Related papers (2021-03-28T03:35:24Z)
- BDD4BNN: A BDD-based Quantitative Analysis Framework for Binarized Neural Networks [7.844146033635129]
We study verification problems for Binarized Neural Networks (BNNs), the 1-bit quantization of general real-numbered neural networks.
Our approach is to encode BNNs into Binary Decision Diagrams (BDDs), which is done by exploiting the internal structure of the BNNs.
Based on this encoding, we develop a quantitative verification framework in which precise and comprehensive analyses of BNNs can be performed.
arXiv Detail & Related papers (2021-03-12T12:02:41Z)
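
Because every BNN input and activation is binary, quantitative questions like those in the BDD4BNN summary above reduce to exact model counting, which BDDs make tractable at scale. The toy sketch below conveys the idea by brute-force enumeration instead of BDDs, counting inputs of a tiny random BNN that are robust to any single bit flip; the network and the property are illustrative assumptions.

```python
# Brute-force stand-in for the paper's BDD encoding: exhaustively count
# inputs of a toy binary network that keep their label under one bit flip.
import itertools
import numpy as np

rng = np.random.RandomState(0)
W1 = np.sign(rng.randn(3, 3))        # toy binary weights: input -> hidden
W2 = np.sign(rng.randn(3))           # hidden -> scalar output

def bnn(x):
    h = np.sign(W1.T @ x)            # hidden activations in {-1, +1}
    return int(np.sign(W2 @ h))      # output in {-1, +1}

robust = 0
for bits in itertools.product([-1, 1], repeat=3):
    x = np.array(bits)
    # All inputs at Hamming distance 1 from x.
    neighbors = [x * np.where(np.arange(3) == i, -1, 1) for i in range(3)]
    robust += all(bnn(n) == bnn(x) for n in neighbors)
print(f"{robust}/8 inputs keep their label under any single bit flip")
```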
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks using the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
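
The guided distribution calibration in the S2-BNN summary above amounts to matching prediction distributions. Below is a minimal sketch of such a distillation loss (the temperature parameter is an assumption; the paper's exact loss is not reproduced): the binary student matches the real-valued teacher's softmax output with a KL divergence.

```python
# Generic prediction-distribution distillation; not S2-BNN's exact loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on softened prediction distributions."""
    t = temperature
    teacher_prob = F.softmax(teacher_logits / t, dim=-1)
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean") * t * t

loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10))
print(loss.item() >= 0)  # KL divergence is non-negative
```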
This list is automatically generated from the titles and abstracts of the papers on this site.