Rotated Binary Neural Network
- URL: http://arxiv.org/abs/2009.13055v3
- Date: Thu, 22 Oct 2020 09:06:18 GMT
- Title: Rotated Binary Neural Network
- Authors: Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian
Wu, Feiyue Huang, Chia-Wen Lin
- Abstract summary: The Binary Neural Network (BNN) is highly effective at reducing the complexity of deep neural networks.
One of the major impediments is the large quantization error between the full-precision weight vector and its binary vector.
We introduce a Rotated Binary Neural Network (RBNN) which considers the angle alignment between the full-precision weight vector and its binarized version.
- Score: 138.89237044931937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Binary Neural Network (BNN) is highly effective at reducing the
complexity of deep neural networks. However, it suffers from severe performance
degradation.
One of the major impediments is the large quantization error between the
full-precision weight vector and its binary vector. Previous works focus on
compensating for the norm gap while leaving the angular bias hardly touched. In
this paper, for the first time, we explore the influence of angular bias on the
quantization error and then introduce a Rotated Binary Neural Network (RBNN),
which considers the angle alignment between the full-precision weight vector
and its binarized version. At the beginning of each training epoch, we propose
to rotate the full-precision weight vector to its binary vector to reduce the
angular bias. To avoid the high complexity of learning a large rotation matrix,
we further introduce a bi-rotation formulation that learns two smaller rotation
matrices. In the training stage, we devise an adjustable rotated weight vector
for binarization to escape the potential local optimum. Our rotation leads to
around 50% weight flips which maximize the information gain. Finally, we
propose a training-aware approximation of the sign function for gradient
backpropagation. Experiments on CIFAR-10 and ImageNet demonstrate the superiority
of RBNN over many state-of-the-art methods. Our source code, experimental settings,
training logs and binary models are available at
https://github.com/lmbxmu/RBNN.
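To make the pipeline above concrete, here is a minimal sketch of a rotation-aligned binarized layer in PyTorch. It is not the authors' implementation: the bi-rotation is stood in for by two orthogonal Procrustes alignments of the weight matrix's rows and columns, the adjustable-rotation step is omitted, and the training-aware sign gradient is approximated by the derivative of tanh(k x) with a sharpness k that the caller anneals; all class and method names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrainAwareSign(torch.autograd.Function):
    """sign() in the forward pass; the backward pass uses the gradient of
    tanh(k * x), an annealable (training-aware) approximation of sign."""

    @staticmethod
    def forward(ctx, x, k):
        ctx.save_for_backward(x)
        ctx.k = k
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        k = ctx.k
        # d/dx tanh(k x) = k * (1 - tanh(k x)^2)
        return grad_out * k * (1.0 - torch.tanh(k * x) ** 2), None


class RotatedBinaryLinear(nn.Module):
    """Binarized linear layer whose latent weights are rotated as
    W -> R1 @ W @ R2 before binarization, so the rotated weights are better
    angle-aligned with their binary code (two small rotations instead of one
    large one)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.register_buffer("R1", torch.eye(out_features))
        self.register_buffer("R2", torch.eye(in_features))
        self.k = 1.0  # sharpness of the sign surrogate, annealed by the caller

    @torch.no_grad()
    def update_rotation(self):
        """Re-align the latent weights with their binary code, e.g. at the
        start of each epoch (here via two orthogonal Procrustes solutions)."""
        W = self.weight
        B = torch.sign(W)
        U, _, Vt = torch.linalg.svd(B @ W.t())               # best R1 with R1 @ W ~ B
        self.R1.copy_(U @ Vt)
        U, _, Vt = torch.linalg.svd((self.R1 @ W).t() @ B)   # best R2 with (R1 W) R2 ~ B
        self.R2.copy_(U @ Vt)

    def forward(self, x):
        w_rot = self.R1 @ self.weight @ self.R2   # rotated latent weights
        alpha = w_rot.abs().mean()                # scale to compensate the norm gap
        w_bin = alpha * TrainAwareSign.apply(w_rot, self.k)
        return F.linear(x, w_bin)
```

In a training loop one would call update_rotation() on every such layer at the start of each epoch and gradually increase k so the tanh surrogate approaches the true sign; the schedule itself is an assumption, not the paper's exact recipe.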
Related papers
- Understanding Neural Network Binarization with Forward and Backward
Proximal Quantizers [26.27829662433536]
In neural network binarization, BinaryConnect (BC) and its variants are considered the standard.
We aim to shed light on these training tricks from an optimization perspective.
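As background for the discussion above, a minimal BinaryConnect-style layer (a generic PyTorch sketch, not the code of this paper) binarizes full-precision latent weights in the forward pass and routes gradients straight through to them:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryConnectLinear(nn.Module):
    """Minimal BinaryConnect-style layer: full-precision latent weights are
    binarized with sign() in the forward pass, while the straight-through
    estimator passes gradients back to the latent weights unchanged."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)

    def forward(self, x):
        w = self.weight
        # sign() has zero gradient almost everywhere; the detach() trick makes
        # the backward pass treat binarization as the identity (straight-through).
        w_bin = w + (torch.sign(w) - w).detach()
        return F.linear(x, w_bin)
```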
arXiv Detail & Related papers (2024-02-27T17:43:51Z)
- Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients [51.82488018573326]
We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
arXiv Detail & Related papers (2023-10-23T17:32:38Z)
- Resilient Binary Neural Network [26.63280603795981]
We introduce a Resilient Binary Neural Network (ReBNN) that mitigates frequent weight oscillation for more stable BNN training.
Our ReBNN achieves 66.9% Top-1 accuracy with ResNet-18 backbone on the ImageNet dataset.
arXiv Detail & Related papers (2023-02-02T08:51:07Z)
- The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes [75.59720049837459]
We study the transition from infinite-width behavior to this variance-limited regime as a function of sample size $P$ and network width $N$.
We find that finite-size effects can become relevant for very small datasets on the order of $P^* \sim \sqrt{N}$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
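The coupling referred to above already shows up in the common scaled-binarization baseline w ≈ α · sign(w) (XNOR-Net style), sketched below purely to make the weight/scale relationship concrete; it is not the recurrent bilinear optimizer of this paper:

```python
import torch

def scaled_binarize(w: torch.Tensor) -> torch.Tensor:
    """Scaled binarization w ~ alpha * sign(w). The scale and the binary code
    are coupled: for the code sign(w), the least-squares optimal scale alpha
    is simply the mean absolute value of w."""
    alpha = w.abs().mean()
    return alpha * torch.sign(w)
```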
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization.
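As a rough idea of what such a regularizer could look like, a sketch is given below; the squared-deviation penalty, the target value, and the per-layer application are assumptions, not necessarily the paper's exact formulation:

```python
import torch

def kurtosis_penalty(w: torch.Tensor, target: float = 1.8) -> torch.Tensor:
    """Penalize deviation of the weight kurtosis from a low target value.
    Low kurtosis (a symmetric two-point distribution has kurtosis 1) pushes
    the weights toward two modes, which is friendlier to sign()-binarization."""
    mu = w.mean()
    var = w.var(unbiased=False) + 1e-8
    kurt = ((w - mu) ** 4).mean() / var ** 2
    return (kurt - target) ** 2
```

Such a term would be added to the task loss with a small coefficient for each layer to be binarized.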
arXiv Detail & Related papers (2022-04-05T06:07:05Z)
- SiMaN: Sign-to-Magnitude Network Binarization [165.5630656849309]
We show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise.
We prove that the learned weights of binarized networks roughly follow a Laplacian distribution that does not allow entropy maximization.
Our method, dubbed sign-to-magnitude network binarization (SiMaN), is evaluated on CIFAR-10 and ImageNet.
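The "+1s, and 0s otherwise" encoding can be illustrated by a simple magnitude-thresholding sketch; the fixed keep ratio is an assumption, whereas the paper derives the selection analytically:

```python
import torch

def sign_to_magnitude_encode(w: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Encode the highest-magnitude weights as +1 and all others as 0."""
    k = max(1, int(keep_ratio * w.numel()))
    threshold = w.abs().flatten().topk(k).values.min()
    return (w.abs() >= threshold).to(w.dtype)
```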
arXiv Detail & Related papers (2021-02-16T07:03:51Z)
- FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that the binarized convolution process becomes increasingly linear as the quantization error is minimized, which in turn hampers the BNN's discriminative ability.
We re-investigate and tune appropriate non-linear modules to resolve this contradiction, leading to a strong baseline that achieves state-of-the-art performance.
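To picture what tuning non-linear modules means in a 1-bit CNN, a generic sketch of a binary convolution followed by an explicit learnable non-linearity is given below; this is not the FTBNN block, and the specific choice of PReLU is an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ste_sign(x: torch.Tensor) -> torch.Tensor:
    """sign() with a straight-through gradient."""
    return x + (torch.sign(x) - x).detach()

class BinaryConvBlock(nn.Module):
    """1-bit convolution (binarized activations and weights) followed by
    batch norm and a learnable non-linearity."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.01)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.PReLU(out_ch)

    def forward(self, x):
        out = F.conv2d(ste_sign(x), ste_sign(self.weight), padding=1)
        return self.act(self.bn(out))
```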
arXiv Detail & Related papers (2020-10-19T08:11:48Z)
- Towards Understanding Hierarchical Learning: Benefits of Neural Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representation can achieve improved sample complexities compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.