SiMaN: Sign-to-Magnitude Network Binarization
- URL: http://arxiv.org/abs/2102.07981v1
- Date: Tue, 16 Feb 2021 07:03:51 GMT
- Title: SiMaN: Sign-to-Magnitude Network Binarization
- Authors: Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Fei Chao,
Mingliang Xu, Chia-Wen Lin, Ling Shao
- Abstract summary: We show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise.
We prove that the learned weights of binarized networks roughly follow a Laplacian distribution that does not allow entropy maximization.
Our method, dubbed sign-to-magnitude network binarization (SiMaN), is evaluated on CIFAR-10 and ImageNet.
- Score: 165.5630656849309
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary neural networks (BNNs) have attracted broad research interest due to
their efficient storage and computational ability. Nevertheless, a significant
challenge of BNNs lies in handling discrete constraints while ensuring bit
entropy maximization, which typically makes their weight optimization very
difficult. Existing methods relax the learning using the sign function, which
simply encodes positive weights into +1s, and -1s otherwise. Alternatively, we
formulate an angle alignment objective to constrain the weight binarization to
{0,+1} to solve the challenge. In this paper, we show that our weight
binarization provides an analytical solution by encoding high-magnitude weights
into +1s, and 0s otherwise. Therefore, a high-quality discrete solution is
established in a computationally efficient manner without the sign function. We
prove that the learned weights of binarized networks roughly follow a Laplacian
distribution that does not allow entropy maximization, and further demonstrate
that it can be effectively solved by simply removing the $\ell_2$
regularization during network training. Our method, dubbed sign-to-magnitude
network binarization (SiMaN), is evaluated on CIFAR-10 and ImageNet,
demonstrating its superiority over the sign-based state-of-the-arts. Code is at
https://github.com/lmbxmu/SiMaN.
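To make the encoding concrete, below is a minimal NumPy sketch (not the authors' implementation; see the repository above for that) contrasting conventional sign binarization with a sign-to-magnitude style {0, +1} encoding. The choice of mapping exactly the top half of the weights, ranked by magnitude, to +1 is an illustrative assumption chosen because it maximizes bit entropy for a two-symbol code; it is not a value quoted from the paper.

```python
import numpy as np

def sign_binarize(w):
    """Conventional sign-based binarization: positive weights -> +1, the rest -> -1."""
    return np.where(w > 0, 1.0, -1.0)

def magnitude_binarize(w, keep_ratio=0.5):
    """Sign-to-magnitude style binarization into {0, +1}.

    keep_ratio=0.5 (encode the half of the weights with the largest
    magnitudes as +1) is an illustrative assumption made to maximize
    bit entropy, not a constant taken from the paper.
    """
    flat = np.abs(w).ravel()
    k = max(1, int(round(keep_ratio * flat.size)))
    # The k-th largest magnitude serves as the threshold.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return np.where(np.abs(w) >= threshold, 1.0, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # The abstract notes that learned weights roughly follow a Laplacian distribution.
    w = rng.laplace(scale=0.1, size=(4, 4))
    print(sign_binarize(w))
    print(magnitude_binarize(w))
```

Because the {0, +1} code is obtained directly by ranking magnitudes, no sign function is needed for the encoding step. Per the abstract, removing the $\ell_2$ regularization during training is what lets the learned weight distribution support entropy maximization; reading this as setting weight decay to zero on the binarized layers is an interpretation of typical practice rather than a quoted detail.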
Related papers
- Training via quantum superposition circumventing local minima and vanishing gradient of sinusoidal neural network [0.6021787236982659]
We present an algorithm for quantum training of sinusoidal neural networks (SinNNs).
The quantum training evolves an initially uniform superposition over weight values to one that is guaranteed to peak on the best weights.
We demonstrate the algorithm on toy examples and show that it indeed outperforms gradient descent in optimizing the loss function and outperforms brute force search in the time required.
arXiv Detail & Related papers (2024-10-29T13:06:46Z) - Training Multi-layer Neural Networks on Ising Machine [41.95720316032297]
This paper proposes an Ising learning algorithm to train quantized neural networks (QNNs).
As far as we know, this is the first algorithm to train multi-layer feedforward networks on Ising machines.
arXiv Detail & Related papers (2023-11-06T04:09:15Z) - Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients [51.82488018573326]
We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
arXiv Detail & Related papers (2023-10-23T17:32:38Z) - AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets [27.022212653067367]
This paper studies Binary Neural Networks (BNNs), in which weights and activations are both binarized into 1-bit values.
We present a simple yet effective approach called AdaBin to adaptively obtain the optimal binary sets.
Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin is able to achieve state-of-the-art performance.
arXiv Detail & Related papers (2022-08-17T05:43:33Z) - Robust Training and Verification of Implicit Neural Networks: A
Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization.
arXiv Detail & Related papers (2022-04-05T06:07:05Z) - Algorithms for Efficiently Learning Low-Rank Neural Networks [12.916132936159713]
We study algorithms for learning low-rank neural networks.
We present a provably efficient algorithm which learns an optimal low-rank approximation to a single-hidden-layer ReLU network.
We propose a novel low-rank framework for training low-rank $\textit{deep}$ networks.
arXiv Detail & Related papers (2022-02-02T01:08:29Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs; a minimal illustrative sketch of this idea appears after this list.
The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves the state-of-the-art accuracy.
arXiv Detail & Related papers (2021-03-01T08:25:26Z) - Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percent points of the full precision counterpart.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
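For the frequency-domain entry above (Learning Frequency Domain Approximation for Binary Neural Networks), the rough idea can be sketched as follows: approximate sign(x) by a truncated Fourier (sine) series of the square wave and use the derivative of that truncation as a surrogate gradient in the backward pass. The number of terms and the period below are illustrative assumptions, not values from that paper.

```python
import numpy as np

def sign_fourier(x, num_terms=4, half_period=1.0):
    """Truncated Fourier sine series of the square wave; approximates sign(x) on (-half_period, half_period)."""
    k = np.arange(num_terms)[:, None]                      # odd harmonics 2k + 1
    phases = (2 * k + 1) * np.pi * np.ravel(x)[None, :] / half_period
    series = (np.sin(phases) / (2 * k + 1)).sum(axis=0)
    return (4.0 / np.pi) * series.reshape(np.shape(x))

def sign_fourier_grad(x, num_terms=4, half_period=1.0):
    """Derivative of the truncated series, usable as a surrogate gradient for sign."""
    k = np.arange(num_terms)[:, None]
    phases = (2 * k + 1) * np.pi * np.ravel(x)[None, :] / half_period
    return (4.0 / half_period) * np.cos(phases).sum(axis=0).reshape(np.shape(x))

if __name__ == "__main__":
    x = np.linspace(-0.9, 0.9, 7)
    print(np.round(sign_fourier(x), 3))       # close to -1 / +1 away from zero
    print(np.round(sign_fourier_grad(x), 3))  # concentrated around zero, with some oscillation from truncation
```

In a typical BNN training loop the exact sign would still be used in the forward pass, with this surrogate derivative substituted in the backward pass; that arrangement is an assumption about common practice rather than a detail taken from the paper.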
This list is automatically generated from the titles and abstracts of the papers on this site.