Training Multi-Layer Binary Neural Networks With Local Binary Error Signals
- URL: http://arxiv.org/abs/2412.00119v2
- Date: Sun, 23 Mar 2025 12:59:38 GMT
- Title: Training Multi-Layer Binary Neural Networks With Local Binary Error Signals
- Authors: Luca Colombo, Fabrizio Pittorino, Manuel Roveri
- Abstract summary: Binary Neural Networks (BNNs) reduce computational and memory usage in machine and deep learning by representing weights and activations with just one bit. Most existing training algorithms for BNNs rely on floating-point Stochastic Gradient Descent (SGD), limiting the full exploitation of binary operations. We propose, for the first time, a fully binary and gradient-free algorithm for training BNNs.
- Score: 3.7740044597960316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary Neural Networks (BNNs) significantly reduce computational complexity and memory usage in machine and deep learning by representing weights and activations with just one bit. However, most existing training algorithms for BNNs rely on quantization-aware floating-point Stochastic Gradient Descent (SGD), limiting the full exploitation of binary operations to the inference phase only. In this work, we propose, for the first time, a fully binary and gradient-free training algorithm for multi-layer BNNs, eliminating the need for back-propagated floating-point gradients. Specifically, the proposed algorithm relies on local binary error signals and binary weight updates, employing integer-valued hidden weights that serve as a synaptic metaplasticity mechanism, thereby enhancing its neurobiological plausibility. The fully binary and gradient-free algorithm introduced in this paper enables the training of binary multi-layer perceptrons with binary inputs, weights, and activations, by using exclusively XNOR, Popcount, and increment/decrement operations. Experimental results on multi-class classification benchmarks show test accuracy improvements of up to +35.47% over the only existing fully binary single-layer state-of-the-art solution. Compared to full-precision SGD, our solution improves test accuracy by up to +41.31% under the same total memory demand (including the model, activations, and input dataset), while also reducing computational cost by two orders of magnitude in terms of the total number of equivalent Boolean gates. The proposed algorithm is made available to the scientific community as a public repository.
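The abstract pins down the operation set (XNOR, Popcount, increment/decrement) and the role of the integer-valued hidden weights, but not the exact update rule, which lives in the public repository. The NumPy sketch below is therefore only an illustration of how a single binary layer could be run and updated with a local binary error signal; the function names, the zero threshold, and the metaplasticity clamp are assumptions, not the authors' implementation.

```python
import numpy as np

# Illustrative sketch only: one binary layer run and updated with a local binary
# error signal, using just XNOR, popcount, and increment/decrement operations.
# Names, threshold, and clamp are assumptions, not the paper's exact algorithm.

def xnor_popcount_forward(x_bin, w_bin):
    """x_bin: (n_in,) in {-1,+1}; w_bin: (n_out, n_in) in {-1,+1}.
    For +/-1 vectors, the dot product equals 2*popcount(agreements) - n_in."""
    agree = (x_bin[None, :] == w_bin)            # XNOR as equality on +/-1 codes
    pre = 2 * agree.sum(axis=1) - x_bin.size     # popcount-based pre-activation
    return np.where(pre >= 0, 1, -1).astype(np.int8)   # binary activation

def local_binary_update(w_int, x_bin, y_bin, target_bin, clip=127):
    """Integer 'hidden' weights act as a metaplasticity buffer: they are only
    incremented/decremented, and the binary weights are their signs."""
    err = (target_bin - y_bin) // 2              # local binary error in {-1, 0, +1}
    # The outer product of error and input is +/-1 (or 0), so the update is a
    # pure increment/decrement of the integer hidden weights.
    w_int += err[:, None] * x_bin[None, :]
    np.clip(w_int, -clip, clip, out=w_int)
    return np.where(w_int >= 0, 1, -1).astype(np.int8)  # refreshed binary weights

# Toy usage: one layer, one sample.
rng = np.random.default_rng(0)
n_in, n_out = 64, 8
w_int = rng.integers(-8, 8, size=(n_out, n_in))
w_bin = np.where(w_int >= 0, 1, -1).astype(np.int8)
x = np.where(rng.random(n_in) > 0.5, 1, -1).astype(np.int8)
t = np.where(rng.random(n_out) > 0.5, 1, -1).astype(np.int8)

y = xnor_popcount_forward(x, w_bin)
w_bin = local_binary_update(w_int, x, y, t)
```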
Related papers
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB)
arXiv Detail & Related papers (2024-11-15T16:46:04Z) - Training Multi-layer Neural Networks on Ising Machine [41.95720316032297]
This paper proposes an Ising learning algorithm to train quantized neural networks (QNNs).
As far as we know, this is the first algorithm to train multi-layer feedforward networks on Ising machines.
arXiv Detail & Related papers (2023-11-06T04:09:15Z) - Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients [51.82488018573326]
We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
arXiv Detail & Related papers (2023-10-23T17:32:38Z) - Input Layer Binarization with Bit-Plane Encoding [4.872439392746007]
We present a new method that binarizes the first layer by directly using the 8-bit representation of the input data.
The resulting model is fully binarized, and our first-layer binarization approach is model-independent.
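As a rough illustration of what bit-plane encoding of 8-bit inputs can look like (the paper's exact encoding and first-layer design may differ), a sketch:

```python
import numpy as np

# Generic bit-plane decomposition of 8-bit inputs; an assumption about the
# general technique, not the paper's exact encoding or layer design.
def to_bit_planes(x_uint8):
    """x_uint8: (H, W) uint8 image -> (8, H, W) array in {-1, +1},
    one plane per bit of the 8-bit input representation."""
    planes = [((x_uint8 >> b) & 1) for b in range(8)]   # LSB first
    return np.stack(planes).astype(np.int8) * 2 - 1     # map {0,1} -> {-1,+1}

img = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16
planes = to_bit_planes(img)   # shape (8, 4, 4); each plane can feed a binary conv
```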
arXiv Detail & Related papers (2023-05-04T14:49:07Z) - Binary stochasticity enabled highly efficient neuromorphic deep learning achieves better-than-software accuracy [17.11946381948498]
Deep learning requires high-precision handling of forward signals, backpropagated errors, and weight updates.
It is challenging to implement deep learning in hardware systems that use noisy analog memristors as artificial synapses.
We propose a binary learning algorithm that modifies all elementary neural network operations.
arXiv Detail & Related papers (2023-04-25T14:38:36Z) - Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
arXiv Detail & Related papers (2023-03-25T13:53:02Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets [27.022212653067367]
This paper studies the Binary Neural Networks (BNNs) in which weights and activations are both binarized into 1-bit values.
We present a simple yet effective approach called AdaBin to adaptively obtain the optimal binary sets.
Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin is able to achieve state-of-the-art performance.
arXiv Detail & Related papers (2022-08-17T05:43:33Z) - Network Binarization via Contrastive Learning [16.274341164897827]
We establish a novel contrastive learning framework while training Binary Neural Networks (BNNs).
Mutual information (MI) is introduced as the metric to measure the information shared between binary and full-precision (FP) activations.
Results show that our method can be implemented as a pile-up module on existing state-of-the-art binarization methods.
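Mutual information between two sets of activations is commonly lower-bounded with a contrastive objective; the sketch below uses a generic InfoNCE-style loss between full-precision and binary activations purely as an illustration, and makes no claim about the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

# Generic InfoNCE-style contrastive loss between full-precision (FP) activations
# and their binary counterparts; an illustration of the MI-based idea only.
def contrastive_binary_loss(fp_act, bin_act, temperature=0.1):
    """fp_act, bin_act: (batch, dim). Matching (FP, binary) pairs from the same
    sample are positives; all other pairs in the batch are negatives."""
    fp = F.normalize(fp_act, dim=1)
    bn = F.normalize(bin_act.float(), dim=1)
    logits = fp @ bn.t() / temperature                 # (batch, batch) similarities
    labels = torch.arange(fp.size(0), device=fp.device)
    return F.cross_entropy(logits, labels)

# Toy usage
fp = torch.randn(16, 128)
bn = torch.sign(fp + 0.1 * torch.randn_like(fp))       # stand-in binary activations
loss = contrastive_binary_loss(fp, bn)
```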
arXiv Detail & Related papers (2022-07-06T21:04:53Z) - Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization.
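Kurtosis regularization follows directly from the definition of kurtosis: a symmetric two-point (+/-1) distribution has kurtosis 1, versus 3 for a Gaussian, so penalizing the distance of the weights' kurtosis from a low target pushes them toward a bi-modal shape. The sketch below illustrates this; the target value and regularization weight are assumptions, not the paper's hyper-parameters.

```python
import torch

# Kurtosis regularizer pushing a weight tensor toward a bi-modal distribution.
# Target kurtosis 1 (the value for a symmetric two-point distribution) and the
# regularization weight are illustrative assumptions.
def kurtosis_regularizer(w, target=1.0):
    w = w.flatten()
    mu, sigma = w.mean(), w.std()
    kurt = ((w - mu) ** 4).mean() / (sigma ** 4 + 1e-12)
    return (kurt - target) ** 2

w = torch.randn(256, 128, requires_grad=True)
reg = kurtosis_regularizer(w)   # add lambda * reg to the task loss during training
reg.backward()
```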
arXiv Detail & Related papers (2022-04-05T06:07:05Z) - AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks [34.263013539187355]
We propose a new algorithm for training deep neural networks (DNNs) with binary weights.
Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches.
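The summary does not say how the adaptive estimator itself works, so the sketch below only shows the plain straight-through estimator (STE) baseline that such methods refine: binary weights in the forward pass, with gradients passed through to the latent full-precision weights in the backward pass.

```python
import torch

# Standard straight-through estimator (STE) for binary weights: the baseline
# that adaptive variants refine. This is not the paper's method.
class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass the gradient through unchanged inside [-1, 1], zero it outside.
        return grad_out * (w.abs() <= 1).float()

w = torch.randn(10, 10, requires_grad=True)
y = BinarizeSTE.apply(w).sum()
y.backward()          # w.grad now holds the straight-through gradient
```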
arXiv Detail & Related papers (2021-12-06T09:12:15Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - Spike time displacement based error backpropagation in convolutional spiking neural networks [0.6193838300896449]
In this paper, we extend the STiDi-BP algorithm to employ it in deeper and convolutional architectures.
The evaluation results on the image classification task based on two popular benchmarks, MNIST and Fashion-MNIST, confirm that this algorithm is applicable to deep SNNs.
We consider a convolutional SNN with two sets of weights: real-valued weights that are updated in the backward pass and their signs, binary weights, that are employed in the feedforward process.
arXiv Detail & Related papers (2021-08-31T05:18:59Z) - Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs.
The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves state-of-the-art accuracy.
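A standard way to realize this is the Fourier series of a square wave: on a window around zero, sign(x) is approximated by (4/pi) times the sum over odd k of sin(k*omega*x)/k, and the derivative of the truncated sum serves as a smooth surrogate gradient. The sketch below illustrates that construction; the truncation order and period are illustrative choices, not the paper's settings.

```python
import numpy as np

# Finite sine-series (square-wave Fourier) approximation of sign(x) and its
# derivative, usable as a surrogate gradient. Truncation order and period are
# illustrative choices, not the paper's settings.
def sign_fourier(x, n_terms=4, period=4.0):
    omega = 2.0 * np.pi / period
    ks = 2 * np.arange(n_terms) + 1                        # odd harmonics 1, 3, 5, ...
    s = (4.0 / np.pi) * np.sum(np.sin(np.outer(x, ks) * omega) / ks, axis=1)
    ds = (4.0 * omega / np.pi) * np.sum(np.cos(np.outer(x, ks) * omega), axis=1)
    return s, ds                                           # approximation and its derivative

x = np.linspace(-1, 1, 5)
approx, grad = sign_fourier(x)    # grad serves as the surrogate gradient of sign
```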
arXiv Detail & Related papers (2021-03-01T08:25:26Z) - Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification [53.50708351813565]
We propose SVD training, the first method to explicitly achieve low-rank DNNs during training without applying SVD on every step.
We empirically show that SVD training can significantly reduce the rank of DNN layers and achieve a greater reduction in computational load under the same accuracy.
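The title spells out both ingredients, so the sketch below illustrates the general recipe: keep each layer's weight in factored form U diag(s) V^T, regularize U and V toward orthogonality, and apply an L1 penalty to the singular values s. The regularization weights and layer shape are illustrative assumptions, not the paper's configuration.

```python
import torch

# Minimal sketch of "SVD training": keep a layer's weight factored as
# W = U diag(s) V^T, regularize U and V toward orthogonality, and sparsify s.
# Shapes and regularization weights are illustrative assumptions.
m, n, r = 256, 128, 128
U = torch.nn.Parameter(torch.randn(m, r) * 0.05)
s = torch.nn.Parameter(torch.rand(r))
V = torch.nn.Parameter(torch.randn(n, r) * 0.05)

def layer(x):
    # Forward pass uses the factored weight: x @ W^T = ((x @ V) * s) @ U^T
    return ((x @ V) * s) @ U.t()

def svd_regularizer(lam_orth=1e-2, lam_sparse=1e-3):
    eye = torch.eye(r)
    orth = ((U.t() @ U - eye) ** 2).sum() + ((V.t() @ V - eye) ** 2).sum()
    sparse = s.abs().sum()          # drives small singular values toward zero
    return lam_orth * orth + lam_sparse * sparse

x = torch.randn(32, n)
loss = layer(x).pow(2).mean() + svd_regularizer()
loss.backward()
```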
arXiv Detail & Related papers (2020-04-20T02:40:43Z)