BEP: A Binary Error Propagation Algorithm for Binary Neural Networks Training
- URL: http://arxiv.org/abs/2512.04189v1
- Date: Wed, 03 Dec 2025 19:03:55 GMT
- Title: BEP: A Binary Error Propagation Algorithm for Binary Neural Networks Training
- Authors: Luca Colombo, Fabrizio Pittorino, Daniele Zambon, Carlo Baldassi, Manuel Roveri, Cesare Alippi
- Abstract summary: Binary Neural Networks (BNNs) offer substantial reductions in computational complexity, memory footprint, and energy consumption. However, training BNNs via gradient-based optimization remains challenging due to the discrete nature of their variables. This paper introduces Binary Error Propagation (BEP), the first learning algorithm to establish a principled, discrete analog of the backpropagation chain rule.
- Score: 21.908847701590428
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary Neural Networks (BNNs), which constrain both weights and activations to binary values, offer substantial reductions in computational complexity, memory footprint, and energy consumption. These advantages make them particularly well suited for deployment on resource-constrained devices. However, training BNNs via gradient-based optimization remains challenging due to the discrete nature of their variables. The dominant approach, quantization-aware training, circumvents this issue by employing surrogate gradients. Yet, this method requires maintaining latent full-precision parameters and performing the backward pass with floating-point arithmetic, thereby forfeiting the efficiency of binary operations during training. While alternative approaches based on local learning rules exist, they are unsuitable for global credit assignment and for back-propagating errors in multi-layer architectures. This paper introduces Binary Error Propagation (BEP), the first learning algorithm to establish a principled, discrete analog of the backpropagation chain rule. This mechanism enables error signals, represented as binary vectors, to be propagated backward through multiple layers of a neural network. BEP operates entirely on binary variables, with all forward and backward computations performed using only bitwise operations. Crucially, this makes BEP the first solution to enable end-to-end binary training for recurrent neural network architectures. We validate the effectiveness of BEP on both multi-layer perceptrons and recurrent neural networks, demonstrating gains of up to +6.89% and +10.57% in test accuracy, respectively. The proposed algorithm is released as an open-source repository.
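The abstract specifies the ingredients (binary error vectors, bitwise-only forward and backward passes) but not the exact update equations. Below is a minimal numpy sketch of what such a scheme could look like; the `bsign` tie-breaking, the transposed-weight propagation rule, and the perceptron-style update are assumptions for illustration, not the paper's actual algorithm. With +/-1 encodings, every matrix product here reduces to XNOR-popcount on packed bits.

```python
import numpy as np

rng = np.random.default_rng(0)

def bsign(z):
    # Binarize to +/-1, mapping ties to +1 so every signal stays binary.
    return np.where(z >= 0, 1, -1)

def binary_forward(x, W):
    # With +/-1 encodings, W @ x is computable as XNOR-popcount on packed
    # bits, so the whole pass maps to bitwise operations.
    return bsign(W @ x)

def binary_backward(err, W):
    # Assumed discrete chain rule: push the binary error vector through the
    # transposed weights and re-binarize (hypothetical, for illustration).
    return bsign(W.T @ err)

# Toy two-layer network with +/-1 weights.
W1 = rng.choice([-1, 1], size=(16, 32))
W2 = rng.choice([-1, 1], size=(8, 16))

x = rng.choice([-1, 1], size=32)
h = binary_forward(x, W1)
y = binary_forward(h, W2)

target = rng.choice([-1, 1], size=8)
e2 = bsign(target - y)            # binary error at the output layer
e1 = binary_backward(e2, W2)      # error propagated back to the hidden layer

# Perceptron-style binary update (also an assumption): nudge each weight
# toward the error/activation correlation, then re-binarize.
W1 = bsign(W1 + np.outer(e1, x))
W2 = bsign(W2 + np.outer(e2, h))
```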
Related papers
- Quadratic Unconstrained Binary Optimisation for Training and Regularisation of Binary Neural Networks [0.0]
Training binary neural networks (BNNs) is computationally challenging because of their discrete characteristics. Recent work proposed a framework for training BNNs based on quadratic unconstrained binary optimisation (QUBO). We extend existing QUBO models for training BNNs to accommodate arbitrary network topologies and propose two novel methods for regularisation.
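For concreteness, the core reduction behind such QUBO formulations can be sketched on a single linear neuron: a quadratic loss in +/-1 weights becomes a quadratic form in {0,1} variables under the substitution w = 2q - 1. The topology extensions and regularisers proposed in the paper are not reproduced; sizes and the brute-force solver below are illustrative stand-ins for an annealer.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Toy task: recover binary weights w in {-1,+1}^d from linear measurements,
# i.e. minimize sum_i (w . x_i - y_i)^2. This is quadratic in w, so after
# substituting w = 2q - 1 with q in {0,1}^d it is exactly a QUBO.
d, n = 4, 12
X = rng.choice([-1, 1], size=(n, d)).astype(float)
w_teacher = rng.choice([-1, 1], size=d).astype(float)
y = X @ w_teacher

A = X.T @ X              # quadratic coefficients in w
b = -2.0 * X.T @ y       # linear coefficients in w

# Substitute w = 2q - 1 and fold linear terms onto the diagonal (q_i^2 = q_i).
Q = 4.0 * A
Q[np.diag_indices(d)] += -4.0 * A.sum(axis=1) + 2.0 * b

def qubo_energy(q):
    return q @ Q @ q

# Brute force stands in for a QUBO solver / annealer at this toy size.
best_q = min((np.array(q) for q in itertools.product([0, 1], repeat=d)),
             key=qubo_energy)
w_learned = 2 * best_q - 1
print("teacher:", w_teacher.astype(int), "learned:", w_learned)
```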
arXiv Detail & Related papers (2026-01-01T19:21:03Z) - Training Multi-Layer Binary Neural Networks With Local Binary Error Signals [3.7740044597960316]
We propose a fully binary and gradient-free training algorithm for multi-layer Binary Neural Networks (BNNs). The proposed algorithm relies on local binary error signals and binary weight updates, employing integer-valued hidden weights that serve as a synaptic metaplasticity mechanism. Experimental results show test accuracy improvements of up to +35.47% over the only existing fully binary single-layer state-of-the-art solution.
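A hedged sketch of the integer-hidden-weight idea: the binary weight is the sign of an integer counter, local errors nudge the counter, and a weight only flips when its counter crosses zero, so large-magnitude counters are stable (the metaplasticity mechanism). The error rule and clipping range below are illustrative, not the paper's exact equations.

```python
import numpy as np

rng = np.random.default_rng(2)

d_in, d_out = 32, 8
H = rng.integers(-3, 4, size=(d_out, d_in))   # integer hidden weights

def bsign(z):
    return np.where(z >= 0, 1, -1)

def forward(x):
    return bsign(bsign(H) @ x)                # binary weights = sign of H

x = rng.choice([-1, 1], size=d_in)
target = rng.choice([-1, 1], size=d_out)

# Local binary error: -1/0/+1 per output unit, nonzero only on mistakes.
err = (target - forward(x)) // 2

# Integer update with saturation: counters drift with the error/input
# correlation; a binary weight flips only when its counter crosses zero.
H = np.clip(H + np.outer(err, x), -8, 8)
```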
arXiv Detail & Related papers (2024-11-28T09:12:04Z) - BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials [27.573329030086676]
This work proposes the first Binary Pretrained Foundation Transformer (BiPFT) for natural language understanding (NLU) tasks.
BiPFT exhibits a substantial enhancement in the learning capabilities of binary neural networks (BNNs).
Extensive experiments validate the effectiveness of BiPFTs, which surpass task-specific baselines by 15.4% average performance on the GLUE benchmark.
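The title's key idea, estimating the binarization residual with a low-rank term, can be sketched independently of the transformer details: approximate W by alpha * sign(W) plus a rank-r correction fitted to the residual. The truncated-SVD fit below is an illustrative stand-in for BiPFT's learned low-rank parameterization.

```python
import numpy as np

rng = np.random.default_rng(3)

W = rng.standard_normal((64, 64))        # full-precision weight to binarize

# Standard scaled binarization: W ~ alpha * sign(W), where alpha = mean(|W|)
# minimizes the Frobenius reconstruction error.
alpha = np.abs(W).mean()
W_bin = alpha * np.sign(W)

# Low-rank estimation of the binarization residual: approximate
# R = W - W_bin with a rank-r factorization from a truncated SVD.
r = 4
R = W - W_bin
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R_lowrank = (U[:, :r] * s[:r]) @ Vt[:r]

for name, approx in [("binary only", W_bin),
                     ("binary + low-rank", W_bin + R_lowrank)]:
    rel = np.linalg.norm(W - approx) / np.linalg.norm(W)
    print(f"{name}: relative error {rel:.3f}")
```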
arXiv Detail & Related papers (2023-12-14T13:42:57Z) - Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients [51.82488018573326]
We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
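A classical, full-batch sketch of the projection idea: take a gradient step in the relaxed real-valued space, then project back onto {-1,+1}^d. In QP-SBGD the projection is posed as a QUBO and solved layer-wise on quantum hardware; the sign projection, learning rate, and toy regression task below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

d, n = 16, 128
X = rng.standard_normal((n, d))
w_true = rng.choice([-1.0, 1.0], size=d)
y = X @ w_true

w = rng.choice([-1.0, 1.0], size=d)
lr = 0.5
for _ in range(20):
    grad = 2.0 / n * X.T @ (X @ w - y)            # gradient of mean squared error
    w = np.where(w - lr * grad >= 0, 1.0, -1.0)   # gradient step + binary projection
print("bits recovered:", int((w == w_true).sum()), "out of", d)
```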
arXiv Detail & Related papers (2023-10-23T17:32:38Z) - Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
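PSTE builds on the classic straight-through estimator, whose core trick is worth seeing in isolation: quantize in the forward pass, but substitute a simple surrogate (clipped identity) for the quantizer's zero-almost-everywhere gradient in the backward pass. The permutation machinery for codeword selection is not reproduced here.

```python
import numpy as np

def ste_forward(w):
    return np.sign(w)                  # binary value used by the network

def ste_backward(grad_out, w, clip=1.0):
    # The gradient of sign() is zero almost everywhere; the STE substitutes
    # the identity, usually clipped to |w| <= 1 to keep latents bounded.
    return grad_out * (np.abs(w) <= clip)

w = np.array([-1.7, -0.3, 0.2, 0.9])
g = np.array([0.5, -1.0, 2.0, 0.1])    # upstream gradient w.r.t. sign(w)
print(ste_forward(w), ste_backward(g, w))
```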
arXiv Detail & Related papers (2023-03-25T13:53:02Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP).
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
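A minimal sketch of the cascaded-block idea: each block ends in its own small classifier that predicts the label distribution, and each classifier is trained with a purely local loss, so no error signal crosses block boundaries. Layer sizes, the fixed random block bodies, and the softmax predictor below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

n, d, h, c = 256, 20, 32, 3
X = rng.standard_normal((n, d))
y = (X @ rng.standard_normal((d, c))).argmax(axis=1)  # separable toy labels
Y = np.eye(c)[y]

x_in, lr = X, 0.1
for block in range(2):                         # two cascaded blocks
    W = rng.standard_normal((x_in.shape[1], h)) * 0.1
    feat = np.maximum(x_in @ W, 0.0)           # fixed random block body
    C = np.zeros((h, c))                       # block-local predictor
    for _ in range(200):                       # trained with a local loss only
        P = softmax(feat @ C)
        C -= lr * feat.T @ (P - Y) / n         # local cross-entropy gradient
    acc = (softmax(feat @ C).argmax(axis=1) == y).mean()
    print(f"block {block}: local accuracy {acc:.2f}")
    x_in = feat                                # next block consumes features only
```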
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNN methods neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
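The bilinear coupling in question: a binarized layer approximates W by diag(alpha) @ sign(W), and the Frobenius objective is linear in alpha for fixed signs and vice versa. For fixed B = sign(W) the optimal per-channel scale has the closed form alpha_i = mean(|W_i|); the sketch below only verifies this classic baseline, not RBONN's recurrent optimizer.

```python
import numpy as np

rng = np.random.default_rng(6)

W = rng.standard_normal((4, 64))                  # one row per output channel
B = np.sign(W)
alpha = np.abs(W).mean(axis=1)                    # closed-form optimum per row

best = np.linalg.norm(W - alpha[:, None] * B)     # per-channel scales
naive = np.linalg.norm(W - B)                     # alpha = 1 baseline
grid = min(np.linalg.norm(W - a * B)              # best single shared scale
           for a in np.linspace(0.1, 2.0, 200))
print(f"per-channel: {best:.3f}  shared-grid: {grid:.3f}  alpha=1: {naive:.3f}")
```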
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization.
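The kurtosis mechanism is easy to make concrete: a symmetric two-point (perfectly bimodal) distribution attains the minimum kurtosis of 1 while a Gaussian gives 3, so penalizing kurtosis toward a low target pushes weights into two modes that binarize with little error. The target value and penalty form below are illustrative, not the paper's exact regularizer.

```python
import numpy as np

rng = np.random.default_rng(7)

def kurtosis(w):
    w = w - w.mean()
    return np.mean(w**4) / (np.mean(w**2) ** 2 + 1e-12)

def kurtosis_penalty(w, target=1.8):
    # Illustrative regularizer: pull the weight distribution's kurtosis
    # toward a low (bimodal-like) target value.
    return (kurtosis(w) - target) ** 2

w_gauss = rng.standard_normal(10_000)
w_bimodal = rng.choice([-1.0, 1.0], size=10_000) + 0.1 * rng.standard_normal(10_000)
print("gaussian kurtosis:", round(kurtosis(w_gauss), 2),
      " bimodal kurtosis:", round(kurtosis(w_bimodal), 2))
```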
arXiv Detail & Related papers (2022-04-05T06:07:05Z) - AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks [34.263013539187355]
We propose a new algorithm for training deep neural networks (DNNs) with binary weights.
Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches.
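The word "adaptive" can be illustrated against the vanilla STE: instead of a fixed clipped-identity surrogate for the gradient of sign(), differentiate a soft binarizer such as tanh(beta * w) and adapt beta during training, sharpening toward the true sign function. The surrogate and schedule below are illustrative; AdaSTE's actual derivation differs.

```python
import numpy as np

def forward(w):
    return np.sign(w)                  # hard binary forward pass

def backward(grad_out, w, beta):
    # Surrogate gradient: derivative of tanh(beta * w), which approaches
    # the (zero a.e.) gradient of sign() as beta grows.
    return grad_out * beta * (1.0 - np.tanh(beta * w) ** 2)

w = np.array([-0.8, -0.1, 0.05, 0.6])
g = np.ones_like(w)
for beta in [1.0, 2.0, 5.0]:           # e.g. annealed over training
    print(f"beta={beta}: surrogate grads = {np.round(backward(g, w, beta), 3)}")
```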
arXiv Detail & Related papers (2021-12-06T09:12:15Z) - Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
We formulate the attack as a binary integer programming (BIP) problem and, by utilizing the latest techniques in integer programming, equivalently reformulate it as a continuous optimization problem.
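A toy version of that reformulation: flipping the sign of weight j is encoded by a binary variable m_j (w -> w * (1 - 2 m_j)), and the integer program "reach the target class while flipping few bits" is relaxed to m in [0,1] with an L1 penalty, optimized by projected gradient descent, and rounded. All sizes, penalties, and the simple linear victim model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)

d, c = 32, 4
W = rng.choice([-1.0, 1.0], size=(c, d))          # binary classifier weights
x = rng.standard_normal(d)
target = 2                                        # class the attacker wants

m = np.zeros((c, d))                              # relaxed flip indicators
lam, lr = 0.05, 0.1
for _ in range(300):
    logits = ((1 - 2 * m) * W) @ x
    p = np.exp(logits - logits.max()); p /= p.sum()
    grad_logits = p - np.eye(c)[target]           # cross-entropy toward target
    grad_m = np.outer(grad_logits, x) * (-2 * W) + lam   # + L1 sparsity term
    m = np.clip(m - lr * grad_m, 0.0, 1.0)        # projected gradient step

flips = m > 0.5                                   # round back to binary flips
W_attacked = np.where(flips, -W, W)
print("flipped bits:", int(flips.sum()),
      " predicted class:", int((W_attacked @ x).argmax()))
```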
arXiv Detail & Related papers (2021-02-21T03:13:27Z) - Training Binary Neural Networks through Learning with Noisy Supervision [76.26677550127656]
This paper formalizes the binarization operations over neural networks from a learning perspective.
Experimental results on benchmark datasets indicate that the proposed binarization technique attains consistent improvements over baselines.
arXiv Detail & Related papers (2020-10-10T01:59:39Z)