MCEL: Margin-Based Cross-Entropy Loss for Error-Tolerant Quantized Neural Networks
- URL: http://arxiv.org/abs/2603.05048v1
- Date: Thu, 05 Mar 2026 10:58:30 GMT
- Title: MCEL: Margin-Based Cross-Entropy Loss for Error-Tolerant Quantized Neural Networks
- Authors: Mikail Yayla, Akash Kumar,
- Abstract summary: Robustness to bit errors is a key requirement for the reliable use of neural networks (NNs) on emerging approximate computing platforms.<n>A common approach to achieve bit error tolerance in NNs is injecting bit flips during training according to a predefined error model.<n>In this work, we investigate the mechanisms that enable NNs to tolerate bit errors without relying on error-aware training.
- Score: 2.591303779092077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness to bit errors is a key requirement for the reliable use of neural networks (NNs) on emerging approximate computing platforms and error-prone memory technologies. A common approach to achieve bit error tolerance in NNs is injecting bit flips during training according to a predefined error model. While effective in certain scenarios, training-time bit flip injection introduces substantial computational overhead, often degrades inference accuracy at high error rates, and scales poorly for larger NN architectures. These limitations make error injection an increasingly impractical solution for ensuring robustness on future approximate computing platforms and error-prone memory technologies. In this work, we investigate the mechanisms that enable NNs to tolerate bit errors without relying on error-aware training. We establish a direct connection between bit error tolerance and classification margins at the output layer. Building on this insight, we propose a novel loss function, the Margin Cross-Entropy Loss (MCEL), which explicitly promotes logit-level margin separation while preserving the favorable optimization properties of the standard cross-entropy loss. Furthermore, MCEL introduces an interpretable margin parameter that allows robustness to be tuned in a principled manner. Extensive experimental evaluations across multiple datasets of varying complexity, diverse NN architectures, and a range of quantization schemes demonstrate that MCEL substantially improves bit error tolerance, up to 15 % in accuracy for an error rate of 1 %. Our proposed MCEL method is simple to implement, efficient, and can be integrated as a drop-in replacement for standard CEL. It provides a scalable and principled alternative to training-time bit flip injection, offering new insights into the origins of NN robustness and enabling more efficient deployment on approximate computing and memory systems.
Related papers
- Neural Minimum Weight Perfect Matching for Quantum Error Codes [7.525883733645578]
We propose a data-driven decoder named Neural Minimum Weight Perfect Matching (NMWPM)<n>Our findings demonstrate significant performance reduction in the Logical Error Rate (LER) over standard baselines.
arXiv Detail & Related papers (2026-01-01T07:25:51Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture.<n>Non-smooth regularization is often incorporated into machine learning tasks.<n>We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth
Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs.
We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z) - Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on intuition that neural operator learning inherently induces an approximation error.
We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z) - ApproxABFT: Approximate Algorithm-Based Fault Tolerance for Neural Network Processing [7.578258600530223]
Algorithm-based fault tolerance (ABFT) mechanisms have become a promising solution for reliability enhancement.<n>We propose an Approximate ABFT framework that introduces adaptive error tolerance thresholds to enable selective fault recovery.<n>The proposed ApproxABFT achieves a 43.39% average reduction in redundant computing overhead compared to previous accurate ABFT.
arXiv Detail & Related papers (2023-02-21T06:21:28Z) - Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural
Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially for resource limited devices.
Previous unstructured or structured weight pruning methods can hardly truly accelerate inference.
We propose a generalized weight unification framework at a hardware compatible micro-structured level to achieve high amount of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Reduced-Order Neural Network Synthesis with Robustness Guarantees [0.0]
Machine learning algorithms are being adapted to run locally on board, potentially hardware limited, devices to improve user privacy, reduce latency and be more energy efficient.
To address this issue, a method to automatically synthesize reduced-order neural networks (having fewer neurons) approxing the input/output mapping of a larger one is introduced.
Worst-case bounds for this approximation error are obtained and the approach can be applied to a wide variety of neural networks architectures.
arXiv Detail & Related papers (2021-02-18T12:03:57Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z) - Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized
Neural Networks [7.349786872131006]
Non-volatile memory, such as resistive RAM (RRAM), is an emerging energy-efficient storage.
Binary neural networks (BNNs) can tolerate a certain percentage of errors without a loss in accuracy.
The bit error tolerance (BET) in BNNs can be achieved by flipping the weight signs during training.
arXiv Detail & Related papers (2020-02-03T17:38:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.