Related papers: On-Chip Learning with Memristor-Based Neural Networks: Assessing Accuracy and Efficiency Under Device Variations, Conductance Errors, and Input Noise

URL: http://arxiv.org/abs/2408.14680v1
Date: Mon, 26 Aug 2024 23:10:01 GMT
Title: On-Chip Learning with Memristor-Based Neural Networks: Assessing Accuracy and Efficiency Under Device Variations, Conductance Errors, and Input Noise
Authors: M. Reza Eslami, Dhiman Biswas, Soheib Takhtardeshir, Sarah S. Sharif, Yaser M. Banad
Abstract summary: This paper presents a memristor-based compute-in-memory hardware accelerator for on-chip training and inference. The hardware, consisting of 30 memristors and 4 neurons, utilizes three different M-SDC structures with tungsten, chromium, and carbon media to perform binary image classification tasks.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents a memristor-based compute-in-memory hardware accelerator for on-chip training and inference, focusing on its accuracy and efficiency against device variations, conductance errors, and input noise. Utilizing realistic SPICE models of commercially available silver-based metal self-directed channel (M-SDC) memristors, the study incorporates inherent device non-idealities into the circuit simulations. The hardware, consisting of 30 memristors and 4 neurons, utilizes three different M-SDC structures with tungsten, chromium, and carbon media to perform binary image classification tasks. An on-chip training algorithm precisely tunes memristor conductance to achieve target weights. Results show that incorporating moderate noise (<15%) during training enhances robustness to device variations and noisy input data, achieving up to 97% accuracy despite conductance variations and input noises. The network tolerates a 10% conductance error without significant accuracy loss. Notably, omitting the initial memristor reset pulse during training considerably reduces training time and energy consumption. The hardware designed with chromium-based memristors exhibits superior performance, achieving a training time of 2.4 seconds and an energy consumption of 18.9 mJ. This research provides insights for developing robust and energy-efficient memristor-based neural networks for on-chip learning in edge applications.
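
The robustness claims in the abstract (a tolerated 10% conductance error, input noise below 15%) can be illustrated with a small software toy. The sketch below is not the paper's circuit, SPICE model, or training algorithm. It assumes a single threshold neuron with hand-picked weights, a multiplicative uniform error on each weight as a stand-in for conductance error, and additive uniform noise on each pixel. All function names, weights, and sample patterns are hypothetical.

```python
import random

def perturb_conductances(weights, rel_error, rng):
    """Scale each weight by a random factor in [1 - rel_error, 1 + rel_error]."""
    return [w * (1.0 + rng.uniform(-rel_error, rel_error)) for w in weights]

def add_input_noise(pixels, noise_level, rng):
    """Add uniform noise in [-noise_level, +noise_level] to each pixel."""
    return [x + rng.uniform(-noise_level, noise_level) for x in pixels]

def classify(weights, pixels, bias=0.0):
    """Threshold neuron: return 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, pixels))
    return 1 if total > 0 else 0

def accuracy(weights, samples, rel_error, noise_level, trials, rng):
    """Fraction of correct labels over repeated random perturbations."""
    correct = 0
    total = 0
    for _ in range(trials):
        # Draw a fresh set of perturbed weights for every trial (assumed error model).
        noisy_weights = perturb_conductances(weights, rel_error, rng)
        for pixels, label in samples:
            noisy_pixels = add_input_noise(pixels, noise_level, rng)
            correct += classify(noisy_weights, noisy_pixels) == label
            total += 1
    return correct / total

# Hypothetical 4-pixel binary patterns and weights, chosen only for illustration.
rng = random.Random(0)
weights = [1.0, -1.0, 1.0, -1.0]
samples = [([1, 0, 1, 0], 1), ([0, 1, 0, 1], 0)]
print(accuracy(weights, samples, rel_error=0.10, noise_level=0.15, trials=100, rng=rng))
```

With these hand-picked patterns the decision margin is large enough that a 10% weight error and 15% pixel noise cannot flip either label, so the toy reports an accuracy of 1.0. It says nothing about the accuracy figures reported in the paper, which come from circuit-level simulation.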

Related papers

Efficient Memristive Spiking Neural Networks Architecture with Supervised In-Situ STDP Method [0.0]
Memristor-based Spiking Neural Networks (SNNs) with temporal spike encoding enable ultra-low-energy computation. This paper presents a circuit-level memristive spiking neural network (SNN) architecture trained using a proposed novel supervised in-situ learning algorithm.
arXiv Detail & Related papers (2025-07-28T17:09:48Z)
Rapid yet accurate Tile-circuit and device modeling for Analog In-Memory Computing [4.566622328597218]
We quantify the impact of low-level distortions and noise, and develop a mathematical model for Multiply-ACcumulate (MAC) operations mapped to analog tiles. We show that hardware fine-tuning using simple Gaussian noise provides resilience against ADC quantization and PCM read noise effects, but is less effective against IR-drop.
arXiv Detail & Related papers (2025-05-05T22:56:49Z)
Demonstration of Enhanced Qubit Readout via Reinforcement Learning [0.0]
We harness model-free reinforcement learning (RL) together with a tailored training environment to achieve this multi-pronged optimization task. We demonstrate on an IBM quantum device that the measurement pulse obtained by the RL agent achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-12-05T10:43:36Z)
Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs. We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections. Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z)
GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning [44.401418612374286]
We introduce a novel soft-pruning method, GDeR, designed to update the training during the process using trainable prototypes. GDeR achieves or surpasses the performance of the full dataset with 30% to 50% fewer training samples. It also outperforms state-of-the-art pruning methods in imbalanced training and noisy training scenarios.
arXiv Detail & Related papers (2024-10-17T16:56:01Z)
Mem-elements based Neuromorphic Hardware for Neural Network Application [0.0]
The thesis investigates the utilization of memristive and memcapacitive crossbar arrays in low-power machine learning accelerators, offering a comprehensive co-design framework for deep neural networks (DNN). The model, implemented through a hybrid Python and PyTorch approach, accounts for various non-idealities, achieving exceptional training accuracies of 90.02% and 91.03% for the CIFAR-10 dataset with memristive and memcapacitive crossbar arrays on an 8-layer VGG network.
arXiv Detail & Related papers (2024-03-05T14:28:40Z)
Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO. We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits. Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
Quantized Neural Networks for Low-Precision Accumulation with Guaranteed Overflow Avoidance [68.8204255655161]
We introduce a quantization-aware training algorithm that guarantees avoiding numerical overflow when reducing the precision of accumulators during inference. We evaluate our algorithm across multiple quantized models that we train for different tasks, showing that our approach can reduce the precision of accumulators while maintaining model accuracy with respect to a floating-point baseline.
arXiv Detail & Related papers (2023-01-31T02:46:57Z)
CorrectNet: Robustness Enhancement of Analog In-Memory Computing for Neural Networks by Error Suppression and Compensation [4.570841222958966]
We propose a framework to enhance the robustness of neural networks under variations and noise. We show that inference accuracy of neural networks can be recovered from as low as 1.69% under variations and noise.
arXiv Detail & Related papers (2022-11-27T19:13:33Z)
Energy Efficient Learning with Low Resolution Stochastic Domain Wall Synapse Based Deep Neural Networks [0.9176056742068814]
We demonstrate that extremely low resolution quantized (nominally 5-state) synapses with large variations in Domain Wall (DW) position can be both energy efficient and achieve reasonably high testing accuracies. We show that by implementing suitable modifications to the learning algorithms, we can address the behavior as well as the effect of their low-resolution to achieve high testing accuracies.
arXiv Detail & Related papers (2021-11-14T09:12:29Z)
Hybrid In-memory Computing Architecture for the Training of Deep Neural Networks [5.050213408539571]
We propose a hybrid in-memory computing architecture for the training of deep neural networks (DNNs) on hardware accelerators. We show that HIC-based training results in about 50% less inference model size to achieve baseline comparable accuracy. Our simulations indicate HIC-based training naturally ensures that the number of write-erase cycles seen by the devices is a small fraction of the endurance limit of PCM.
arXiv Detail & Related papers (2021-02-10T05:26:27Z)
SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to external dynamic random-access memory (DRAM) for storage. We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation. We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training [81.85361544720885]
We propose FracTrain that integrates progressive fractional quantization which gradually increases the precision of activations, weights, and gradients. FracTrain reduces computational cost and hardware-quantified energy/latency of DNN training while achieving a comparable or better (-0.12% to +1.87%) accuracy.
arXiv Detail & Related papers (2020-12-24T05:24:10Z)
Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors. This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models. Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
Convolutional-Recurrent Neural Networks on Low-Power Wearable Platforms for Cardiac Arrhythmia Detection [0.18459705687628122]
We focus on the inference of neural networks running in microcontrollers and low-power processors. We adapted an existing convolutional-recurrent neural network to detect and classify cardiac arrhythmias. We show our implementation in fixed-point precision, using the CMSIS-NN libraries, with a memory footprint of 195.6KB, and a throughput of 33.98MOps/s.
arXiv Detail & Related papers (2020-01-08T10:35:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.