Energy Efficient Learning with Low Resolution Stochastic Domain Wall
Synapse Based Deep Neural Networks
- URL: http://arxiv.org/abs/2111.07284v1
- Date: Sun, 14 Nov 2021 09:12:29 GMT
- Title: Energy Efficient Learning with Low Resolution Stochastic Domain Wall
Synapse Based Deep Neural Networks
- Authors: Walid A. Misba, Mark Lozano, Damien Querlioz, Jayasimha Atulasimha
- Abstract summary: We demonstrate that extremely low resolution quantized (nominally 5-state) synapses with large variations in Domain Wall (DW) position can be both energy efficient and achieve reasonably high testing accuracies.
We show that by implementing suitable modifications to the learning algorithms, we can address the stochastic behavior as well as mitigate the effect of their low resolution to achieve high testing accuracies.
- Score: 0.9176056742068814
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We demonstrate that extremely low resolution quantized (nominally 5-state)
synapses with large stochastic variations in Domain Wall (DW) position can be
both energy efficient and achieve reasonably high testing accuracies compared to Deep Neural Networks (DNNs) of similar sizes that use floating-point precision synaptic weights. Specifically, voltage-controlled DW devices exhibit stochastic behavior, as modeled rigorously with micromagnetic simulations, and can encode only a limited number of states; however, they can be extremely energy efficient
during both training and inference. We show that by implementing suitable modifications to the learning algorithms, we can address the stochastic behavior as well as mitigate the effect of their low resolution to achieve high testing accuracies. In this study, we propose both in-situ and ex-situ training algorithms, based on modifications of the algorithm proposed by Hubara et al. [1], which works well with quantized synaptic weights. We train several 5-layer DNNs on the MNIST dataset using 2-, 3-, and 5-state DW devices as synapses.
For in-situ training, a separate high-precision memory unit is adopted to preserve and accumulate the weight gradients, which are then quantized to program the low-precision DW devices. Moreover, a sizeable noise tolerance margin is used during training to address the intrinsic programming noise. For ex-situ training, a precursor DNN is first trained based on the characterized DW device model and a noise tolerance margin similar to that used for in-situ training. Remarkably, the energy dissipated in programming the devices during in-situ training amounts to only 13 pJ per inference, given that training is performed over the entire MNIST dataset for 10 epochs.
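The in-situ scheme described above suggests a simple per-synapse update rule: gradients are accumulated in a high-precision shadow weight, the shadow weight is quantized to the nearest device level, and the DW device is reprogrammed only when the quantized target moves by more than a noise-tolerance margin. The Python/NumPy snippet below is a minimal sketch of one plausible reading of that rule, not the authors' exact algorithm; the level spacing, margin, programming-noise magnitude, and the specific reprogramming criterion are illustrative assumptions.

```python
import numpy as np

# Hypothetical constants; the paper's actual device levels, margin and noise
# statistics come from micromagnetic simulations and are not reproduced here.
N_STATES = 5                                   # nominally 5-state DW synapse
LEVELS = np.linspace(-1.0, 1.0, N_STATES)      # assumed device weight levels
STEP = LEVELS[1] - LEVELS[0]
MARGIN = 0.2 * STEP                            # noise-tolerance margin
PROG_NOISE = 0.05 * STEP                       # std of stochastic DW-position noise

def quantize(w):
    """Snap high-precision weights to the nearest device level."""
    idx = np.abs(w[..., None] - LEVELS).argmin(axis=-1)
    return LEVELS[idx]

def in_situ_update(w_hp, w_dev, grad, lr):
    """One update: accumulate the gradient in high precision, then reprogram
    the low-resolution device only if the quantized target has moved by more
    than the noise-tolerance margin."""
    w_hp = np.clip(w_hp - lr * grad, LEVELS[0], LEVELS[-1])   # gradient accumulation
    target = quantize(w_hp)
    reprogram = np.abs(target - w_dev) > MARGIN               # skip tiny changes
    noise = PROG_NOISE * np.random.randn(*w_dev.shape)        # programming noise
    w_dev = np.where(reprogram, target + noise, w_dev)        # stochastic write
    return w_hp, w_dev

# Example: one update step on a layer of four synapses.
rng = np.random.default_rng(0)
w_hp = rng.uniform(-1.0, 1.0, size=4)
w_dev = quantize(w_hp)
w_hp, w_dev = in_situ_update(w_hp, w_dev, grad=rng.normal(size=4), lr=0.1)
```

Keeping the accumulator in high precision lets gradients far smaller than the level spacing still drive learning over many steps, while the margin keeps programming noise from triggering spurious writes; the ex-situ variant described above instead trains a precursor network in software against the characterized device model.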
Related papers
- Quantized Non-Volatile Nanomagnetic Synapse based Autoencoder for
Efficient Unsupervised Network Anomaly Detection [0.07892577704654172]
We show that implementing the autoencoder in edge devices capable of learning in real time is challenging due to limited hardware, energy, and computational resources.
We propose a ferromagnetic racetrack with engineered notches hosting a magnetic domain wall (DW) as the autoencoder synapses.
Our DW-based approach demonstrates a remarkable reduction of at least three orders of magnitude in weight updates during training compared to the floating-point approach.
arXiv Detail & Related papers (2023-09-12T02:29:09Z)
- Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN). CMMN consists in filtering the signals so as to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on the training data (a minimal sketch of this idea follows after this list).
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent of the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
- Stochastic Domain Wall-Magnetic Tunnel Junction Artificial Neurons for
Noise-Resilient Spiking Neural Networks [0.0]
We present a scaled DW-MTJ neuron with voltage-dependent firing probability.
Validation accuracy during training was also shown to be comparable to that of an ideal integrate-and-fire device.
This work shows that DW-MTJ devices can be used to construct noise-resilient networks suitable for neuromorphic computing on the edge.
arXiv Detail & Related papers (2023-04-10T18:00:26Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process (see the sketch after this list).
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural
Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE), which extends a recently proposed implicit-differentiation training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Mixed Precision Low-bit Quantization of Neural Network Language Models
for Speech Recognition [67.95996816744251]
State-of-the-art language models (LMs) represented by long-short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming increasingly complex and expensive for practical applications.
Current quantization methods are based on uniform precision and fail to account for the varying sensitivity of different parts of LMs to quantization errors.
Novel mixed precision neural network LM quantization methods are proposed in this paper.
arXiv Detail & Related papers (2021-11-29T12:24:02Z)
- Low-Precision Training in Logarithmic Number System using Multiplicative
Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
- Incorporating NODE with Pre-trained Neural Differential Operator for
Learning Dynamics [73.77459272878025]
We propose to enhance the supervised signal in learning dynamics by pre-training a neural differential operator (NDO).
The NDO is pre-trained on a class of symbolic functions and learns the mapping from trajectory samples of these functions to their derivatives.
We provide a theoretical guarantee that the output of the NDO can approximate the ground-truth derivatives well by properly tuning the complexity of the library.
arXiv Detail & Related papers (2021-06-08T08:04:47Z)
- QUANOS- Adversarial Noise Sensitivity Driven Hybrid Quantization of
Neural Networks [3.2242513084255036]
QUANOS is a framework that performs layer-specific hybrid quantization based on Adversarial Noise Sensitivity (ANS).
Our experiments on the CIFAR10 and CIFAR100 datasets show that QUANOS outperforms a homogeneously quantized 8-bit precision baseline in terms of adversarial robustness.
arXiv Detail & Related papers (2020-04-22T15:56:31Z)
- Training of Quantized Deep Neural Networks using a Magnetic Tunnel
Junction-Based Synapse [23.08163992580639]
Quantized neural networks (QNNs) are being actively researched as a solution for the computational complexity and memory intensity of deep neural networks.
We show how magnetic tunnel junction (MTJ) devices can be used to support QNN training.
We introduce a novel synapse circuit that uses the MTJ behavior to support the quantized update.
arXiv Detail & Related papers (2019-12-29T11:36:32Z)
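As a companion to the Convolutional Monge Mapping Normalization (CMMN) entry above, here is a minimal sketch of the idea for stationary, roughly Gaussian signals: estimate each training signal's PSD, form the Wasserstein barycenter of the PSDs, and filter every signal so that its PSD is adapted to that barycenter. The sampling rate, Welch segment length, and FIR filter length below are illustrative choices, and the barycenter formula is the standard one for centered Gaussian processes rather than a claim about the paper's exact estimator.

```python
import numpy as np
from scipy.signal import welch, firwin2, lfilter

FS = 100.0       # sampling rate in Hz (illustrative)
NPERSEG = 256    # Welch segment length; also fixes the frequency grid
NUMTAPS = 129    # FIR mapping-filter length (odd taps: no constraint at Nyquist)

def barycenter_psd(signals):
    """Wasserstein-2 barycenter of PSDs of stationary Gaussian signals:
    the square of the mean of the square-root PSDs."""
    sqrt_psds = [np.sqrt(welch(x, fs=FS, nperseg=NPERSEG)[1]) for x in signals]
    return np.mean(sqrt_psds, axis=0) ** 2

def cmmn_map(x, bary):
    """Filter one signal so that its PSD is adapted to the barycenter."""
    f, p = welch(x, fs=FS, nperseg=NPERSEG)
    gain = np.sqrt(bary / np.maximum(p, 1e-12))     # target magnitude response
    h = firwin2(NUMTAPS, f / (FS / 2.0), gain)      # frequencies normalized to Nyquist
    return lfilter(h, [1.0], x)

# train_signals: list of 1-D arrays, e.g. single-channel sleep-EEG recordings
# bary = barycenter_psd(train_signals)
# adapted = [cmmn_map(x, bary) for x in train_signals]
```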
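Similarly, for the implicit stochastic gradient descent (ISGD) entry above: an implicit step defines the update through an equation in the new iterate, w_new = w - lr * grad(w_new), instead of evaluating the gradient at the current iterate. The sketch below solves that equation with a simple fixed-point iteration; it illustrates the general idea only, not the paper's specific solver or its PINN loss.

```python
import numpy as np

def implicit_sgd_step(w, grad_fn, lr, n_iter=10):
    """One implicit (backward-Euler) SGD step: solve w_new = w - lr * grad_fn(w_new)
    by fixed-point iteration, initialized with the ordinary explicit step."""
    w_new = w - lr * grad_fn(w)              # explicit step as initial guess
    for _ in range(n_iter):
        w_new = w - lr * grad_fn(w_new)      # refine toward the implicit solution
    return w_new

# Toy example on the quadratic loss L(w) = 0.5 * ||A w - b||^2.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def grad(w):
    return A.T @ (A @ w - b)

w = np.zeros(2)
for _ in range(200):
    w = implicit_sgd_step(w, grad, lr=0.05)
print(w)   # converges to the minimizer [0.6, -0.8]
```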