SparkXD: A Framework for Resilient and Energy-Efficient Spiking Neural
Network Inference using Approximate DRAM
- URL: http://arxiv.org/abs/2103.00421v1
- Date: Sun, 28 Feb 2021 08:12:26 GMT
- Title: SparkXD: A Framework for Resilient and Energy-Efficient Spiking Neural
Network Inference using Approximate DRAM
- Authors: Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad
Shafique
- Abstract summary: Spiking Neural Networks (SNNs) have the potential for achieving low energy consumption due to their biologically sparse computation.
Several studies have shown that the off-chip memory (DRAM) accesses are the most energy-consuming operations in SNN processing.
We propose SparkXD, a novel framework that provides a comprehensive conjoint solution for resilient and energy-efficient SNN inference.
- Score: 15.115813664357436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spiking Neural Networks (SNNs) have the potential for achieving low energy
consumption due to their biologically sparse computation. Several studies have
shown that the off-chip memory (DRAM) accesses are the most energy-consuming
operations in SNN processing. However, state-of-the-art SNN systems do not
optimize the DRAM energy-per-access, which hinders them from achieving high
energy efficiency. A key knob for substantially reducing the DRAM
energy-per-access is lowering the DRAM supply voltage, but this may introduce
DRAM errors (i.e., the so-called approximate DRAM). To address this, we propose
SparkXD, a
novel framework that provides a comprehensive conjoint solution for resilient
and energy-efficient SNN inference using low-power DRAMs subjected to
voltage-induced errors. The key mechanisms of SparkXD are: (1) improving the
SNN error tolerance through fault-aware training that considers bit errors from
approximate DRAM, (2) analyzing the error tolerance of the improved SNN model
to find the maximum tolerable bit error rate (BER) that meets the targeted
accuracy constraint, and (3) energy-efficient DRAM data mapping for the
resilient SNN model, which places the weights in appropriate DRAM locations to
minimize the DRAM access energy. Through these mechanisms, SparkXD mitigates
the negative impact of DRAM (approximation) errors, and provides the required
accuracy. The experimental results show that, for a target accuracy within 1%
of the baseline design (i.e., SNN without DRAM errors), SparkXD reduces the
DRAM energy by ca. 40% on average across different network sizes.
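To make mechanisms (1) and (2) more concrete, the following is a minimal Python sketch, not the authors' implementation: it injects approximate-DRAM bit errors into quantized SNN weights and searches for the largest bit error rate (BER) that still meets an accuracy constraint. The 8-bit quantization, the `evaluate` callback, and all names are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code) of SparkXD mechanisms (1) and (2):
# inject approximate-DRAM bit errors into quantized SNN weights, and search for the
# maximum tolerable BER under a target accuracy. 8-bit storage is an assumption.
import numpy as np

def quantize(weights, bits=8):
    """Uniformly quantize float weights to unsigned fixed-point codes."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = max((w_max - w_min) / (2**bits - 1), 1e-12)
    codes = np.round((weights - w_min) / scale).astype(np.uint8)
    return codes, scale, w_min

def dequantize(codes, scale, w_min):
    return codes.astype(np.float32) * scale + w_min

def inject_bit_errors(codes, ber, rng, bits=8):
    """Flip each stored weight bit independently with probability `ber`."""
    noisy = codes.copy()
    for b in range(bits):
        flip = rng.random(codes.shape) < ber                     # Bernoulli(ber) per bit
        noisy ^= np.where(flip, np.uint8(1 << b), np.uint8(0))   # XOR flips bit b
    return noisy

def max_tolerable_ber(evaluate, weights, target_acc, bers, rng):
    """Mechanism (2): the largest BER whose accuracy still meets `target_acc`."""
    codes, scale, w_min = quantize(weights)
    best = 0.0
    for ber in sorted(bers):
        noisy_w = dequantize(inject_bit_errors(codes, ber, rng), scale, w_min)
        if evaluate(noisy_w) >= target_acc:
            best = ber
    return best

# Hypothetical usage: `evaluate` would run SNN inference with the perturbed weights;
# during fault-aware training (mechanism 1), the same injection would be applied to
# the weights used in the forward pass of each training step.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(128, 64)).astype(np.float32)
dummy_eval = lambda nw: 1.0 - float(np.abs(nw - w).mean())       # stand-in "accuracy"
print(max_tolerable_ber(dummy_eval, w, target_acc=0.95, bers=[1e-6, 1e-4, 1e-2], rng=rng))
```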
Related papers
- Enabling Efficient and Scalable DRAM Read Disturbance Mitigation via New Experimental Insights into Modern DRAM Chips [0.0]
Increasing storage density exacerbates DRAM read disturbance, a circuit-level vulnerability exploited by system-level attacks.
Existing defenses are either ineffective or prohibitively expensive.
This dissertation tackles two problems: 1) protecting DRAM-based systems becomes more expensive as technology scaling increases read disturbance vulnerability, and 2) many existing solutions depend on proprietary knowledge of DRAM internals.
arXiv Detail & Related papers (2024-08-27T13:12:03Z)
- PENDRAM: Enabling High-Performance and Energy-Efficient Processing of Deep Neural Networks through a Generalized DRAM Data Mapping Policy [6.85785397160228]
Convolutional Neural Networks (CNNs) have emerged as a state-of-the-art solution for solving machine learning tasks.
CNN accelerators face performance- and energy-efficiency challenges due to high off-chip memory (DRAM) access latency and energy.
We present PENDRAM, a novel design space exploration methodology that enables high-performance and energy-efficient CNN acceleration.
arXiv Detail & Related papers (2024-08-05T12:11:09Z)
- NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes [52.51014498593644]
Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue.
We introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes.
At a 1% bit error rate, NeuralFuse can reduce memory access energy by up to 24% while recovering accuracy by up to 57%.
arXiv Detail & Related papers (2023-06-29T11:38:22Z)
- EnforceSNN: Enabling Resilient and Energy-Efficient Spiking Neural Network Inference considering Approximate DRAMs for Embedded Systems [15.115813664357436]
Spiking Neural Networks (SNNs) have shown capabilities of achieving high accuracy under unsupervised settings and low operational power/energy.
We propose EnforceSNN, a novel design framework that provides a solution for resilient and energy-efficient SNN inference using reduced-voltage DRAM.
arXiv Detail & Related papers (2023-04-08T15:15:11Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We conduct our DIR-Net on real-world resource-limited devices which achieves 11.1 times storage saving and 5.4 times speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
- Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, as well as random bit error training (RandBET) improves robustness against random or adversarial bit errors in quantized DNN weights significantly.
This leads to high energy savings for low-voltage operation as well as low-precision quantization, but also improves security of DNN accelerators.
arXiv Detail & Related papers (2021-04-16T19:11:14Z)
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
- SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation [97.78417228445883]
We present SmartExchange, an algorithm-hardware co-design framework for energy-efficient inference of deep neural networks (DNNs).
We develop a novel algorithm to enforce a specially favorable DNN weight structure, where each layerwise weight matrix can be stored as the product of a small basis matrix and a large sparse coefficient matrix whose non-zero elements are all power-of-2.
We further design a dedicated accelerator to fully utilize the SmartExchange-enforced weights to improve both energy efficiency and latency performance.
arXiv Detail & Related papers (2020-05-07T12:12:49Z)
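As a rough illustration of the weight structure described in the SmartExchange entry above, the toy NumPy sketch below (an assumption, not the paper's implementation) builds a layer weight matrix from a small dense basis and a sparse coefficient matrix whose non-zeros are signed powers of 2, and rebuilds it losslessly from that compact storage; the shapes, sparsity level, and encoding are made up for illustration.

```python
# Toy sketch of a SmartExchange-style weight structure: W = B @ C, where B is a
# small dense basis and C is sparse with power-of-2 non-zeros. Shapes, sparsity,
# and the (row, col, sign, exponent) encoding are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

B = rng.normal(size=(64, 8))                       # small, dense basis matrix
signs = rng.choice([-1.0, 1.0], size=(8, 64))
exps = rng.integers(-6, 0, size=(8, 64))           # exponents in [-6, -1]
mask = rng.random((8, 64)) >= 0.7                  # keep ~30% of entries
C = np.where(mask, signs * 2.0**exps, 0.0)         # sparse, power-of-2 coefficients
W = B @ C                                          # "original" layer weights

# Storage: B is kept densely; each non-zero of C is just (row, col, sign, exponent),
# since its value is +/- 2^e. Multiplying by 2^e is a bit shift in fixed-point
# hardware, so rebuilding W on-chip trades costly DRAM traffic for cheap compute.
r, c = np.nonzero(C)
coded = list(zip(r, c, np.sign(C[r, c]), np.log2(np.abs(C[r, c])).astype(int)))

C_rebuilt = np.zeros_like(C)
for row, col, sign, exp in coded:
    C_rebuilt[row, col] = sign * 2.0**exp
W_rebuilt = B @ C_rebuilt

print("exact rebuild:", np.allclose(W, W_rebuilt))  # True: nothing was lost
print("stored non-zeros in C:", len(coded), "of", C.size)
```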
- DRMap: A Generic DRAM Data Mapping Policy for Energy-Efficient Processing of Convolutional Neural Networks [15.115813664357436]
We study the latency and energy of different mapping policies on different DRAM architectures.
The results show that energy-efficient DRAM accesses can be achieved by a mapping policy that prioritizes, in order, maximizing row buffer hits, bank-level parallelism, and subarray-level parallelism.
arXiv Detail & Related papers (2020-04-21T23:26:23Z)
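To give a flavor of the ordering the DRMap entry above argues for, the hypothetical sketch below maps consecutive data chunks so that the columns of an open row are filled first (row buffer hits), then banks and subarrays are interleaved (bank- and subarray-level parallelism), and a new row is opened only after that; the toy DRAM geometry and exact coordinate order are assumptions, not the paper's policy.

```python
# Hypothetical sketch of a DRMap-like mapping: column index varies fastest
# (maximizing row buffer hits), then bank and subarray (exposing parallelism),
# and the row index changes last. The DRAM geometry below is a toy assumption.
COLS_PER_ROW = 8
BANKS = 4
SUBARRAYS = 2
ROWS = 1024

def map_chunk(i):
    """Map the i-th consecutive data chunk to (row, subarray, bank, column)."""
    col = i % COLS_PER_ROW
    i //= COLS_PER_ROW
    bank = i % BANKS
    i //= BANKS
    sub = i % SUBARRAYS
    i //= SUBARRAYS
    row = i % ROWS
    return row, sub, bank, col

# The first COLS_PER_ROW chunks land in the same open row of one bank (all row
# buffer hits); the following chunks spread over other banks and subarrays
# before any already-open row has to be closed and a new one activated.
for i in range(2 * COLS_PER_ROW):
    print(i, map_chunk(i))
```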
- Data-Driven Neuromorphic DRAM-based CNN and RNN Accelerators [13.47462920292399]
The energy consumed by running large deep neural networks (DNNs) on hardware accelerators is dominated by the need for lots of fast memory to store both states and weights.
Although DRAM is a high-throughput, low-cost memory (costing 20X less than SRAM), its long random access latency is bad for the unpredictable access patterns in spiking neural networks (SNNs).
This paper reports on our developments over the last 5 years of convolutional and recurrent deep neural network hardware accelerators that exploit either spatial or temporal sparsity similar to SNNs but achieve SOA throughput, power efficiency and latency even with the use of DRAM.
arXiv Detail & Related papers (2020-03-29T11:45:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.