Memory Organization for Energy-Efficient Learning and Inference in
Digital Neuromorphic Accelerators
- URL: http://arxiv.org/abs/2003.11639v1
- Date: Thu, 5 Mar 2020 19:19:09 GMT
- Title: Memory Organization for Energy-Efficient Learning and Inference in
Digital Neuromorphic Accelerators
- Authors: Clemens JS Schaefer, Patrick Faley, Emre O Neftci, Siddharth Joshi
- Abstract summary: Energy efficiency of neuromorphic hardware is greatly affected by the energy of storing, accessing, and updating synaptic parameters.
We introduce functional encoding for structured connectivity, such as that in convolutional layers.
For a 2-layer spiking neural network trained to retain a spatio-temporal pattern, a bitmap-based (PB-BMP) organization can encode the sparser networks more efficiently.
- Score: 0.4030910640265943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The energy efficiency of neuromorphic hardware is greatly affected by the
energy of storing, accessing, and updating synaptic parameters. Various methods
of memory organization targeting energy-efficient digital accelerators have
been investigated in the past; however, they do not completely capture the
energy costs at a system level. To address this shortcoming and to account for
various overheads, we synthesize the controller and memory for different
encoding schemes and extract the energy costs from these synthesized blocks.
Additionally, we introduce functional encoding for structured connectivity such
as that in convolutional layers. Functional encoding offers a 58% reduction
in the energy required to implement a backward pass and weight update in such
layers compared to existing index-based solutions. We show that for a 2-layer
spiking neural network trained to retain a spatio-temporal pattern, a
bitmap-based (PB-BMP) organization can encode the sparser networks more
efficiently. This form of encoding delivers a 1.37x improvement in energy
efficiency at the cost of a 4% degradation in network retention accuracy, as
measured by the van Rossum distance.
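The functional-encoding idea can be made concrete with a short sketch. The Python below is illustrative only (the paper describes synthesized hardware controllers, not software), and a single input channel with unit dilation is assumed for brevity; all names and parameters are hypothetical. The point is that for structured connectivity such as a convolutional layer, the fan-out of a presynaptic spike is computed arithmetically from the layer geometry, so no per-synapse index tables are stored or fetched.

```python
# Minimal sketch (not the paper's hardware) of functional encoding for a
# convolutional layer: the fan-out of a presynaptic spike is *computed*
# from the layer geometry instead of read from per-synapse index tables.
# Single input channel and unit dilation assumed; names are illustrative.

def conv_fanout(y, x, in_h, in_w, k_h, k_w, n_out_ch, stride=1):
    """Yield (out_ch, out_y, out_x, k_y, k_x) for every postsynaptic
    neuron reached by a spike at input position (y, x).

    (out_ch, k_y, k_x) doubles as the weight address, so no pointer
    memory is fetched on either the forward or the backward pass."""
    out_h = (in_h - k_h) // stride + 1
    out_w = (in_w - k_w) // stride + 1
    for k_y in range(k_h):
        for k_x in range(k_w):
            # Output position that sees input (y, x) through tap (k_y, k_x).
            oy, ry = divmod(y - k_y, stride)
            ox, rx = divmod(x - k_x, stride)
            if ry or rx or not (0 <= oy < out_h and 0 <= ox < out_w):
                continue
            for oc in range(n_out_ch):
                yield oc, oy, ox, k_y, k_x

# Example: spike at pixel (3, 4) of an 8x8 map, 3x3 kernel, 2 filters.
targets = list(conv_fanout(3, 4, 8, 8, 3, 3, 2))
```

Because the synapse address is derived rather than looked up, the same arithmetic serves the backward pass and weight update, which is where the abstract reports the 58% energy reduction over index-based solutions.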
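For unstructured sparsity, a bitmap organization keeps one presence bit per potential synapse and packs the surviving weights densely. The abstract does not give the PB-BMP controller details, so the sketch below is a generic bitmap row with hypothetical class and method names; the key operation is a popcount over the bitmap prefix, which recovers a weight's offset in the packed array.

```python
# Generic bitmap-row sketch in the spirit of the PB-BMP organization:
# one presence bit per potential synapse, weights packed densely.

class BitmapRow:
    def __init__(self, n_post, connections):
        """connections: dict of postsynaptic index -> weight."""
        self.bitmap = 0
        for j in connections:
            assert 0 <= j < n_post
            self.bitmap |= 1 << j
        # Weights stored contiguously, ordered by postsynaptic index.
        self.weights = [connections[j] for j in sorted(connections)]

    def weight(self, j):
        """Return the weight onto neuron j, or None if no synapse exists."""
        if not (self.bitmap >> j) & 1:
            return None  # absent synapse: one bit checked, nothing fetched
        # Popcount of the bits below position j = packed-array offset.
        offset = bin(self.bitmap & ((1 << j) - 1)).count("1")
        return self.weights[offset]

row = BitmapRow(8, {1: 0.5, 4: -0.25, 6: 0.125})
assert row.weight(4) == -0.25 and row.weight(2) is None
```

Storage is one bit per potential synapse plus one word per actual synapse, so the scheme pays off as training prunes connections, consistent with the abstract's observation that PB-BMP encodes the sparser networks more efficiently.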
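The retention metric, the van Rossum distance, low-pass filters each spike train with a causal exponential kernel and takes the L2 norm of the difference between the filtered traces. A discretized sketch follows; normalization conventions vary in the literature, and the common 1/tau factor is assumed here.

```python
import numpy as np

def van_rossum_distance(spikes_a, spikes_b, tau=0.02, dt=1e-4, t_max=1.0):
    """Discretized van Rossum distance between two spike trains
    (spike times in seconds): filter each train with a causal
    exponential kernel exp(-t/tau), then take the L2 distance."""
    t = np.arange(0.0, t_max, dt)

    def trace(spikes):
        f = np.zeros_like(t)
        for s in spikes:
            m = t >= s
            f[m] += np.exp(-(t[m] - s) / tau)
        return f

    diff = trace(spikes_a) - trace(spikes_b)
    return np.sqrt(np.sum(diff ** 2) * dt / tau)

# Identical trains -> 0; a single unmatched spike -> sqrt(1/2) ~ 0.707.
print(van_rossum_distance([0.1, 0.3], [0.1, 0.3]))  # ~0.0
print(van_rossum_distance([0.1], []))               # ~0.707
```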
Related papers
- Over-the-Air Multi-Sensor Inference with Neural Networks Using Memristor-Based Analog Computing [13.5346836945515]
This study proposes a multi-sensor wireless inference system with memristor-based analog computing.
Given the sensors' limited computational capabilities, the features from the network's front end are transmitted to a central device.
We also introduce a trainable over-the-air sensor fusion method based on an $L_p$-norm-inspired combining function.
arXiv Detail & Related papers (2025-01-17T15:14:58Z) - COMPASS: A Compiler Framework for Resource-Constrained Crossbar-Array Based In-Memory Deep Learning Accelerators [6.172271429579593]
We propose a compiler framework for resource-constrained crossbar-based processing-in-memory (PIM) deep neural network (DNN) accelerators.
We propose an algorithm to determine the optimal partitioning that divides the layers so that each partition can be accelerated on chip.
arXiv Detail & Related papers (2025-01-12T11:31:25Z) - A Fully Hardware Implemented Accelerator Design in ReRAM Analog Computing without ADCs [5.6496088684920345]
ReRAM-based accelerators process neural networks via analog Computing-in-Memory (CiM) for ultra-high energy efficiency.
This work explores the hardware implementation of the Sigmoid and SoftMax activation functions of neural networks with binarized neurons.
We propose a complete ReRAM-based Analog Computing Accelerator (RACA) that accelerates neural network computation by leveraging binarized neurons.
arXiv Detail & Related papers (2024-12-27T09:38:19Z) - SpiDR: A Reconfigurable Digital Compute-in-Memory Spiking Neural Network Accelerator for Event-based Perception [8.968583287058959]
Spiking Neural Networks (SNNs) offer an efficient method for processing the asynchronous temporal data generated by Dynamic Vision Sensors (DVS).
Existing SNN accelerators suffer from limitations in adaptability to diverse neuron models, bit precisions and network sizes.
We propose SpiDR, a scalable and reconfigurable digital compute-in-memory (CIM) SNN accelerator with a set of key features.
arXiv Detail & Related papers (2024-11-05T06:59:02Z) - Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by a factor of at least 224 on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z) - Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z) - Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, are still deficient in terms of speed and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z) - Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability.
One promising solution is to revisit analogue computing, a technique that predates digital computing.
Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z) - Deep Reinforcement Learning Based Multidimensional Resource Management
for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
Combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency.
In this paper, we study the spectrum, energy, and time resource management for EH-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z) - SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and
Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to reliance on external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z) - Improving Memory Utilization in Convolutional Neural Network
Accelerators [16.340620299847384]
We propose a mapping method that allows activation layers to overlap and thus utilize the memory more efficiently.
Experiments with various real-world object detector networks show that the proposed mapping technique can decrease the activations memory by up to 32.9%.
For higher resolution de-noising networks, we achieve activation memory savings of 48.8%.
arXiv Detail & Related papers (2020-07-20T09:34:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.