Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking
Neural Networks
- URL: http://arxiv.org/abs/2203.12437v1
- Date: Wed, 23 Mar 2022 14:18:58 GMT
- Title: Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking
Neural Networks
- Authors: Jan Sommer, M. Akif Özkan, Oliver Keszocze, Jürgen Teich
- Abstract summary: Spiking Neural Networks (SNNs) compute in an event-based manner to achieve more efficient computation than standard Neural Networks.
We propose a novel architecture that is optimized for the processing of Convolutional SNNs that feature a high degree of activation sparsity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking Neural Networks (SNNs) compute in an event-based manner to
achieve more efficient computation than standard Neural Networks. In SNNs,
neuronal outputs (i.e. activations) are not encoded with real-valued
activations but with sequences of binary spikes. The motivation for using SNNs
over conventional neural networks is rooted in the special computational
aspects of SNNs, especially the very high degree of sparsity of neural output
activations. Well-established architectures for conventional Convolutional
Neural Networks (CNNs) feature large spatial arrays of Processing Elements
(PEs) that remain highly underutilized in the face of activation sparsity. We
propose a novel architecture that is optimized for the processing of
Convolutional SNNs (CSNNs) that feature a high degree of activation sparsity.
The main strategy of our architecture is to use fewer, but highly utilized,
PEs. The PE array used to perform the convolution is only as large as the
kernel, allowing all PEs to be active as long as there are spikes to process.
This constant flow of spikes is ensured by compressing the feature maps (i.e.
the activations) into queues that can then be processed spike by spike. This
compression is performed at run-time using dedicated circuitry, leading to
self-timed scheduling and allowing the processing time to scale directly with
the number of spikes. A novel memory organization scheme called memory
interlacing is used to efficiently store and retrieve the membrane potentials
of the individual neurons using multiple small parallel on-chip RAMs. Each RAM
is hardwired to its PE, reducing switching circuitry and allowing the RAMs to
be located in close proximity to their respective PEs. We implemented the
proposed architecture on an FPGA and achieved a significant speedup over other
implementations while requiring fewer hardware resources and consuming less
energy.
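
To make the dataflow described in the abstract more concrete, the following is a minimal behavioral sketch in Python, not the authors' RTL: a sparse binary feature map is compressed into a queue of spike coordinates, a PE array of only kernel size (K x K) integrates one spike at a time, and the membrane potentials are interlaced across K x K small banks so that the accesses triggered by a single spike never collide. The sizes, the integrate-and-fire update, and the exact bank-assignment rule below are illustrative assumptions.

```python
import numpy as np

K = 3                       # kernel size -> the PE array has only K*K elements
H, W = 9, 9                 # output feature-map size (chosen divisible by K)
THRESHOLD = 1.0             # firing threshold of the integrate-and-fire neurons

rng = np.random.default_rng(0)
weights = rng.standard_normal((K, K))          # one convolution kernel (assumed)

def compress_to_spike_queue(binary_fmap):
    """Compress a binary feature map into a queue of spike coordinates so the
    PEs only ever see active entries, never the zeros of a sparse map."""
    ys, xs = np.nonzero(binary_fmap)
    return list(zip(ys.tolist(), xs.tolist()))

# "Memory interlacing" (as modeled here): K*K small membrane-potential banks
# instead of one large buffer. Neuron (oy, ox) lives in bank (oy % K, ox % K)
# at address (oy // K, ox // K). The paper's exact mapping may differ.
banks = {(by, bx): np.zeros((H // K, W // K))
         for by in range(K) for bx in range(K)}

def process_spike(y, x, out_spikes):
    """One input spike touches at most K*K output neurons. Because the residues
    (y - dy) % K are distinct for dy = 0..K-1 (and likewise for columns), those
    K*K neurons fall into K*K *different* banks, so in hardware each PE can
    access its own RAM in the same cycle without conflicts."""
    for dy in range(K):                 # each (dy, dx) corresponds to one PE
        for dx in range(K):
            oy, ox = y - dy, x - dx     # output neuron affected by this tap
            if 0 <= oy < H and 0 <= ox < W:
                bank = banks[(oy % K, ox % K)]
                addr = (oy // K, ox // K)
                bank[addr] += weights[dy, dx]       # integrate
                if bank[addr] >= THRESHOLD:         # fire and reset
                    bank[addr] = 0.0
                    out_spikes.append((oy, ox))

# Drive the pipeline with a sparse random input: the loop below runs once per
# spike, so processing time scales with spike count, not feature-map area.
input_fmap = (rng.random((H + K - 1, W + K - 1)) < 0.05).astype(np.uint8)
queue = compress_to_spike_queue(input_fmap)
out_spikes = []
for (y, x) in queue:
    process_spike(y, x, out_spikes)

print(f"{len(queue)} input spikes -> {len(out_spikes)} output spikes")
```

The loop over the queue runs once per spike, so the simulated processing time scales with activation sparsity rather than with feature-map area, mirroring the self-timed scheduling described in the abstract.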
Related papers
- Fully Spiking Actor Network with Intra-layer Connections for
Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies for control.
Most existing spike-based RL methods take the firing rate as the output of SNNs and convert it to represent the continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z) - Are SNNs Truly Energy-efficient? - A Hardware Perspective [7.539212567508529]
Spiking Neural Networks (SNNs) have gained attention for their energy-efficient machine learning capabilities.
This work studies two hardware benchmarking platforms for large-scale SNN inference, namely SATA and SpikeSim.
arXiv Detail & Related papers (2023-09-06T22:23:22Z) - FireFly: A High-Throughput Hardware Accelerator for Spiking Neural
Networks with Efficient DSP and Memory Optimization [6.966706170499345]
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency.
Most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements.
We propose FireFly, an FPGA accelerator that can process spikes generated by the firing neurons on the fly.
arXiv Detail & Related papers (2023-01-05T04:28:07Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - A Resource-efficient Spiking Neural Network Accelerator Supporting
Emerging Neural Encoding [6.047137174639418]
Spiking neural networks (SNNs) recently gained momentum due to their low-power multiplication-free computing.
SNNs require very long spike trains (up to 1000) to reach accuracy similar to that of their artificial neural network (ANN) counterparts on large models.
We present a novel hardware architecture that can efficiently support SNN with emerging neural encoding.
arXiv Detail & Related papers (2022-06-06T10:56:25Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - Training Energy-Efficient Deep Spiking Neural Networks with Single-Spike
Hybrid Input Encoding [5.725845886457027]
Spiking Neural Networks (SNNs) provide higher computational efficiency in event-driven neuromorphic hardware.
SNNs suffer from high inference latency, resulting from inefficient input encoding and training techniques.
This paper presents a training framework for low-latency energy-efficient SNNs.
arXiv Detail & Related papers (2021-07-26T06:16:40Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks; a generic sketch of this kind of decomposition is given after this list.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - ActNN: Reducing Training Memory Footprint via 2-Bit Activation
Compressed Training [68.63354877166756]
ActNN is a memory-efficient training framework that stores randomly quantized activations for backpropagation.
ActNN reduces the memory footprint of activations by 12x and enables training with a 6.6x to 14x larger batch size.
arXiv Detail & Related papers (2021-04-29T05:50:54Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement
Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z) - A Spike in Performance: Training Hybrid-Spiking Neural Networks with
Quantized Activation Functions [6.574517227976925]
Spiking Neural Networks (SNNs) are a promising approach to energy-efficient computing.
We show how to maintain state-of-the-art accuracy when converting a non-spiking network into an SNN.
arXiv Detail & Related papers (2020-02-10T05:24:27Z)
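
As referenced in the "Quantized Neural Networks via {-1, +1} Encoding Decomposition" entry above, a quantized weight matrix can be split into multiple binary {-1, +1} branches. The sketch below shows one generic bit-plane construction of such a decomposition; the paper's exact encoding may differ, and the bit-width M, shapes, and variable names are assumptions for illustration. It uses the identity q = (sum_i 2^i * s_i + 2^M - 1) / 2 with s_i in {-1, +1}, so one quantized matmul becomes M binary matmuls plus a cheap correction term.

```python
import numpy as np

M = 4                                           # assumed bit-width of the weights
rng = np.random.default_rng(0)
Wq = rng.integers(0, 2**M, size=(8, 16))        # quantized weights in [0, 2^M - 1]
x = rng.standard_normal((4, 8))                 # a batch of input activations

bits = [(Wq >> i) & 1 for i in range(M)]        # bit-planes b_i in {0, 1}
branches = [2 * b - 1 for b in bits]            # s_i = 2*b_i - 1, entries in {-1, +1}

# Since Wq = (sum_i 2^i * s_i + (2^M - 1)) / 2, a matmul with Wq splits into
# M binary {-1,+1} matmuls plus one correction term built from row sums of x.
binary_part = sum((2**i) * (x @ s) for i, s in enumerate(branches))
y_decomposed = (binary_part + (2**M - 1) * x.sum(axis=1, keepdims=True)) / 2

assert np.allclose(y_decomposed, x @ Wq)        # matches the direct quantized matmul
print("decomposition matches:", np.allclose(y_decomposed, x @ Wq))
```

The practical appeal of such a split is that each branch is a pure {-1, +1} matrix product, which maps to additions and subtractions (or XNOR/popcount logic) rather than full-width multiplications.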
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.