Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking
Neural Networks
- URL: http://arxiv.org/abs/2203.12437v1
- Date: Wed, 23 Mar 2022 14:18:58 GMT
- Title: Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking
Neural Networks
- Authors: Jan Sommer, M. Akif Özkan, Oliver Keszocze, Jürgen Teich
- Abstract summary: Spiking Neural Networks (SNNs) compute in an event-based manner to achieve more efficient computation than standard Neural Networks.
We propose a novel architecture that is optimized for the processing of Convolutional SNNs that feature a high degree of activation sparsity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking Neural Networks (SNNs) compute in an event-based manner to
achieve more efficient computation than standard Neural Networks. In SNNs,
neuronal outputs (i.e. activations) are not encoded with real-valued
activations but with sequences of binary spikes. The motivation for using SNNs
over conventional neural networks is rooted in the special computational
aspects of SNNs, especially the very high degree of sparsity of neural output
activations. Well-established architectures for conventional Convolutional
Neural Networks (CNNs) feature large spatial arrays of Processing Elements
(PEs) that remain highly underutilized in the face of activation sparsity. We
propose a novel architecture that is optimized for the processing of
Convolutional SNNs (CSNNs) that feature a high degree of activation sparsity.
The main strategy of our architecture is to use fewer, but highly utilized,
PEs. The PE array used to perform the convolution is only as large as the
kernel, allowing all PEs to be active as long as there are spikes to process.
This constant flow of spikes is ensured by compressing the feature maps (i.e.
the activations) into queues that can then be processed spike by spike. This
compression is performed at run-time using dedicated circuitry, leading to
self-timed scheduling and allowing the processing time to scale directly with
the number of spikes. A novel memory organization scheme called memory
interlacing is used to efficiently store and retrieve the membrane potentials
of the individual neurons using multiple small parallel on-chip RAMs. Each RAM
is hardwired to its PE, reducing switching circuitry and allowing the RAMs to
be located in close proximity to their respective PEs. We implemented the
proposed architecture on an FPGA and achieved a significant speedup over other
implementations while requiring fewer hardware resources and consuming less
energy.
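
To make the dataflow described in the abstract more concrete, the following is a minimal behavioral sketch in Python, not the authors' RTL: a sparse binary feature map is compressed into a queue of spike coordinates, a PE array of only kernel size (K x K) integrates one spike at a time, and the membrane potentials are interlaced across K x K small banks so that the accesses triggered by a single spike never collide. The sizes, the integrate-and-fire update, and the exact bank-assignment rule below are illustrative assumptions.

```python
import numpy as np

K = 3                       # kernel size -> the PE array has only K*K elements
H, W = 9, 9                 # output feature-map size (chosen divisible by K)
THRESHOLD = 1.0             # firing threshold of the integrate-and-fire neurons

rng = np.random.default_rng(0)
weights = rng.standard_normal((K, K))          # one convolution kernel (assumed)

def compress_to_spike_queue(binary_fmap):
    """Compress a binary feature map into a queue of spike coordinates so the
    PEs only ever see active entries, never the zeros of a sparse map."""
    ys, xs = np.nonzero(binary_fmap)
    return list(zip(ys.tolist(), xs.tolist()))

# "Memory interlacing" (as modeled here): K*K small membrane-potential banks
# instead of one large buffer. Neuron (oy, ox) lives in bank (oy % K, ox % K)
# at address (oy // K, ox // K). The paper's exact mapping may differ.
banks = {(by, bx): np.zeros((H // K, W // K))
         for by in range(K) for bx in range(K)}

def process_spike(y, x, out_spikes):
    """One input spike touches at most K*K output neurons. Because the residues
    (y - dy) % K are distinct for dy = 0..K-1 (and likewise for columns), those
    K*K neurons fall into K*K *different* banks, so in hardware each PE can
    access its own RAM in the same cycle without conflicts."""
    for dy in range(K):                 # each (dy, dx) corresponds to one PE
        for dx in range(K):
            oy, ox = y - dy, x - dx     # output neuron affected by this tap
            if 0 <= oy < H and 0 <= ox < W:
                bank = banks[(oy % K, ox % K)]
                addr = (oy // K, ox // K)
                bank[addr] += weights[dy, dx]       # integrate
                if bank[addr] >= THRESHOLD:         # fire and reset
                    bank[addr] = 0.0
                    out_spikes.append((oy, ox))

# Drive the pipeline with a sparse random input: the loop below runs once per
# spike, so processing time scales with spike count, not feature-map area.
input_fmap = (rng.random((H + K - 1, W + K - 1)) < 0.05).astype(np.uint8)
queue = compress_to_spike_queue(input_fmap)
out_spikes = []
for (y, x) in queue:
    process_spike(y, x, out_spikes)

print(f"{len(queue)} input spikes -> {len(out_spikes)} output spikes")
```

The loop over the queue runs once per spike, so the simulated processing time scales with activation sparsity rather than with feature-map area, mirroring the self-timed scheduling described in the abstract.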
Related papers
- Fully Spiking Actor Network with Intra-layer Connections for
Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies for control.
Most existing spike-based RL methods take the firing rate as the output of SNNs and convert it to represent the continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z) - Are SNNs Truly Energy-efficient? - A Hardware Perspective [7.539212567508529]
Spiking Neural Networks (SNNs) have gained attention for their energy-efficient machine learning capabilities.
This work studies two hardware benchmarking platforms for large-scale SNN inference, namely SATA and SpikeSim.
arXiv Detail & Related papers (2023-09-06T22:23:22Z) - FireFly: A High-Throughput Hardware Accelerator for Spiking Neural
Networks with Efficient DSP and Memory Optimization [6.966706170499345]
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency.
Most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements.
We propose FireFly, an FPGA accelerator that can process spikes generated by the firing neurons on the fly.
arXiv Detail & Related papers (2023-01-05T04:28:07Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - A Resource-efficient Spiking Neural Network Accelerator Supporting
Emerging Neural Encoding [6.047137174639418]
Spiking neural networks (SNNs) recently gained momentum due to their low-power multiplication-free computing.
SNNs require very long spike trains (up to 1000) to reach accuracy similar to that of their artificial neural network (ANN) counterparts on large models.
We present a novel hardware architecture that can efficiently support SNN with emerging neural encoding.
arXiv Detail & Related papers (2022-06-06T10:56:25Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - Training Energy-Efficient Deep Spiking Neural Networks with Single-Spike
Hybrid Input Encoding [5.725845886457027]
Spiking Neural Networks (SNNs) provide higher computational efficiency in event-driven neuromorphic hardware.
SNNs suffer from high inference latency, resulting from inefficient input encoding and training techniques.
This paper presents a training framework for low-latency energy-efficient SNNs.
arXiv Detail & Related papers (2021-07-26T06:16:40Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks; a generic sketch of this kind of decomposition is given after this list.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - ActNN: Reducing Training Memory Footprint via 2-Bit Activation
Compressed Training [68.63354877166756]
ActNN is a memory-efficient training framework that stores randomly quantized activations for backpropagation.
ActNN reduces the memory footprint of activations by 12x and enables training with a 6.6x to 14x larger batch size.
arXiv Detail & Related papers (2021-04-29T05:50:54Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement
Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z) - A Spike in Performance: Training Hybrid-Spiking Neural Networks with
Quantized Activation Functions [6.574517227976925]
Spiking Neural Networks (SNNs) are a promising approach to energy-efficient computing.
We show how to maintain state-of-the-art accuracy when converting a non-spiking network into an SNN.
arXiv Detail & Related papers (2020-02-10T05:24:27Z)
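
As referenced in the "Quantized Neural Networks via {-1, +1} Encoding Decomposition" entry above, a quantized weight matrix can be split into multiple binary {-1, +1} branches. The sketch below shows one generic bit-plane construction of such a decomposition; the paper's exact encoding may differ, and the bit-width M, shapes, and variable names are assumptions for illustration. It uses the identity q = (sum_i 2^i * s_i + 2^M - 1) / 2 with s_i in {-1, +1}, so one quantized matmul becomes M binary matmuls plus a cheap correction term.

```python
import numpy as np

M = 4                                           # assumed bit-width of the weights
rng = np.random.default_rng(0)
Wq = rng.integers(0, 2**M, size=(8, 16))        # quantized weights in [0, 2^M - 1]
x = rng.standard_normal((4, 8))                 # a batch of input activations

bits = [(Wq >> i) & 1 for i in range(M)]        # bit-planes b_i in {0, 1}
branches = [2 * b - 1 for b in bits]            # s_i = 2*b_i - 1, entries in {-1, +1}

# Since Wq = (sum_i 2^i * s_i + (2^M - 1)) / 2, a matmul with Wq splits into
# M binary {-1,+1} matmuls plus one correction term built from row sums of x.
binary_part = sum((2**i) * (x @ s) for i, s in enumerate(branches))
y_decomposed = (binary_part + (2**M - 1) * x.sum(axis=1, keepdims=True)) / 2

assert np.allclose(y_decomposed, x @ Wq)        # matches the direct quantized matmul
print("decomposition matches:", np.allclose(y_decomposed, x @ Wq))
```

The practical appeal of such a split is that each branch is a pure {-1, +1} matrix product, which maps to additions and subtractions (or XNOR/popcount logic) rather than full-width multiplications.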
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.