FireFly v2: Advancing Hardware Support for High-Performance Spiking
Neural Network with a Spatiotemporal FPGA Accelerator
- URL: http://arxiv.org/abs/2309.16158v1
- Date: Thu, 28 Sep 2023 04:17:02 GMT
- Title: FireFly v2: Advancing Hardware Support for High-Performance Spiking
Neural Network with a Spatiotemporal FPGA Accelerator
- Authors: Jindong Li, Guobin Shen, Dongcheng Zhao, Qian Zhang, Yi Zeng
- Abstract summary: Spiking Neural Networks (SNNs) are expected to be a promising alternative to Artificial Neural Networks (ANNs)
Specialized SNN hardware offers clear advantages over general-purpose devices in terms of power and performance.
FireFly v2, an FPGA SNN accelerator, can address the issue of non-spike operation in current SOTA SNN algorithms.
- Score: 8.0611988136866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spiking Neural Networks (SNNs) are expected to be a promising alternative to
Artificial Neural Networks (ANNs) due to their strong biological
interpretability and high energy efficiency. Specialized SNN hardware offers
clear advantages over general-purpose devices in terms of power and
performance. However, there's still room to advance hardware support for
state-of-the-art (SOTA) SNN algorithms and improve computation and memory
efficiency. As a further step in supporting high-performance SNNs on
specialized hardware, we introduce FireFly v2, an FPGA SNN accelerator that can
address the issue of non-spike operation in current SOTA SNN algorithms, which
presents an obstacle in the end-to-end deployment onto existing SNN hardware.
To more effectively align with the SNN characteristics, we design a
spatiotemporal dataflow that allows four dimensions of parallelism and
eliminates the need for membrane potential storage, enabling on-the-fly spike
processing and spike generation. To further improve hardware acceleration
performance, we develop a high-performance spike computing engine as a backend
based on a systolic array operating at 500-600MHz. To the best of our
knowledge, FireFly v2 achieves the highest clock frequency among all FPGA-based
implementations. Furthermore, it stands as the first SNN accelerator capable of
supporting non-spike operations, which are commonly used in advanced SNN
algorithms. FireFly v2 has doubled the throughput and DSP efficiency when
compared to our previous version of FireFly and it exhibits 1.33 times the DSP
efficiency and 1.42 times the power efficiency compared to the current most
advanced FPGA accelerators.
Related papers
- Hardware-Software Co-optimised Fast and Accurate Deep Reconfigurable Spiking Inference Accelerator Architecture Design Methodology [2.968768532937366]
Spiking Neural Networks (SNNs) have emerged as a promising approach to improve the energy efficiency of machine learning models.
We develop a hardware-software co-optimisation strategy to port software-trained deep neural networks (DNN) to reduced-precision spiking models.
arXiv Detail & Related papers (2024-10-07T05:04:13Z) - Spiker+: a framework for the generation of efficient Spiking Neural
Networks FPGA accelerators for inference at the edge [49.42371633618761]
Spiker+ is a framework for generating efficient, low-power, and low-area customized Spiking Neural Networks (SNN) accelerators on FPGA for inference at the edge.
Spiker+ is tested on two benchmark datasets, the MNIST and the Spiking Heidelberg Digits (SHD)
arXiv Detail & Related papers (2024-01-02T10:42:42Z) - DeepFire2: A Convolutional Spiking Neural Network Accelerator on FPGAs [8.275598040331227]
Brain-inspired spiking neural networks (SNNs) replace the multiply-accumulate operations of traditional neural networks by integrate-and-fire neurons.
DeepFire2 introduces a hardware architecture which can map large network layers efficiently across multiple super logic regions in a multi-die FPGA.
arXiv Detail & Related papers (2023-05-09T05:46:07Z) - End-to-end codesign of Hessian-aware quantized neural networks for FPGAs
and ASICs [49.358119307844035]
We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs)
This makes efficient NN implementations in hardware accessible to nonexperts, in a single open-sourced workflow.
We demonstrate the workflow in a particle physics application involving trigger decisions that must operate at the 40 MHz collision rate of the Large Hadron Collider (LHC)
We implement an optimized mixed-precision NN for high-momentum particle jets in simulated LHC proton-proton collisions.
arXiv Detail & Related papers (2023-04-13T18:00:01Z) - FireFly: A High-Throughput Hardware Accelerator for Spiking Neural
Networks with Efficient DSP and Memory Optimization [6.966706170499345]
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency.
Most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements.
We propose an FPGA accelerator that can process spikes generated by the firing neuron on-the-fly (FireFly)
arXiv Detail & Related papers (2023-01-05T04:28:07Z) - Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and
Algorithm Co-design [66.39546326221176]
Attention-based neural networks have become pervasive in many AI tasks.
The use of the attention mechanism and feed-forward network (FFN) demands excessive computational and memory resources.
This paper proposes a hardware-friendly variant that adopts a unified butterfly sparsity pattern to approximate both the attention mechanism and the FFNs.
arXiv Detail & Related papers (2022-09-20T09:28:26Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary
Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and the hardware deployment on FPGA validate the great potentials of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN
Accelerators for Edge Inference [0.0]
We propose SECDA, a new hardware/software co-design methodology to reduce design time of optimized Deep Neural Networks (DNN) inference accelerators on edge devices with FPGAs.
We use SECDA to efficiently develop two different DNN accelerator designs on a PYNQ-Z1 board, a platform that includes an edge FPGA.
We evaluate the two accelerator designs with four common DNN models, achieving an average performance speedup across models of up to 3.5$times$ with a 2.9$times$ reduction in energy consumption over CPU-only inference.
arXiv Detail & Related papers (2021-10-01T15:20:29Z) - You Only Spike Once: Improving Energy-Efficient Neuromorphic Inference
to ANN-Level Accuracy [51.861168222799186]
Spiking Neural Networks (SNNs) are a type of neuromorphic, or brain-inspired network.
SNNs are sparse, accessing very few weights, and typically only use addition operations instead of the more power-intensive multiply-and-accumulate operations.
In this work, we aim to overcome the limitations of TTFS-encoded neuromorphic systems.
arXiv Detail & Related papers (2020-06-03T15:55:53Z) - DNN-Chip Predictor: An Analytical Performance Predictor for DNN
Accelerators with Various Dataflows and Hardware Architectures [30.689015188050405]
The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators.
DNN-Chip Predictor is an analytical performance predictor which can accurately predict DNN accelerators' energy, throughput, and latency prior to their actual implementation.
arXiv Detail & Related papers (2020-02-26T02:59:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.