DeepFire2: A Convolutional Spiking Neural Network Accelerator on FPGAs
- URL: http://arxiv.org/abs/2305.05187v1
- Date: Tue, 9 May 2023 05:46:07 GMT
- Authors: Myat Thu Linn Aung, Daniel Gerlinghoff, Chuping Qu, Liwei Yang, Tian
Huang, Rick Siow Mong Goh, Tao Luo, Weng-Fai Wong
- Abstract summary: Brain-inspired spiking neural networks (SNNs) replace the multiply-accumulate operations of traditional neural networks with integrate-and-fire neurons.
DeepFire2 introduces a hardware architecture which can map large network layers efficiently across multiple super logic regions in a multi-die FPGA.
- Score: 8.275598040331227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Brain-inspired spiking neural networks (SNNs) replace the multiply-accumulate operations of traditional neural networks with integrate-and-fire neurons, with the goal of achieving greater energy efficiency. Specialized hardware implementations of those neurons clearly have advantages over general-purpose devices in terms of power and performance, but exhibit poor scalability when it comes to accelerating large neural networks. DeepFire2 introduces a hardware architecture which can map large network layers efficiently across multiple super logic regions in a multi-die FPGA. That gives more control over resource allocation and parallelism, benefiting both throughput and energy consumption. Avoiding lookup tables to implement the AND operations of an SNN prevents the layer size from being limited by logic resources. A deep pipeline not only raises the clock speed to up to 600 MHz; it also lets us double the throughput and power efficiency compared to the previous version of DeepFire, which equates to an almost 10-fold improvement over other previous implementations. Importantly, we are able to deploy a large ImageNet model while maintaining a throughput of over 1500 frames per second.
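To make the neuron model concrete, here is a minimal sketch of an integrate-and-fire layer, assuming binary spikes, a unit threshold, and reset-to-zero behavior (function names and parameters are illustrative, not DeepFire2's implementation). Because input spikes are 0/1, each synaptic contribution reduces to an AND of the spike with the weight followed by accumulation, which is the operation that replaces the multiply-accumulate:

```python
import numpy as np

def integrate_and_fire(spikes_in, weights, v, threshold=1.0):
    """One timestep of an integrate-and-fire neuron layer (illustrative sketch).

    spikes_in : binary vector of input spikes, shape (n_in,)
    weights   : synaptic weights, shape (n_out, n_in)
    v         : membrane potentials carried across timesteps, shape (n_out,)

    Because spikes are 0/1, the multiply-accumulate degenerates to
    "AND the spike with the weight, then add".
    """
    # Select weights of active synapses (the AND), then accumulate (the ADD).
    v = v + weights[:, spikes_in.astype(bool)].sum(axis=1)
    spikes_out = (v >= threshold).astype(np.uint8)  # fire where threshold is crossed
    v = np.where(spikes_out == 1, 0.0, v)           # reset fired neurons to zero
    return spikes_out, v

# Toy usage: 4 inputs, 3 neurons, one timestep.
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 4))
out, v = integrate_and_fire(np.array([1, 0, 1, 1]), w, np.zeros(3))
```

In hardware, gating a weight by a spike bit is a trivial AND per weight bit, which is why logic (LUT) usage, rather than arithmetic, can become the limiting resource that the paper works around.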
Related papers
- Spiker+: a framework for the generation of efficient Spiking Neural Networks FPGA accelerators for inference at the edge [49.42371633618761]
Spiker+ is a framework for generating efficient, low-power, and low-area customized Spiking Neural Network (SNN) accelerators on FPGA for inference at the edge.
Spiker+ is evaluated on two benchmark datasets: MNIST and the Spiking Heidelberg Digits (SHD).
arXiv Detail & Related papers (2024-01-02T10:42:42Z)
- FireFly v2: Advancing Hardware Support for High-Performance Spiking Neural Network with a Spatiotemporal FPGA Accelerator [8.0611988136866]
Spiking Neural Networks (SNNs) are expected to be a promising alternative to Artificial Neural Networks (ANNs).
Specialized SNN hardware offers clear advantages over general-purpose devices in terms of power and performance.
FireFly v2, an FPGA SNN accelerator, addresses the issue of non-spike operations in current state-of-the-art SNN algorithms.
arXiv Detail & Related papers (2023-09-28T04:17:02Z)
- FireFly: A High-Throughput Hardware Accelerator for Spiking Neural Networks with Efficient DSP and Memory Optimization [6.966706170499345]
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency.
Most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements.
We propose FireFly, an FPGA accelerator that processes spikes generated by firing neurons on the fly (see the sketch after this entry).
arXiv Detail & Related papers (2023-01-05T04:28:07Z)
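The "on the fly" idea can be illustrated with an event-driven accumulation loop that fetches weight columns only for inputs that actually fired. This is a loose sketch, not FireFly's DSP-optimized datapath; all names are illustrative:

```python
import numpy as np

def event_driven_accumulate(spike_events, weights, n_out):
    """Accumulate synaptic current only for input neurons that fired.

    spike_events : indices of input neurons that spiked this timestep
    weights      : (n_out, n_in) weight matrix
    """
    current = np.zeros(n_out)
    for i in spike_events:        # one weight-column fetch per event,
        current += weights[:, i]  # no work at all for silent inputs
    return current

rng = np.random.default_rng(0)
current = event_driven_accumulate([0, 3, 7], rng.normal(size=(4, 8)), n_out=4)
```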
- Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design [66.39546326221176]
Attention-based neural networks have become pervasive in many AI tasks.
The attention mechanism and feed-forward network (FFN) demand excessive computational and memory resources.
This paper proposes a hardware-friendly variant that adopts a unified butterfly sparsity pattern to approximate both the attention mechanism and the FFNs (see the sketch after this entry).
arXiv Detail & Related papers (2022-09-20T09:28:26Z)
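A butterfly sparsity pattern replaces a dense n x n matrix with log2(n) stages of 2x2 mixing blocks (O(n log n) parameters, FFT-style connectivity). The sketch below is a generic butterfly matrix-vector product under that assumption, not the paper's accelerator; `factors` and the block layout are illustrative:

```python
import numpy as np

def butterfly_matvec(factors, x):
    """Apply a butterfly-factorized linear map to x (n must be a power of two).

    Stage s pairs index i with its butterfly partner i ^ 2**s and mixes the
    pair with a learned 2x2 block, so the dense n x n matrix is replaced by
    log2(n) sparse factors.
    """
    n = x.size
    y = x.copy()
    for s, blocks in enumerate(factors):       # blocks: (n//2, 2, 2) per stage
        stride = 1 << s
        out = np.empty_like(y)
        pair = 0
        for i in range(n):
            j = i ^ stride
            if i < j:                          # visit each (i, j) pair once
                a, b = y[i], y[j]
                m = blocks[pair]; pair += 1
                out[i] = m[0, 0] * a + m[0, 1] * b
                out[j] = m[1, 0] * a + m[1, 1] * b
        y = out
    return y

# Toy usage: n = 8 needs log2(8) = 3 stages of 4 two-by-two blocks each.
rng = np.random.default_rng(0)
n = 8
factors = [rng.normal(size=(n // 2, 2, 2)) for _ in range(3)]
y = butterfly_matvec(factors, rng.normal(size=n))
```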
- A Resource-efficient Spiking Neural Network Accelerator Supporting Emerging Neural Encoding [6.047137174639418]
Spiking neural networks (SNNs) recently gained momentum due to their low-power, multiplication-free computing.
However, SNNs require very long spike trains (up to 1000 time steps) to reach an accuracy similar to their artificial neural network (ANN) counterparts for large models (see the rate-coding sketch after this entry).
We present a novel hardware architecture that can efficiently support SNNs with emerging neural encoding.
arXiv Detail & Related papers (2022-06-06T10:56:25Z)
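Why long spike trains are needed is easiest to see with plain rate coding, sketched below assuming Bernoulli sampling (the paper targets newer encodings precisely to avoid such long trains; names are illustrative):

```python
import numpy as np

def rate_encode(x, n_steps, rng):
    """Rate-code inputs in [0, 1] as a Bernoulli spike train.

    Each value x becomes n_steps binary samples with firing probability x,
    so precision grows only with the number of timesteps; few steps give a
    coarse, noisy approximation of the input.
    """
    return (rng.random((n_steps, *x.shape)) < x).astype(np.uint8)

rng = np.random.default_rng(0)
pixel = np.array([0.1, 0.5, 0.9])
train = rate_encode(pixel, n_steps=1000, rng=rng)
print(train.mean(axis=0))  # spike rates approximate the input values
```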
- Efficient Hardware Acceleration of Sparsely Active Convolutional Spiking Neural Networks [0.0]
Spiking Neural Networks (SNNs) compute in an event-based manner to achieve more efficient computation than standard neural networks.
We propose a novel architecture that is optimized for processing convolutional SNNs that feature a high degree of activation sparsity (see the sketch after this entry).
arXiv Detail & Related papers (2022-03-23T14:18:58Z)
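Activation sparsity pays off when convolution is driven by spike events instead of sliding a window over a mostly-zero input. Below is a hedged sketch of such a scatter-style convolution (stride 1 and 'same' padding assumed; not the paper's architecture):

```python
import numpy as np

def event_conv2d(spike_coords, kernels, out_shape):
    """'Scatter' convolution driven by spike events.

    spike_coords : list of (channel, row, col) positions of input spikes
    kernels      : (c_out, c_in, k, k) weights
    out_shape    : (c_out, H, W)

    Each spike scatters its kernel slice into the output, so cost scales
    with the number of events rather than with the input size.
    """
    c_out, H, W = out_shape
    k = kernels.shape[-1]
    r = k // 2
    out = np.zeros(out_shape)
    for c, y, x in spike_coords:
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    # a spike contributes the matching kernel tap to (yy, xx)
                    out[:, yy, xx] += kernels[:, c, r - dy, r - dx]
    return out

rng = np.random.default_rng(0)
out = event_conv2d([(0, 2, 2), (1, 0, 4)], rng.normal(size=(8, 2, 3, 3)), (8, 5, 5))
```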
- Two Sparsities Are Better Than One: Unlocking the Performance Benefits of Sparse-Sparse Networks [0.0]
We introduce Complementary Sparsity, a technique that significantly improves the performance of dual sparse networks on existing hardware.
We show up to 100x improvement in throughput and energy efficiency when performing inference on FPGAs.
Our results suggest that weight plus activation sparsity can be a potent combination for efficiently scaling future AI models (see the sketch after this entry).
arXiv Detail & Related papers (2021-12-27T20:41:01Z)
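The arithmetic intuition behind combining weight and activation sparsity can be shown as a sparse-sparse dot product via sorted-index intersection. This illustrates why the two sparsities multiply; it is not the paper's Complementary Sparsity kernel:

```python
def sparse_sparse_dot(w_idx, w_val, a_idx, a_val):
    """Dot product of a sparse weight row and a sparse activation vector.

    Both operands are (sorted index list, value list) pairs; the two-pointer
    intersection only does work where a nonzero weight meets a nonzero
    activation.
    """
    acc, i, j = 0.0, 0, 0
    while i < len(w_idx) and j < len(a_idx):
        if w_idx[i] == a_idx[j]:
            acc += w_val[i] * a_val[j]
            i += 1; j += 1
        elif w_idx[i] < a_idx[j]:
            i += 1
        else:
            j += 1
    return acc

# 90% weight sparsity times 90% activation sparsity leaves ~1% of the
# original multiply-accumulates, hence throughput gains approaching 100x.
print(sparse_sparse_dot([1, 4, 7], [0.5, -2.0, 1.0], [4, 5, 7], [3.0, 1.0, 2.0]))
```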
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks (see the sketch after this entry).
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
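One way such a decomposition can work, assuming weights quantized to the four levels {-3, -1, +1, +3} (an illustrative case, not necessarily the paper's exact scheme):

```python
import numpy as np

def decompose_pm1(w):
    """Split weights quantized to {-3, -1, +1, +3} into two {-1, +1} branches.

    w = b1 + 2 * b2 with b1, b2 in {-1, +1}, so a 2-bit layer becomes two
    binary (XNOR-friendly) branches whose outputs are recombined by a
    weighted sum.
    """
    b2 = np.sign(w).astype(np.int8)    # coarse binary branch
    b1 = (w - 2 * b2).astype(np.int8)  # residual binary branch
    assert set(np.unique(b1)) <= {-1, 1} and set(np.unique(b2)) <= {-1, 1}
    return b1, b2

w = np.array([-3, -1, 1, 3, 3, -1], dtype=np.int8)
b1, b2 = decompose_pm1(w)
x = np.array([2, 1, -1, 0, 1, 2], dtype=np.int32)
# Multi-branch evaluation: run two binary dot products, then recombine.
assert b1 @ x + 2 * (b2 @ x) == w @ x
```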
- AdderNet and its Minimalist Hardware Design for Energy-Efficient Artificial Intelligence [111.09105910265154]
We present a novel minimalist hardware architecture using the adder convolutional neural network (AdderNet), whose core operation is sketched after this entry.
In practice, the whole AdderNet achieves roughly a 16% speed improvement.
We conclude that AdderNet is able to surpass all the other competitors.
arXiv Detail & Related papers (2021-01-25T11:31:52Z)
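AdderNet replaces the multiply-accumulate of convolution with a negative L1 distance between inputs and weights, so only additions, subtractions, and absolute values are needed. A minimal sketch for a fully-connected case (shapes and names illustrative):

```python
import numpy as np

def adder_layer(x, weights):
    """AdderNet-style feature: negative L1 distance instead of a dot product.

    Each output is -sum(|x - w|) rather than sum(x * w), so the layer needs
    no hardware multipliers.
    """
    # x: (n_in,), weights: (n_out, n_in) -> (n_out,)
    return -np.abs(x[None, :] - weights).sum(axis=1)

rng = np.random.default_rng(0)
out = adder_layer(rng.normal(size=16), rng.normal(size=(4, 16)))
```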
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks (one such strategy is sketched after this entry).
We show that, through careful design of the models and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
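A representative binarization strategy, sketched in the spirit of XNOR-Net-style layers (not necessarily the exact variant evaluated in the paper): binarize node features and weights with sign(), keep a real-valued scaling factor, and aggregate neighbor messages by mean:

```python
import numpy as np

def binary_gnn_layer(adj, h, w_real):
    """One binarized graph-convolution step (illustrative sketch).

    Node features and weights are binarized to {-1, +1}, making the core
    feature transform XNOR/popcount-friendly; the neighborhood mean and the
    scaling factor stay in higher precision.
    """
    h_bin = np.sign(h); h_bin[h_bin == 0] = 1       # binarize activations
    w_bin = np.sign(w_real); w_bin[w_bin == 0] = 1  # binarize weights
    alpha = np.abs(w_real).mean()                   # real-valued scale (XNOR-Net style)
    msgs = adj @ (h_bin @ w_bin) * alpha            # aggregate transformed neighbors
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return msgs / deg                               # mean aggregation

# Toy graph: 3 nodes in a path, 4-dim features, 2-dim output.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
out = binary_gnn_layer(adj, rng.normal(size=(3, 4)), rng.normal(size=(4, 2)))
```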
- ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient, multiplication-less deep neural network.
It leads to both energy-efficient inference and training without compromising expressive capacity (see the sketch after this entry).
ShiftAddNet aggressively reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracy.
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
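ShiftAddNet cascades shift layers (weights constrained to signed powers of two) with add layers (the L1-distance operation from AdderNet above). A minimal sketch of the shift half, assuming integer inputs (names illustrative):

```python
def shift_add_mul(x, w_sign, w_exp):
    """Multiply by a power-of-two weight using only a shift and a sign flip.

    With weights constrained to sign * 2**exp, x * w becomes (x << exp) with
    an optional negation; accumulation across inputs then needs only adds.
    """
    shifted = x << w_exp if w_exp >= 0 else x >> -w_exp  # the 'shift' part
    return -shifted if w_sign < 0 else shifted

# (+2**3) * 5 = 40 and (-2**1) * 5 = -10, with no multiplier involved.
assert shift_add_mul(5, +1, 3) == 40
assert shift_add_mul(5, -1, 1) == -10
```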