E3NE: An End-to-End Framework for Accelerating Spiking Neural Networks
with Emerging Neural Encoding on FPGAs
- URL: http://arxiv.org/abs/2111.10027v1
- Date: Fri, 19 Nov 2021 04:01:19 GMT
- Title: E3NE: An End-to-End Framework for Accelerating Spiking Neural Networks
with Emerging Neural Encoding on FPGAs
- Authors: Daniel Gerlinghoff, Zhehui Wang, Xiaozhe Gu, Rick Siow Mong Goh, Tao
Luo
- Abstract summary: End-to-end framework E3NE automates the generation of efficient SNN inference logic for FPGAs.
E3NE uses less than 50% of hardware resources and 20% less power, while reducing the latency by an order of magnitude.
- Score: 6.047137174639418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compiler frameworks are crucial for the widespread use of FPGA-based deep
learning accelerators. They allow researchers and developers, who are not
familiar with hardware engineering, to harness the performance attained by
domain-specific logic. There exists a variety of frameworks for conventional
artificial neural networks. However, not much research effort has been put into
the creation of frameworks optimized for spiking neural networks (SNNs). This
new generation of neural networks becomes increasingly interesting for the
deployment of AI on edge devices, which have tight power and resource
constraints. Our end-to-end framework E3NE automates the generation of
efficient SNN inference logic for FPGAs. Based on a PyTorch model and user
parameters, it applies various optimizations and assesses trade-offs inherent
to spike-based accelerators. Multiple levels of parallelism and the use of an
emerging neural encoding scheme result in an efficiency superior to previous
SNN hardware implementations. For a similar model, E3NE uses less than 50% of
hardware resources and 20% less power, while reducing the latency by an order
of magnitude. Furthermore, scalability and generality allowed the deployment of
the large-scale SNN models AlexNet and VGG.
Related papers
- RNC: Efficient RRAM-aware NAS and Compilation for DNNs on Resource-Constrained Edge Devices [0.30458577208819987]
We aim to develop edge-friendly deep neural networks (DNNs) for accelerators based on resistive random-access memory (RRAM).
We propose an edge compilation and resource-constrained RRAM-aware neural architecture search (NAS) framework to search for optimized neural networks meeting specific hardware constraints.
The resulting model from NAS optimized for speed achieved 5x-30x speedup.
arXiv Detail & Related papers (2024-09-27T15:35:36Z)
- Spyx: A Library for Just-In-Time Compiled Optimization of Spiking Neural Networks [0.08965418284317034]
Spiking Neural Networks (SNNs) offer to enhance energy efficiency through a reduced and low-power hardware footprint.
This paper introduces Spyx, a new and lightweight SNN simulation and optimization library designed in JAX.
arXiv Detail & Related papers (2024-02-29T09:46:44Z)
- SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence [51.6943465041708]
Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency.
We contribute a full-stack toolkit for pre-processing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips.
arXiv Detail & Related papers (2023-10-25T13:15:17Z)
- A Resource-efficient Spiking Neural Network Accelerator Supporting Emerging Neural Encoding [6.047137174639418]
Spiking neural networks (SNNs) recently gained momentum due to their low-power multiplication-free computing.
SNNs require very long spike trains (up to 1000) to reach an accuracy similar to their artificial neural network (ANN) counterparts for large models.
We present a novel hardware architecture that can efficiently support SNN with emerging neural encoding.
arXiv Detail & Related papers (2022-06-06T10:56:25Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical computation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on an FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Learning on Hardware: A Tutorial on Neural Network Accelerators and Co-Processors [0.0]
Deep neural networks (DNNs) have the advantage that they can take into account a large number of parameters, which enables them to solve complex tasks.
In computer vision and speech recognition, they have a better accuracy than common algorithms, and in some tasks, they boast an even higher accuracy than human experts.
With the progress of DNNs in recent years, many other fields of application such as diagnosis of diseases and autonomous driving are taking advantage of them.
arXiv Detail & Related papers (2021-04-19T12:50:27Z)
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.