DFSynthesizer: Dataflow-based Synthesis of Spiking Neural Networks to
Neuromorphic Hardware
- URL: http://arxiv.org/abs/2108.02023v1
- Date: Wed, 4 Aug 2021 12:49:37 GMT
- Title: DFSynthesizer: Dataflow-based Synthesis of Spiking Neural Networks to
Neuromorphic Hardware
- Authors: Shihao Song, Harry Chong, Adarsha Balaji, Anup Das, James Shackleford,
Nagarajan Kandasamy
- Abstract summary: Spiking Neural Networks (SNN) are an emerging computation model, which uses event-driven activation and bio-inspired learning algorithms.
DFSynthesizer is an end-to-end framework for synthesizing SNN-based machine learning programs to neuromorphic hardware.
- Score: 4.273223677453178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spiking Neural Networks (SNN) are an emerging computation model, which uses
event-driven activation and bio-inspired learning algorithms. SNN-based
machine-learning programs are typically executed on tile-based neuromorphic
hardware platforms, where each tile consists of a computation unit called
crossbar, which maps neurons and synapses of the program. However, synthesizing
such programs on off-the-shelf neuromorphic hardware is challenging. This is
because of the inherent resource and latency limitations of the hardware, which
impact both model performance, e.g., accuracy, and hardware performance, e.g.,
throughput. We propose DFSynthesizer, an end-to-end framework for synthesizing
SNN-based machine learning programs to neuromorphic hardware. The proposed
framework works in four steps. First, it analyzes a machine-learning program
and generates SNN workload using representative data. Second, it partitions the
SNN workload and generates clusters that fit on crossbars of the target
neuromorphic hardware. Third, it exploits the rich semantics of Synchronous
Dataflow Graph (SDFG) to represent a clustered SNN program, allowing for
performance analysis in terms of key hardware constraints such as number of
crossbars, dimension of each crossbar, buffer space on tiles, and tile
communication bandwidth. Finally, it uses a novel scheduling algorithm to
execute clusters on crossbars of the hardware, guaranteeing hardware
performance. We evaluate DFSynthesizer with 10 commonly used machine-learning
programs. Our results demonstrate that DFSynthesizer provides a much tighter
performance guarantee than current mapping approaches.
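The second and third steps of the abstract (partitioning the workload into crossbar-sized clusters, then viewing the clustered program as a dataflow graph) can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy packing rule, the function names `partition` and `cluster_graph`, the `(pre, post, spikes)` synapse format, and the square `dim` x `dim` crossbar model are all assumptions made for illustration.

```python
from collections import defaultdict


def partition(synapses, dim):
    """Greedily pack post-synaptic neurons into clusters that fit a crossbar.

    Assumed model (hypothetical): a dim x dim crossbar holds at most `dim`
    distinct pre-synaptic inputs and at most `dim` post-synaptic neurons.
    `synapses` is a list of (pre, post, spikes) tuples with sortable ids.
    """
    # Collect the set of pre-synaptic neurons feeding each post-synaptic neuron.
    fan_in = defaultdict(set)
    for pre, post, _ in synapses:
        fan_in[post].add(pre)

    clusters = []
    current = {"inputs": set(), "outputs": []}
    for post in sorted(fan_in):
        pres = fan_in[post]
        if len(pres) > dim:
            raise ValueError(f"fan-in of neuron {post!r} exceeds crossbar dimension {dim}")
        merged = current["inputs"] | pres
        # Close the current cluster if adding this neuron would overflow it.
        if len(merged) > dim or len(current["outputs"]) >= dim:
            clusters.append(current)
            current = {"inputs": set(), "outputs": []}
            merged = set(pres)
        current["inputs"] = merged
        current["outputs"].append(post)
    if current["outputs"]:
        clusters.append(current)
    return clusters


def cluster_graph(synapses, clusters):
    """Build a cluster-level graph: nodes are clusters, edge weights are the
    total spikes sent between two different clusters."""
    owner = {n: i for i, c in enumerate(clusters) for n in c["outputs"]}
    edges = defaultdict(int)
    for pre, post, spikes in synapses:
        src, dst = owner.get(pre), owner[post]
        # Neurons with no owner are external inputs and create no edge.
        if src is not None and src != dst:
            edges[(src, dst)] += spikes
    return dict(edges)


if __name__ == "__main__":
    # Toy workload: inputs 0-3, hidden neurons 4-5, output neuron 6.
    synapses = [(0, 4, 10), (1, 4, 5), (2, 5, 7), (3, 5, 3), (4, 6, 4), (5, 6, 6)]
    clusters = partition(synapses, dim=2)
    print([c["outputs"] for c in clusters])
    print(cluster_graph(synapses, clusters))
```

In a dataflow view of this sketch, each cluster would act as an actor and the inter-cluster spike counts as the tokens exchanged on each edge; the buffer-space and bandwidth constraints mentioned in the abstract would then bound those edges. The sketch stops short of any throughput analysis or scheduling.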
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z) - RNC: Efficient RRAM-aware NAS and Compilation for DNNs on Resource-Constrained Edge Devices [0.30458577208819987]
We aim to develop edge-friendly deep neural networks (DNNs) for accelerators based on resistive random-access memory (RRAM).
We propose an edge compilation and resource-constrained RRAM-aware neural architecture search (NAS) framework to search for optimized neural networks meeting specific hardware constraints.
The resulting model from NAS optimized for speed achieved a 5x-30x speedup.
arXiv Detail & Related papers (2024-09-27T15:35:36Z) - A Realistic Simulation Framework for Analog/Digital Neuromorphic Architectures [73.65190161312555]
ARCANA is a spiking neural network simulator designed to account for the properties of mixed-signal neuromorphic circuits.
We show how the results obtained provide a reliable estimate of the behavior of the spiking neural network trained in software.
arXiv Detail & Related papers (2024-09-23T11:16:46Z) - Biologically Plausible Learning on Neuromorphic Hardware Architectures [27.138481022472]
Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories.
This work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa.
arXiv Detail & Related papers (2022-12-29T15:10:59Z) - MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X, which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - Fully-parallel Convolutional Neural Network Hardware [0.7829352305480285]
We propose a new power-and-area-efficient architecture for implementing Artificial Neural Networks (ANNs) in hardware.
For the first time, a fully-parallel CNN such as LeNet-5 is embedded and tested in a single FPGA.
arXiv Detail & Related papers (2020-06-22T17:19:09Z) - Compiling Spiking Neural Networks to Neuromorphic Hardware [4.273223677453178]
Spiking Neural Networks (SNNs) can lower the energy consumption of machine learning applications executed on neuromorphic hardware.
We propose an approach to analyze and compile SNNs on resource-constrained neuromorphic hardware.
arXiv Detail & Related papers (2020-04-07T21:13:27Z)