Efficient Compilation and Mapping of Fixed Function Combinational Logic
onto Digital Signal Processors Targeting Neural Network Inference and
Utilizing High-level Synthesis
- URL: http://arxiv.org/abs/2208.00302v1
- Date: Sat, 30 Jul 2022 20:11:59 GMT
- Authors: Soheil Nazar Shahsavani, Arash Fayyazi, Mahdi Nazemi, and Massoud
Pedram
- Abstract summary: Recent efforts for improving the performance of neural network (NN) accelerators have given rise to a new trend of logic-based NN inference relying on fixed function combinational logic.
This paper presents an innovative design and optimization methodology for compilation and mapping of NNs, utilizing fixed function combinational logic to DSPs on FPGAs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent efforts for improving the performance of neural network (NN)
accelerators that meet today's application requirements have given rise to a
new trend of logic-based NN inference relying on fixed function combinational
logic. Mapping such large Boolean functions with many input variables and
product terms to digital signal processors (DSPs) on field-programmable gate
arrays (FPGAs) needs a novel framework considering the structure and the
reconfigurability of DSP blocks during this process. The proposed methodology
in this paper maps the fixed function combinational logic blocks to a set of
Boolean functions where Boolean operations corresponding to each function are
mapped to DSP devices rather than look-up tables (LUTs) on the FPGAs to take
advantage of the high performance, low latency, and parallelism of DSP blocks.
This paper also presents an innovative design and optimization methodology
for compilation and mapping of NNs, utilizing fixed function combinational
logic to DSPs on FPGAs employing high-level synthesis flow. Our experimental
evaluations across several datasets and selected NNs demonstrate the
comparable performance of our framework in terms of the inference latency and
output accuracy compared to prior art FPGA-based NN accelerators employing
DSPs.
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z) - Harnessing FPGA Technology for Enhanced Biomedical Computation [0.0]
This research delves into sophisticated neural network frameworks such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory Networks (LSTMs), and Deep Belief Networks (DBNs).
By evaluating performance indicators like latency and throughput, we showcase the efficacy of FPGAs in advanced biomedical computing.
arXiv Detail & Related papers (2023-11-21T08:51:58Z) - Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets).
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z) - Exploiting FPGA Capabilities for Accelerated Biomedical Computing [0.0]
This study presents advanced neural network architectures for enhanced ECG signal analysis using Field Programmable Gate Arrays (FPGAs).
We utilize the MIT-BIH Arrhythmia Database for training and validation, introducing Gaussian noise to improve robustness.
The study ultimately offers a guide for optimizing neural network performance on FPGAs for various applications.
arXiv Detail & Related papers (2023-07-16T01:20:17Z) - Reconfigurable Distributed FPGA Cluster Design for Deep Learning
Accelerators [59.11160990637615]
We propose a distributed system based on low-power embedded FPGAs designed for edge computing applications.
The proposed system can simultaneously execute diverse Neural Network (NN) models, arrange the graph in a pipeline structure, and manually allocate greater resources to the most computationally intensive layers of the NN graph.
arXiv Detail & Related papers (2023-05-24T16:08:55Z) - Energy-efficient Task Adaptation for NLP Edge Inference Leveraging
Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z) - Implementing Neural Network-Based Equalizers in a Coherent Optical
Transmission System Using Field-Programmable Gate Arrays [3.1543509940301946]
We show the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems.
The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware.
arXiv Detail & Related papers (2022-12-09T07:28:45Z) - Decomposition of Matrix Product States into Shallow Quantum Circuits [62.5210028594015]
Tensor network (TN) algorithms can be mapped to parametrized quantum circuits (PQCs).
We propose a new protocol for approximating TN states using realistic quantum circuits.
Our results reveal one particular protocol, involving sequential growth and optimization of the quantum circuit, to outperform all other methods.
arXiv Detail & Related papers (2022-09-01T17:08:41Z) - N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based
Heterogeneous Computing Cores [26.38812379700231]
We develop an FPGA-based heterogeneous computing system for neural network acceleration.
The proposed accelerator consists of DSP- and LUT-based GEneral Matrix-Multiplication (GEMM) computing cores.
Our design outperforms the state-of-the-art Mix&Match design, reducing latency by 1.12-1.32x while achieving higher inference accuracy.
arXiv Detail & Related papers (2021-12-15T15:12:00Z) - NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function
Combinational Logic [4.119948826527649]
Field-programmable gate array (FPGA)-based accelerators are gaining traction as a serious contender to replace graphics processing unit/central processing unit-based platforms.
This paper presents NullaNet Tiny, a framework for constructing resource and energy-efficient, ultra-low-latency FPGA-based neural network accelerators.
arXiv Detail & Related papers (2021-04-07T00:16:39Z) - Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural
Networks [52.32646357164739]
We propose a deep neural network (DNN) to predict the solutions of the AC optimal power flow (AC-OPF) problem.
The proposed SIDNN is compatible with a broad range of OPF schemes.
It can be seamlessly integrated in other learning-to-OPF schemes.
arXiv Detail & Related papers (2021-03-27T00:45:23Z)