LogicNets: Co-Designed Neural Networks and Circuits for
Extreme-Throughput Applications
- URL: http://arxiv.org/abs/2004.03021v1
- Date: Mon, 6 Apr 2020 22:15:41 GMT
- Title: LogicNets: Co-Designed Neural Networks and Circuits for
Extreme-Throughput Applications
- Authors: Yaman Umuroglu, Yash Akhauri, Nicholas J. Fraser, Michaela Blott
- Abstract summary: We present a novel method for designing neural network topologies that directly map to a highly efficient FPGA implementation.
We show that the combination of sparsity and low-bit activation quantization results in high-speed circuits with small logic depth and low LUT cost.
- Score: 6.9276012494882835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deployment of deep neural networks for applications that require very high
throughput or extremely low latency is a severe computational challenge,
further exacerbated by inefficiencies in mapping the computation to hardware.
We present a novel method for designing neural network topologies that directly
map to a highly efficient FPGA implementation. By exploiting the equivalence of
artificial neurons with quantized inputs/outputs and truth tables, we can train
quantized neural networks that can be directly converted to a netlist of truth
tables, and subsequently deployed as a highly pipelinable, massively parallel
FPGA circuit. However, the neural network topology requires careful
consideration since the hardware cost of truth tables grows exponentially with
neuron fan-in. To obtain smaller networks where the whole netlist can be
placed-and-routed onto a single FPGA, we derive a fan-in driven hardware cost
model to guide topology design, and combine high sparsity with low-bit
activation quantization to limit the neuron fan-in. We evaluate our approach on
two tasks with very high intrinsic throughput requirements in high-energy
physics and network intrusion detection. We show that the combination of
sparsity and low-bit activation quantization results in high-speed circuits
with small logic depth and low LUT cost, demonstrating competitive accuracy
with less than 15 ns of inference latency and throughput in the hundreds of
millions of inferences per second.
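To make the neuron-to-truth-table equivalence and the fan-in driven cost argument concrete, the sketch below enumerates a single quantized neuron as a lookup table and estimates how many physical LUTs such a table would occupy. This is a minimal illustration only: the 2-bit uniform quantizer, the ReLU, the activation scale, and the cost constants are assumptions made for the sketch, not the exact quantizer or hardware cost model used in the paper.

```python
# Minimal sketch (not the paper's implementation): a quantized neuron as a
# truth table, plus an illustrative LUT-cost estimate that grows exponentially
# with the number of input bits (fan-in * activation bit-width).

from itertools import product

import numpy as np


def quantize(x, bits=2, scale=1.0):
    """Uniform unsigned quantizer: clips to integer levels in [0, 2^bits - 1]."""
    levels = 2 ** bits - 1
    q = np.clip(np.round(x / scale), 0, levels)
    return q.astype(int)


def neuron_truth_table(weights, bias, in_bits=2, out_bits=2):
    """Enumerate every quantized input combination of a single neuron and
    record its quantized output, i.e. the neuron expressed as a lookup table."""
    fan_in = len(weights)
    levels = range(2 ** in_bits)
    table = {}
    for inputs in product(levels, repeat=fan_in):  # 2^(fan_in*in_bits) rows
        pre_act = float(np.dot(weights, inputs) + bias)
        act = max(pre_act, 0.0)                    # ReLU before re-quantizing
        table[inputs] = int(quantize(act, bits=out_bits, scale=4.0))
    return table


def lut_cost_estimate(fan_in, in_bits, out_bits, lut_size=6):
    """Illustrative cost model: a table with X = fan_in*in_bits input bits and
    Y = out_bits output bits needs on the order of Y * 2^(X - lut_size)
    physical LUTs once X exceeds the native LUT input width."""
    x_bits = fan_in * in_bits
    return out_bits * max(1, 2 ** (x_bits - lut_size))


if __name__ == "__main__":
    # A sparse, low-fan-in neuron stays cheap...
    w = np.array([1.0, -2.0, 1.0])
    tt = neuron_truth_table(w, bias=0.5, in_bits=2, out_bits=2)
    print(f"truth table rows: {len(tt)}")                    # 4^3 = 64 rows
    print(f"LUT estimate (fan-in 3): {lut_cost_estimate(3, 2, 2)}")
    # ...while a dense neuron explodes: every extra 2-bit input multiplies
    # the table (and the LUT estimate) by 4.
    for fan_in in (3, 6, 12, 24):
        print(fan_in, lut_cost_estimate(fan_in, 2, 2))
```

The point of the sketch is the growth rate: with B-bit activations, each additional input multiplies the truth-table size (and hence the LUT estimate) by 2^B, which is why the method combines high sparsity with low-bit activation quantization to keep per-neuron fan-in small.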
Related papers
- Low-latency machine learning FPGA accelerator for multi-qubit-state discrimination [1.6773398825542363]
Measuring a qubit state is a fundamental yet error-prone operation in quantum computing.
Here, we utilize an integrated approach to deploy neural networks onto field-programmable gate arrays (FPGAs).
We demonstrate that implementing a fully connected neural network accelerator for multi-qubit readout is advantageous.
arXiv Detail & Related papers (2024-07-04T11:34:43Z)
- NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions [2.7086888205833968]
Field-Programmable Gate Array (FPGA) accelerators have proven successful in handling latency- and resource-critical deep neural network (DNN) inference tasks.
We propose relaxing the boundaries of neurons and mapping entire sub-networks to a single LUT.
We validate our proposed method on a known latency-critical task, jet substructure tagging, and on the classical computer vision task, digit classification using MNIST.
arXiv Detail & Related papers (2024-02-29T16:10:21Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Signal Detection in MIMO Systems with Hardware Imperfections: Message Passing on Neural Networks [101.59367762974371]
In this paper, we investigate signal detection in multiple-input-multiple-output (MIMO) communication systems with hardware impairments.
It is difficult to train a deep neural network (DNN) with limited pilot signals, hindering its practical applications.
We design an efficient message passing based Bayesian signal detector, leveraging the unitary approximate message passing (UAMP) algorithm.
arXiv Detail & Related papers (2022-10-08T04:32:58Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- A White Paper on Neural Network Quantization [20.542729144379223]
We introduce state-of-the-art algorithms for mitigating the impact of quantization noise on the network's performance.
We consider two main classes of algorithms: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).
arXiv Detail & Related papers (2021-06-15T17:12:42Z)
- NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic [4.119948826527649]
Field-programmable gate array (FPGA)-based accelerators are gaining traction as a serious contender to replace graphics processing unit/central processing unit-based platforms.
This paper presents NullaNet Tiny, a framework for constructing resource and energy-efficient, ultra-low-latency FPGA-based neural network accelerators.
arXiv Detail & Related papers (2021-04-07T00:16:39Z)
- Exposing Hardware Building Blocks to Machine Learning Frameworks [4.56877715768796]
We focus on how to design topologies that complement such a view of neurons as unique functions.
We develop a library that supports training a neural network with custom sparsity and quantization.
arXiv Detail & Related papers (2020-04-10T14:26:00Z)