LogicNets: Co-Designed Neural Networks and Circuits for
Extreme-Throughput Applications
- URL: http://arxiv.org/abs/2004.03021v1
- Date: Mon, 6 Apr 2020 22:15:41 GMT
- Title: LogicNets: Co-Designed Neural Networks and Circuits for
Extreme-Throughput Applications
- Authors: Yaman Umuroglu, Yash Akhauri, Nicholas J. Fraser, Michaela Blott
- Abstract summary: We present a novel method for designing neural network topologies that directly map to a highly efficient FPGA implementation.
We show that the combination of sparsity and low-bit activation quantization results in high-speed circuits with small logic depth and low LUT cost.
- Score: 6.9276012494882835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deployment of deep neural networks for applications that require very high
throughput or extremely low latency is a severe computational challenge,
further exacerbated by inefficiencies in mapping the computation to hardware.
We present a novel method for designing neural network topologies that directly
map to a highly efficient FPGA implementation. By exploiting the equivalence of
artificial neurons with quantized inputs/outputs and truth tables, we can train
quantized neural networks that can be directly converted to a netlist of truth
tables, and subsequently deployed as a highly pipelinable, massively parallel
FPGA circuit. However, the neural network topology requires careful
consideration since the hardware cost of truth tables grows exponentially with
neuron fan-in. To obtain smaller networks where the whole netlist can be
placed-and-routed onto a single FPGA, we derive a fan-in driven hardware cost
model to guide topology design, and combine high sparsity with low-bit
activation quantization to limit the neuron fan-in. We evaluate our approach on
two tasks with very high intrinsic throughput requirements in high-energy
physics and network intrusion detection. We show that the combination of
sparsity and low-bit activation quantization results in high-speed circuits
with small logic depth and low LUT cost, demonstrating competitive accuracy
with less than 15 ns of inference latency and throughput in the hundreds of
millions of inferences per second.
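To make the neuron-to-truth-table equivalence and the fan-in driven cost argument concrete, the sketch below enumerates a single quantized neuron as a lookup table and estimates how many physical LUTs such a table would occupy. This is a minimal illustration only: the 2-bit uniform quantizer, the ReLU, the activation scale, and the cost constants are assumptions made for the sketch, not the exact quantizer or hardware cost model used in the paper.

```python
# Minimal sketch (not the paper's implementation): a quantized neuron as a
# truth table, plus an illustrative LUT-cost estimate that grows exponentially
# with the number of input bits (fan-in * activation bit-width).

from itertools import product

import numpy as np


def quantize(x, bits=2, scale=1.0):
    """Uniform unsigned quantizer: clips to integer levels in [0, 2^bits - 1]."""
    levels = 2 ** bits - 1
    q = np.clip(np.round(x / scale), 0, levels)
    return q.astype(int)


def neuron_truth_table(weights, bias, in_bits=2, out_bits=2):
    """Enumerate every quantized input combination of a single neuron and
    record its quantized output, i.e. the neuron expressed as a lookup table."""
    fan_in = len(weights)
    levels = range(2 ** in_bits)
    table = {}
    for inputs in product(levels, repeat=fan_in):  # 2^(fan_in*in_bits) rows
        pre_act = float(np.dot(weights, inputs) + bias)
        act = max(pre_act, 0.0)                    # ReLU before re-quantizing
        table[inputs] = int(quantize(act, bits=out_bits, scale=4.0))
    return table


def lut_cost_estimate(fan_in, in_bits, out_bits, lut_size=6):
    """Illustrative cost model: a table with X = fan_in*in_bits input bits and
    Y = out_bits output bits needs on the order of Y * 2^(X - lut_size)
    physical LUTs once X exceeds the native LUT input width."""
    x_bits = fan_in * in_bits
    return out_bits * max(1, 2 ** (x_bits - lut_size))


if __name__ == "__main__":
    # A sparse, low-fan-in neuron stays cheap...
    w = np.array([1.0, -2.0, 1.0])
    tt = neuron_truth_table(w, bias=0.5, in_bits=2, out_bits=2)
    print(f"truth table rows: {len(tt)}")                    # 4^3 = 64 rows
    print(f"LUT estimate (fan-in 3): {lut_cost_estimate(3, 2, 2)}")
    # ...while a dense neuron explodes: every extra 2-bit input multiplies
    # the table (and the LUT estimate) by 4.
    for fan_in in (3, 6, 12, 24):
        print(fan_in, lut_cost_estimate(fan_in, 2, 2))
```

The point of the sketch is the growth rate: with B-bit activations, each additional input multiplies the truth-table size (and hence the LUT estimate) by 2^B, which is why the method combines high sparsity with low-bit activation quantization to keep per-neuron fan-in small.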
Related papers
- Low-latency machine learning FPGA accelerator for multi-qubit-state discrimination [1.6773398825542363]
Measuring a qubit state is a fundamental yet error-prone operation in quantum computing.
Here, we utilize an integrated approach to deploy neural networks onto field-programmable gate arrays (FPGAs).
We demonstrate that implementing a fully connected neural network accelerator for multi-qubit readout is advantageous.
arXiv Detail & Related papers (2024-07-04T11:34:43Z)
- NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions [2.7086888205833968]
Field-Programmable Gate Array (FPGA) accelerators have proven successful in handling latency- and resource-critical deep neural network (DNN) inference tasks.
We propose relaxing the boundaries of neurons and mapping entire sub-networks to a single LUT.
We validate our proposed method on a known latency-critical task, jet substructure tagging, and on the classical computer vision task, digit classification using MNIST.
arXiv Detail & Related papers (2024-02-29T16:10:21Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Signal Detection in MIMO Systems with Hardware Imperfections: Message Passing on Neural Networks [101.59367762974371]
In this paper, we investigate signal detection in multiple-input-multiple-output (MIMO) communication systems with hardware impairments.
It is difficult to train a deep neural network (DNN) with limited pilot signals, hindering its practical applications.
We design an efficient message passing based Bayesian signal detector, leveraging the unitary approximate message passing (UAMP) algorithm.
arXiv Detail & Related papers (2022-10-08T04:32:58Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- A White Paper on Neural Network Quantization [20.542729144379223]
We introduce state-of-the-art algorithms for mitigating the impact of quantization noise on the network's performance.
We consider two main classes of algorithms: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).
arXiv Detail & Related papers (2021-06-15T17:12:42Z)
- NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic [4.119948826527649]
Field-programmable gate array (FPGA)-based accelerators are gaining traction as a serious contender to replace graphics processing unit/central processing unit-based platforms.
This paper presents NullaNet Tiny, a framework for constructing resource and energy-efficient, ultra-low-latency FPGA-based neural network accelerators.
arXiv Detail & Related papers (2021-04-07T00:16:39Z)
- Exposing Hardware Building Blocks to Machine Learning Frameworks [4.56877715768796]
We focus on how to design topologies that complement such a view of neurons as unique functions.
We develop a library that supports training a neural network with custom sparsity and quantization.
arXiv Detail & Related papers (2020-04-10T14:26:00Z)