OpenHLS: High-Level Synthesis for Low-Latency Deep Neural Networks for
Experimental Science
- URL: http://arxiv.org/abs/2302.06751v2
- Date: Wed, 15 Feb 2023 16:51:43 GMT
- Title: OpenHLS: High-Level Synthesis for Low-Latency Deep Neural Networks for
Experimental Science
- Authors: Maksim Levental, Arham Khan, Ryan Chard, Kazutomo Yoshii, Kyle Chard,
Ian Foster
- Abstract summary: We present an open source, lightweight compiler framework for translating high-level representations of deep neural networks to low-level representations.
We show OpenHLS is able to produce an implementation of the network with a throughput of 4.8 $\mu$s/sample, which is approximately a 4$\times$ improvement over the existing implementation.
- Score: 0.6571063542099524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many experiment-driven scientific domains, such as high-energy physics,
materials science, and cosmology, high data rate experiments impose hard
constraints on data acquisition systems: collected data must either be
indiscriminately stored for post-processing and analysis, thereby necessitating
large storage capacity, or accurately filtered in real-time, thereby
necessitating low-latency processing. Deep neural networks, effective in other
filtering tasks, have not been widely employed in such data acquisition
systems, due to design and deployment difficulties. We present OpenHLS, an open
source, lightweight compiler framework without any proprietary dependencies,
based on high-level synthesis techniques, that translates high-level
representations of deep neural networks to low-level representations suitable
for deployment to near-sensor devices such as field-programmable gate arrays.
We evaluate OpenHLS on various workloads and present a case-study
implementation of a deep neural network for Bragg peak detection in the context
of high-energy diffraction microscopy. We show OpenHLS is able to produce an
implementation of the network with a throughput of 4.8 $\mu$s/sample, which is
approximately a 4$\times$ improvement over the existing implementation.
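To put the headline number in context, below is a hedged Python sketch: a toy frame-filtering CNN of the general kind the case study describes, plus the throughput arithmetic implied by 4.8 $\mu$s/sample. The model, its name, and its shapes are invented for illustration; this is neither the paper's network nor OpenHLS's API.

```python
# Hedged sketch: the kind of small classifier an HLS flow would lower to an
# FPGA for real-time filtering. All names and shapes here are hypothetical.
import torch
import torch.nn as nn

class BraggPeakFilter(nn.Module):  # hypothetical name, not the paper's model
    """Tiny classifier deciding whether a detector frame is worth keeping."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 2),  # keep / discard
        )

    def forward(self, x):
        return self.body(x)

logits = BraggPeakFilter()(torch.randn(1, 1, 32, 32))  # one detector frame

# Throughput arithmetic grounded in the abstract: 4.8 us/sample implies
print(f"{1 / 4.8e-6:,.0f} samples/s")  # ~208,333 samples/s
```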
Related papers
- Spectrum-Informed Multistage Neural Networks: Multiscale Function Approximators of Machine Precision [1.2663244405597374]
We propose using the novel multistage neural network approach to learn the residue from the previous stage.
We successfully tackle the spectral bias of neural networks.
This approach allows the neural network to fit target functions to double floating-point machine precision.
arXiv Detail & Related papers (2024-07-24T12:11:09Z)
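A minimal sketch of the multistage idea summarized in the entry above, under my reading of the abstract: each new stage is trained to fit the residue left by the sum of all earlier stages. The paper's spectrum-informed refinements are not modeled here.

```python
# Each stage fits the residue of the running prediction; the sum of stages
# approximates the target ever more closely.
import torch
import torch.nn as nn

def make_stage():
    return nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))

x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(8 * x)                      # target function
prediction = torch.zeros_like(y)          # running sum of stage outputs

for stage in range(3):
    residue = y - prediction              # what earlier stages failed to fit
    net = make_stage()
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(2000):
        opt.zero_grad()
        loss = ((net(x) - residue) ** 2).mean()
        loss.backward()
        opt.step()
    prediction = prediction + net(x).detach()
    print(stage, ((y - prediction) ** 2).mean().item())  # error shrinks per stage
```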
- Leveraging Frequency Domain Learning in 3D Vessel Segmentation [50.54833091336862]
In this study, we leverage Fourier domain learning as a substitute for multi-scale convolutional kernels in 3D hierarchical segmentation models.
We show that our novel network achieves remarkable Dice performance (84.37% on ASACA500 and 80.32% on ImageCAS) in tubular vessel segmentation tasks.
arXiv Detail & Related papers (2024-01-11T19:07:58Z)
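The entry above swaps multi-scale convolution for learning in the Fourier domain. Below is a hedged, FNO-style sketch of one spectral layer that keeps and reweights only the lowest frequency modes; it illustrates the general technique, not the paper's architecture.

```python
# Spectral convolution: FFT -> learned complex weights on low modes -> inverse FFT.
import torch
import torch.nn as nn

class SpectralConv3d(nn.Module):
    def __init__(self, channels, modes=8):
        super().__init__()
        self.modes = modes
        scale = 1 / channels
        # learned complex weights for the lowest `modes` frequencies per axis
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, modes, modes, modes,
                                dtype=torch.cfloat)
        )

    def forward(self, x):                          # x: (B, C, D, H, W)
        m = self.modes
        x_hat = torch.fft.rfftn(x, dim=(-3, -2, -1))
        out_hat = torch.zeros_like(x_hat)
        out_hat[:, :, :m, :m, :m] = torch.einsum(
            "bixyz,ioxyz->boxyz", x_hat[:, :, :m, :m, :m], self.weight
        )
        return torch.fft.irfftn(out_hat, s=x.shape[-3:], dim=(-3, -2, -1))

volume = torch.randn(1, 4, 32, 32, 32)
print(SpectralConv3d(4)(volume).shape)             # torch.Size([1, 4, 32, 32, 32])
```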
- Complex-Valued Neural Networks for Data-Driven Signal Processing and Signal Understanding [1.2691047660244337]
Complex-valued neural networks have emerged, boasting superior modeling performance for many tasks across the signal processing, sensing, and communications arenas.
This paper overviews a package built on PyTorch, intended to provide lightweight interfaces for common complex-valued neural network operations and architectures.
arXiv Detail & Related papers (2023-09-14T16:55:28Z)
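The entry above describes a PyTorch package for complex-valued networks. The sketch below shows two standard building blocks such packages provide, a complex linear layer and a split activation, written from scratch rather than with that package's actual API.

```python
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    """Complex weights W = A + iB realized with two real layers (bias omitted):
    (A + iB)(x + iy) = (Ax - By) + i(Ay + Bx)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.A = nn.Linear(in_features, out_features, bias=False)
        self.B = nn.Linear(in_features, out_features, bias=False)

    def forward(self, z):                           # z: complex tensor
        x, y = z.real, z.imag
        return torch.complex(self.A(x) - self.B(y), self.A(y) + self.B(x))

def crelu(z):
    """Split activation: ReLU applied to real and imaginary parts separately."""
    return torch.complex(torch.relu(z.real), torch.relu(z.imag))

iq_samples = torch.randn(8, 64, dtype=torch.cfloat)  # e.g. baseband I/Q data
print(crelu(ComplexLinear(64, 16)(iq_samples)).shape)
```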
- DLDNN: Deterministic Lateral Displacement Design Automation by Neural Networks [1.8365768330479992]
This paper investigates a fast, versatile design automation platform to address deterministic lateral displacement (DLD) problems.
Convolutional and artificial neural networks were employed to learn the velocity fields and critical diameters of a range of DLD configurations.
The developed tool was tested for 12 critical conditions and performed reliably with errors of less than 4%.
arXiv Detail & Related papers (2022-08-30T14:38:17Z)
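A hypothetical sketch of the surrogate-modeling idea in the DLDNN entry above: a small network regressing a DLD array's critical diameter from design parameters. The feature names and the synthetic data are invented for illustration.

```python
import torch
import torch.nn as nn

# hypothetical design parameters: [pillar diameter, gap, row shift, Reynolds no.]
designs = torch.rand(512, 4)
# synthetic stand-in for simulated critical diameters, not real DLD data
critical_diameter = designs @ torch.tensor([0.3, 0.5, 0.8, 0.1]) \
    + 0.05 * torch.randn(512)

surrogate = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = ((surrogate(designs).squeeze(1) - critical_diameter) ** 2).mean()
    loss.backward()
    opt.step()
print(f"final MSE: {loss.item():.4f}")   # fast surrogate replaces CFD solves
```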
- Learning Neural Light Fields with Ray-Space Embedding Networks [51.88457861982689]
We propose a novel neural light field representation that is compact and directly predicts integrated radiance along rays.
Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset.
arXiv Detail & Related papers (2021-12-02T18:59:51Z)
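The light-field entry above predicts integrated radiance with a single network query per ray, rather than integrating many point samples along the ray as NeRF-style methods do. A minimal sketch, with a raw 6D ray standing in for the paper's learned ray-space embedding:

```python
import torch
import torch.nn as nn

# one query per ray: (origin, direction) -> integrated RGB radiance
light_field = nn.Sequential(
    nn.Linear(6, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 3), nn.Sigmoid(),            # RGB in [0, 1]
)

origins = torch.zeros(1024, 3)                  # camera at the origin
directions = nn.functional.normalize(torch.randn(1024, 3), dim=1)
rgb = light_field(torch.cat([origins, directions], dim=1))
print(rgb.shape)                                # torch.Size([1024, 3])
```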
- CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a new variant of deep convolutional neural network architecture to improve the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations while the lower-frequency part is assigned cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
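A minimal sketch of the routing idea in the super-resolution entry above: split the input into frequency bands and give each band a differently sized branch. An FFT mask stands in here for the paper's DCT-domain split, and the two branches are toy stand-ins for its dynamic operators.

```python
import torch
import torch.nn as nn

def frequency_split(x, radius=8):
    """Return (low, high) frequency components of x via a circular FFT mask."""
    f = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    h, w = x.shape[-2:]
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    mask = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= radius ** 2
    low = torch.fft.ifft2(torch.fft.ifftshift(f * mask, dim=(-2, -1))).real
    return low, x - low

cheap = nn.Conv2d(1, 1, 3, padding=1)           # low-frequency branch
expensive = nn.Sequential(                      # high-frequency branch
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1)
)

image = torch.randn(1, 1, 64, 64)
low, high = frequency_split(image)
output = cheap(low) + expensive(high)           # spend compute where detail lives
print(output.shape)
```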
- Exploiting Heterogeneity in Operational Neural Networks by Synaptic Plasticity [87.32169414230822]
The recently proposed network model, Operational Neural Networks (ONNs), can generalize conventional Convolutional Neural Networks (CNNs).
In this study, the focus is on searching for the best-possible operator set(s) for the hidden neurons of the network, based on the Synaptic Plasticity paradigm that constitutes the essential learning theory in biological neurons.
Experimental results over highly challenging problems demonstrate that elite ONNs, even with few neurons and layers, can achieve superior learning performance compared to greedy iterative search (GIS)-based ONNs.
arXiv Detail & Related papers (2020-08-21T19:03:23Z)
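For the ONN entry above: operational neurons generalize the usual multiply-and-sum by drawing a nodal operator and a pool operator from a configurable set. A minimal dense-layer sketch of that idea follows; the paper's actual contribution, searching operator sets via Synaptic Plasticity, is not modeled here.

```python
import torch
import torch.nn as nn

NODAL_OPS = {                                   # replaces the multiply
    "mul": lambda x, w: x * w,
    "sin": lambda x, w: torch.sin(x * w),
}
POOL_OPS = {                                    # replaces the sum
    "sum": lambda t: t.sum(-1),
    "max": lambda t: t.amax(-1),
    "median": lambda t: t.median(-1).values,
}

class OperationalLayer(nn.Module):
    def __init__(self, in_features, out_features, nodal="sin", pool="sum"):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.nodal, self.pool = NODAL_OPS[nodal], POOL_OPS[pool]

    def forward(self, x):                       # x: (B, in_features)
        pairwise = self.nodal(x.unsqueeze(1), self.weight)  # (B, out, in)
        return self.pool(pairwise)                          # (B, out)

x = torch.randn(4, 10)
print(OperationalLayer(10, 5, nodal="sin", pool="max")(x).shape)  # (4, 5)
```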
- EagerNet: Early Predictions of Neural Networks for Computationally Efficient Intrusion Detection [2.223733768286313]
We propose a new architecture to detect network attacks with minimal resources.
The architecture can handle either binary or multiclass classification problems and allows prediction speed to be traded off against network accuracy.
arXiv Detail & Related papers (2020-07-27T11:31:37Z)
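A minimal sketch of the early-prediction idea in the EagerNet entry above: each block gets its own classifier head, and inference stops at the first head whose confidence clears a threshold. The shapes and the threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, in_features=32, hidden=64, classes=2, depth=3):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_features if i == 0 else hidden, hidden),
                           nn.ReLU())
             for i in range(depth)]
        )
        self.heads = nn.ModuleList([nn.Linear(hidden, classes) for _ in range(depth)])

    @torch.no_grad()
    def forward(self, x, threshold=0.9):        # x: a single sample (1, in_features)
        for i, (block, head) in enumerate(zip(self.blocks, self.heads)):
            x = block(x)
            probs = head(x).softmax(-1)
            if probs.max() >= threshold:        # confident enough: exit early
                return probs, i
        return probs, len(self.blocks) - 1      # fell through to the last head

flow_features = torch.randn(1, 32)              # e.g. one network-flow record
probs, exit_idx = EarlyExitNet()(flow_features)
print(f"exited at head {exit_idx}")
```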
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
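A minimal sketch of the low-rank filter-sharing idea in the ALF entry above, with a truncated SVD standing in for the paper's learned autoencoder: each layer's filters become mixtures of a small shared filter basis.

```python
import torch

out_ch, in_ch, k = 64, 32, 3
filters = torch.randn(out_ch, in_ch * k * k)    # flattened conv filter bank

rank = 16                                       # size of the shared basis
U, S, Vh = torch.linalg.svd(filters, full_matrices=False)
coefficients = U[:, :rank] * S[:rank]           # per-filter mixing weights
basis = Vh[:rank]                               # shared low-rank filter basis

approx = coefficients @ basis
params_before = filters.numel()
params_after = coefficients.numel() + basis.numel()
print(f"params: {params_before} -> {params_after}, "
      f"rel. error {(filters - approx).norm() / filters.norm():.3f}")
```

With these toy shapes the factorization keeps about 31% of the original parameters, the same ballpark as the 70% reduction quoted above.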
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
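A sketch in the spirit of the gradient-free entry above, not the paper's recursive local representation alignment procedure: each layer trains only against a local loss through a fixed random readout, and detach() keeps gradients from crossing layer boundaries, so there is no end-to-end backpropagation.

```python
import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Sequential(nn.Linear(20, 64), nn.Tanh()),
                        nn.Sequential(nn.Linear(64, 64), nn.Tanh())])
readouts = [torch.randn(64, 2) * 0.1 for _ in layers]   # fixed, untrained
opts = [torch.optim.SGD(l.parameters(), lr=0.1) for l in layers]

x = torch.randn(128, 20)
targets = (x.sum(1) > 0).long()                          # toy labels

for _ in range(200):
    h = x
    for layer, readout, opt in zip(layers, readouts, opts):
        h = layer(h.detach())                            # block cross-layer grads
        loss = nn.functional.cross_entropy(h @ readout, targets)
        opt.zero_grad()
        loss.backward()                                  # stays within this layer
        opt.step()
        h = h.detach()
print(f"final local loss: {loss.item():.3f}")
```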