PulseDL-II: A System-on-Chip Neural Network Accelerator for Timing and
Energy Extraction of Nuclear Detector Signals
- URL: http://arxiv.org/abs/2209.00884v1
- Date: Fri, 2 Sep 2022 08:52:21 GMT
- Title: PulseDL-II: A System-on-Chip Neural Network Accelerator for Timing and
Energy Extraction of Nuclear Detector Signals
- Authors: Pengcheng Ai, Zhi Deng, Yi Wang, Hui Gong, Xinchi Ran, Zijian Lang
- Abstract summary: We introduce PulseDL-II, a system-on-chip (SoC) specially designed for extracting event features (time, energy, etc.) from detector pulses with deep learning.
The proposed system achieved 60 ps time resolution and 0.40% energy resolution with online neural network inference at a signal-to-noise ratio (SNR) of 47.4 dB.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Front-end electronics equipped with high-speed digitizers are being used and
proposed for future nuclear detectors. Recent literature reveals that deep
learning models, especially one-dimensional convolutional neural networks, are
promising when dealing with digital signals from nuclear detectors. Simulations
and experiments demonstrate the satisfactory accuracy and additional benefits
of neural networks in this area. However, dedicated hardware that accelerates such
models for online operation still needs to be studied. In this work, we introduce
PulseDL-II, a system-on-chip (SoC) specially designed for extracting event features
(time, energy, etc.) from detector pulses with deep learning. Building on the
previous version, PulseDL-II incorporates a RISC CPU into the system structure for
better functional flexibility and integrity.
The neural network accelerator in the SoC adopts a three-level (arithmetic
unit, processing element, neural network) hierarchical architecture and
facilitates parameter optimization of the digital design. Furthermore, we
devise a quantization scheme and associated implementation methods (rescale &
bit-shift) for full compatibility with deep learning frameworks (e.g.,
TensorFlow) within a selected subset of layer types. With the current scheme,
the quantization-aware training of neural networks is supported, and network
models are automatically converted into software for the RISC CPU by dedicated
scripts, with nearly no loss of accuracy. We validate PulseDL-II on field
programmable gate arrays (FPGAs). Finally, system validation is done with an
experimental setup made up of a direct digital synthesis (DDS) signal generator
and an FPGA development board with analog-to-digital converters (ADCs). The
proposed system achieved 60 ps time resolution and 0.40% energy resolution with
online neural network inference at a signal-to-noise ratio (SNR) of 47.4 dB.
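The abstract names the rescale & bit-shift trick without detailing it; the NumPy sketch below shows the general idea behind such integer-only requantization (approximating a floating-point rescale factor by an integer multiplier and a right shift). The bit widths, scale, and accumulator values are illustrative assumptions, not PulseDL-II's actual parameters.
```python
import numpy as np

def quantize_scale(scale_fp, shift_bits=15):
    """Approximate a floating-point rescale factor as an integer multiplier
    plus a right bit-shift: scale_fp ~= mult / 2**shift_bits."""
    mult = int(round(scale_fp * (1 << shift_bits)))
    return mult, shift_bits

def requantize(acc_int32, mult, shift_bits):
    """Rescale int32 accumulators to int8 using only an integer multiply,
    an add (rounding), and an arithmetic right shift."""
    rounded = (acc_int32.astype(np.int64) * mult + (1 << (shift_bits - 1))) >> shift_bits
    return np.clip(rounded, -128, 127).astype(np.int8)

# Illustrative use: requantizing the int32 accumulators of a quantized layer.
acc = np.array([12345, -6789, 250000], dtype=np.int32)  # assumed accumulator values
scale = 0.00372                                          # assumed combined scale s_in * s_w / s_out
mult, shift = quantize_scale(scale)
print(requantize(acc, mult, shift))                      # -> int8 outputs
```
Because only integer multiplies and shifts are involved, the same arithmetic can be reproduced exactly in digital logic, which is what makes a quantization-aware-trained model (e.g., from TensorFlow) convertible with nearly no loss of accuracy.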
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
- Adaptive Robotic Arm Control with a Spiking Recurrent Neural Network on a Digital Accelerator [41.60361484397962]
We present an overview of the system and a Python framework for using it on a Pynq ZU platform.
We show how the simulated accuracy is preserved with a peak performance of 3.8M events processed per second.
arXiv Detail & Related papers (2024-05-21T14:59:39Z)
- ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition by Knowledge Distillation [23.068233043023834]
We propose a novel neural network named convolution-linked signal transformer (ClST) and a novel knowledge distillation method named signal knowledge distillation (SKD).
SKD is a knowledge distillation method that effectively reduces the parameters and complexity of neural networks.
We train two lightweight neural networks, KD-CNN and KD-MobileNet, with the SKD algorithm so that they can be deployed on miniaturized devices.
arXiv Detail & Related papers (2023-12-29T03:01:46Z)
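The summary does not spell out SKD itself; for orientation, here is a minimal PyTorch sketch of standard response-based knowledge distillation (temperature-softened teacher logits blended with hard-label cross-entropy), which methods such as SKD build on. The temperature, weighting, and tensor shapes are illustrative assumptions.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term between
    temperature-softened teacher and student distributions."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes stay comparable across temperatures
    return alpha * hard + (1.0 - alpha) * soft

# Example: 8 samples, 10 classes (shapes are illustrative).
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student, teacher, labels).backward()
```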
- DYNAP-SE2: a scalable multi-core dynamic neuromorphic asynchronous spiking neural network processor [2.9175555050594975]
We present a brain-inspired platform for prototyping real-time event-based Spiking Neural Networks (SNNs).
The system proposed supports the direct emulation of dynamic and realistic neural processing phenomena such as short-term plasticity, NMDA gating, AMPA diffusion, homeostasis, spike frequency adaptation, conductance-based dendritic compartments and spike transmission delays.
The flexibility to emulate different biologically plausible neural networks, and the chip's ability to monitor both population and single-neuron signals in real time, make it possible to develop and validate complex models of neural processing for both basic research and edge-computing applications.
arXiv Detail & Related papers (2023-10-01T03:48:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Deep Convolutional Learning-Aided Detector for Generalized Frequency Division Multiplexing with Index Modulation [0.0]
The proposed method first pre-processes the received signal with a zero-forcing (ZF) detector and then applies a neural network consisting of a convolutional neural network (CNN) followed by a fully-connected neural network (FCNN).
The FCNN part uses only two fully-connected layers, which can be adapted to yield a trade-off between complexity and bit error rate (BER) performance.
It has been demonstrated that the proposed deep convolutional neural network-based detection and demodulation scheme provides better BER performance than the ZF detector with a reasonable increase in complexity.
arXiv Detail & Related papers (2022-02-06T22:18:42Z)
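Layer sizes are not given in this summary; a minimal Keras sketch of the described pipeline (ZF-equalized samples fed into a 1D CNN followed by a two-layer fully-connected head) under assumed dimensions might look like this:
```python
import tensorflow as tf

# Assumed dimensions (not from the paper): 64 ZF-equalized symbols,
# real/imaginary parts as two input channels, 256 output hypotheses.
N_SYMBOLS, N_CLASSES = 64, 256

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_SYMBOLS, 2)),
    tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu"),   # CNN stage
    tf.keras.layers.Conv1D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),                      # FC layer 1
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),             # FC layer 2
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```
Shrinking or widening the two Dense layers is where the stated complexity/BER trade-off would be tuned.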
- Two-Timescale End-to-End Learning for Channel Acquisition and Hybrid Precoding [94.40747235081466]
We propose an end-to-end deep learning-based joint transceiver design algorithm for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems.
We develop a DNN architecture that maps the received pilots into feedback bits at the receiver, and then further maps the feedback bits into the hybrid precoder at the transmitter.
arXiv Detail & Related papers (2021-10-22T20:49:02Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition [101.69873988328808]
We build upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction.
The input speech is first up-streamed to a quantum computing server to extract the Mel-spectrogram.
The corresponding convolutional features are encoded using a quantum circuit algorithm with random parameters.
The encoded features are then down-streamed to the local RNN model for the final recognition.
arXiv Detail & Related papers (2020-10-26T03:36:01Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML [13.325670094073383]
We present the implementation of binary and ternary neural networks in the hls4ml library.
We discuss the trade-off between model accuracy and resource consumption.
The binary and ternary implementations have performance similar to the higher-precision implementation while using drastically fewer FPGA resources.
arXiv Detail & Related papers (2020-03-11T10:46:51Z)
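hls4ml's actual binary/ternary kernels are not reproduced here; as a rough illustration of what ternary weight precision means, a NumPy sketch under an assumed thresholding heuristic:
```python
import numpy as np

def ternarize(weights, threshold_ratio=0.7):
    """Map float weights to {-1, 0, +1} plus one per-tensor scale.
    The 0.7 * mean(|w|) threshold is a common heuristic, used here
    purely as an illustrative assumption."""
    delta = threshold_ratio * np.mean(np.abs(weights))
    q = np.zeros_like(weights, dtype=np.int8)
    q[weights > delta] = 1
    q[weights < -delta] = -1
    nonzero = q != 0
    scale = float(np.mean(np.abs(weights[nonzero]))) if nonzero.any() else 0.0
    return q, scale

w = np.random.randn(8, 8).astype(np.float32)   # illustrative weight tensor
q, s = ternarize(w)
w_approx = s * q                               # ternary approximation used at inference
print(np.mean((w - w_approx) ** 2))            # reconstruction error
```
Restricting weights to three levels replaces multipliers with sign flips and selects, which is where the FPGA resource savings come from.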