PulseDL-II: A System-on-Chip Neural Network Accelerator for Timing and
Energy Extraction of Nuclear Detector Signals
- URL: http://arxiv.org/abs/2209.00884v1
- Date: Fri, 2 Sep 2022 08:52:21 GMT
- Title: PulseDL-II: A System-on-Chip Neural Network Accelerator for Timing and
Energy Extraction of Nuclear Detector Signals
- Authors: Pengcheng Ai, Zhi Deng, Yi Wang, Hui Gong, Xinchi Ran, Zijian Lang
- Abstract summary: We introduce PulseDL-II, a system-on-chip (SoC) specially designed for extracting event features (time, energy, etc.) from detector pulses with deep learning.
The proposed system achieved 60 ps time resolution and 0.40% energy resolution with online neural network inference at a signal-to-noise ratio (SNR) of 47.4 dB.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Front-end electronics equipped with high-speed digitizers are being used and
proposed for future nuclear detectors. Recent literature reveals that deep
learning models, especially one-dimensional convolutional neural networks, are
promising when dealing with digital signals from nuclear detectors. Simulations
and experiments demonstrate the satisfactory accuracy and additional benefits
of neural networks in this area. However, dedicated hardware that accelerates such
models for online operation still needs to be studied. In this work, we introduce
PulseDL-II, a system-on-chip (SoC) specially designed for extracting event features
(time, energy, etc.) from detector pulses with deep learning. Building on the
previous version, PulseDL-II incorporates a RISC CPU into the system structure for
better functional flexibility and integrity.
The neural network accelerator in the SoC adopts a three-level (arithmetic
unit, processing element, neural network) hierarchical architecture and
facilitates parameter optimization of the digital design. Furthermore, we
devise a quantization scheme and associated implementation methods (rescale &
bit-shift) for full compatibility with deep learning frameworks (e.g.,
TensorFlow) within a selected subset of layer types. With the current scheme,
the quantization-aware training of neural networks is supported, and network
models are automatically converted into software for the RISC CPU by dedicated
scripts, with nearly no loss of accuracy. We validate PulseDL-II on field
programmable gate arrays (FPGAs). Finally, system validation is done with an
experimental setup made up of a direct digital synthesis (DDS) signal generator
and an FPGA development board with analog-to-digital converters (ADCs). The
proposed system achieved 60 ps time resolution and 0.40% energy resolution with
online neural network inference at a signal-to-noise ratio (SNR) of 47.4 dB.
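The abstract names the rescale & bit-shift trick without detailing it; the NumPy sketch below shows the general idea behind such integer-only requantization (approximating a floating-point rescale factor by an integer multiplier and a right shift). The bit widths, scale, and accumulator values are illustrative assumptions, not PulseDL-II's actual parameters.
```python
import numpy as np

def quantize_scale(scale_fp, shift_bits=15):
    """Approximate a floating-point rescale factor as an integer multiplier
    plus a right bit-shift: scale_fp ~= mult / 2**shift_bits."""
    mult = int(round(scale_fp * (1 << shift_bits)))
    return mult, shift_bits

def requantize(acc_int32, mult, shift_bits):
    """Rescale int32 accumulators to int8 using only an integer multiply,
    an add (rounding), and an arithmetic right shift."""
    rounded = (acc_int32.astype(np.int64) * mult + (1 << (shift_bits - 1))) >> shift_bits
    return np.clip(rounded, -128, 127).astype(np.int8)

# Illustrative use: requantizing the int32 accumulators of a quantized layer.
acc = np.array([12345, -6789, 250000], dtype=np.int32)  # assumed accumulator values
scale = 0.00372                                          # assumed combined scale s_in * s_w / s_out
mult, shift = quantize_scale(scale)
print(requantize(acc, mult, shift))                      # -> int8 outputs
```
Because only integer multiplies and shifts are involved, the same arithmetic can be reproduced exactly in digital logic, which is what makes a quantization-aware-trained model (e.g., from TensorFlow) convertible with nearly no loss of accuracy.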
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
- Adaptive Robotic Arm Control with a Spiking Recurrent Neural Network on a Digital Accelerator [41.60361484397962]
We present an overview of the system and a Python framework for using it on a Pynq ZU platform.
We show how the simulated accuracy is preserved with a peak performance of 3.8M events processed per second.
arXiv Detail & Related papers (2024-05-21T14:59:39Z)
- ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition by Knowledge Distillation [23.068233043023834]
We propose a novel neural network named convolution-linked signal transformer (ClST) and a novel knowledge distillation method named signal knowledge distillation (SKD).
SKD is a knowledge distillation method that effectively reduces the parameters and complexity of neural networks.
We train two lightweight neural networks, KD-CNN and KD-MobileNet, with the SKD algorithm so that they can be deployed on miniaturized devices.
arXiv Detail & Related papers (2023-12-29T03:01:46Z)
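The summary does not spell out SKD itself; for orientation, here is a minimal PyTorch sketch of standard response-based knowledge distillation (temperature-softened teacher logits blended with hard-label cross-entropy), which methods such as SKD build on. The temperature, weighting, and tensor shapes are illustrative assumptions.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term between
    temperature-softened teacher and student distributions."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes stay comparable across temperatures
    return alpha * hard + (1.0 - alpha) * soft

# Example: 8 samples, 10 classes (shapes are illustrative).
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student, teacher, labels).backward()
```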
- DYNAP-SE2: a scalable multi-core dynamic neuromorphic asynchronous spiking neural network processor [2.9175555050594975]
We present a brain-inspired platform for prototyping real-time event-based Spiking Neural Networks (SNNs).
The system proposed supports the direct emulation of dynamic and realistic neural processing phenomena such as short-term plasticity, NMDA gating, AMPA diffusion, homeostasis, spike frequency adaptation, conductance-based dendritic compartments and spike transmission delays.
The flexibility to emulate different biologically plausible neural networks, and the chip's ability to monitor both population and single-neuron signals in real time, make it possible to develop and validate complex models of neural processing for both basic research and edge-computing applications.
arXiv Detail & Related papers (2023-10-01T03:48:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Deep Convolutional Learning-Aided Detector for Generalized Frequency Division Multiplexing with Index Modulation [0.0]
The proposed method first pre-processes the received signal with a zero-forcing (ZF) detector and then applies a neural network consisting of a convolutional neural network (CNN) followed by a fully-connected neural network (FCNN).
The FCNN part uses only two fully-connected layers, which can be adapted to yield a trade-off between complexity and bit error rate (BER) performance.
It has been demonstrated that the proposed deep convolutional neural network-based detection and demodulation scheme provides better BER performance than the ZF detector with a reasonable increase in complexity.
arXiv Detail & Related papers (2022-02-06T22:18:42Z)
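Layer sizes are not given in this summary; a minimal Keras sketch of the described pipeline (ZF-equalized samples fed into a 1D CNN followed by a two-layer fully-connected head) under assumed dimensions might look like this:
```python
import tensorflow as tf

# Assumed dimensions (not from the paper): 64 ZF-equalized symbols,
# real/imaginary parts as two input channels, 256 output hypotheses.
N_SYMBOLS, N_CLASSES = 64, 256

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_SYMBOLS, 2)),
    tf.keras.layers.Conv1D(32, 5, padding="same", activation="relu"),   # CNN stage
    tf.keras.layers.Conv1D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),                      # FC layer 1
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),             # FC layer 2
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```
Shrinking or widening the two Dense layers is where the stated complexity/BER trade-off would be tuned.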
- Two-Timescale End-to-End Learning for Channel Acquisition and Hybrid Precoding [94.40747235081466]
We propose an end-to-end deep learning-based joint transceiver design algorithm for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems.
We develop a DNN architecture that maps the received pilots into feedback bits at the receiver, and then further maps the feedback bits into the hybrid precoder at the transmitter.
arXiv Detail & Related papers (2021-10-22T20:49:02Z)
- A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z)
- Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition [101.69873988328808]
We build upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction.
The input speech is first up-streamed to a quantum computing server to extract the Mel-spectrogram.
The corresponding convolutional features are encoded using a quantum circuit algorithm with random parameters.
The encoded features are then down-streamed to the local RNN model for the final recognition.
arXiv Detail & Related papers (2020-10-26T03:36:01Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML [13.325670094073383]
We present the implementation of binary and ternary neural networks in the hls4ml library.
We discuss the trade-off between model accuracy and resource consumption.
The binary and ternary implementations have performance similar to the higher-precision implementation while using drastically fewer FPGA resources.
arXiv Detail & Related papers (2020-03-11T10:46:51Z)
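hls4ml's actual binary/ternary kernels are not reproduced here; as a rough illustration of what ternary weight precision means, a NumPy sketch under an assumed thresholding heuristic:
```python
import numpy as np

def ternarize(weights, threshold_ratio=0.7):
    """Map float weights to {-1, 0, +1} plus one per-tensor scale.
    The 0.7 * mean(|w|) threshold is a common heuristic, used here
    purely as an illustrative assumption."""
    delta = threshold_ratio * np.mean(np.abs(weights))
    q = np.zeros_like(weights, dtype=np.int8)
    q[weights > delta] = 1
    q[weights < -delta] = -1
    nonzero = q != 0
    scale = float(np.mean(np.abs(weights[nonzero]))) if nonzero.any() else 0.0
    return q, scale

w = np.random.randn(8, 8).astype(np.float32)   # illustrative weight tensor
q, s = ternarize(w)
w_approx = s * q                               # ternary approximation used at inference
print(np.mean((w - w_approx) ** 2))            # reconstruction error
```
Restricting weights to three levels replaces multipliers with sign flips and selects, which is where the FPGA resource savings come from.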