Related papers: Low-Latency FPGA Control System for Real-Time Neural Network Processing in CCD-Based Trapped-Ion Qubit Measurement

URL: http://arxiv.org/abs/2512.15838v1
Date: Wed, 17 Dec 2025 18:34:00 GMT
Title: Low-Latency FPGA Control System for Real-Time Neural Network Processing in CCD-Based Trapped-Ion Qubit Measurement
Authors: Binglei Lou, Gautham Duddi Krishnaswaroop, Filip Wojcicki, Ruilin Wu, Richard Rademacher, Zhiqiang Que, Wayne Luk, Philip H. W. Leong
Abstract summary: This work benchmarks the latency of deep neural networks (DNNs)-based qubit detection on field-programmable gate arrays (FPGAs) and graphics processing units (GPUs). The FPGA solution directly interfaces an electron-multiplying charge-coupled device (EMCCD) with the subsequent data processing logic, eliminating buffering and interface overheads. We deploy Multilayer Perceptron (MLP) and Vision Transformer (ViT) models on hardware to evaluate measurement performance.
Score: 5.983860563083656
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate and low-latency qubit state measurement is critical for trapped-ion quantum computing. While deep neural networks (DNNs) have been integrated to enhance detection fidelity, their latency performance on specific hardware platforms remains underexplored. This work benchmarks the latency of DNN-based qubit detection on field-programmable gate arrays (FPGAs) and graphics processing units (GPUs). The FPGA solution directly interfaces an electron-multiplying charge-coupled device (EMCCD) with the subsequent data processing logic, eliminating buffering and interface overheads. As a baseline, the GPU-based system employs a high-speed PCIe image grabber for image input and I/O card for state output. We deploy Multilayer Perceptron (MLP) and Vision Transformer (ViT) models on hardware to evaluate measurement performance. Compared to conventional thresholding, DNNs reduce the mean measurement fidelity (MMF) error by factors of 1.8-2.5x (one-qubit case) and 4.2-7.6x (three-qubit case). FPGA-based MLP and ViT achieve nanosecond- and microsecond-scale inference latencies, while the complete single-shot measurement process achieves over 100x speedup compared to the GPU implementation. Additionally, clock-cycle-level signal analysis reveals inefficiencies in EMCCD data transmission via Cameralink, suggesting that optimizing this interface could further leverage the advantages of ultra-low-latency DNN inference, guiding the development of next-generation qubit detection systems.

Related papers

LUNA: LUT-Based Neural Architecture for Fast and Low-Cost Qubit Readout [0.0]
LUNA is a superconducting qubit readout accelerator that combines low-cost integrator-based preprocessing with Look-Up Table (LUT) based neural networks. We show up to a 10.95x reduction in area and 30% lower latency with little to no loss in fidelity compared to the state-of-the-art.
arXiv Detail & Related papers (2025-12-08T18:41:13Z)
Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing [0.0]
We propose an online training algorithm based on a One-Sided Jacobi rotation-based Online Sequential Extreme Learning Machine (OSOS-ELM). We fully exploit parallelism in executing OSOS-ELM on a heterogeneous FPGA with integrated ARM cores. We validate our approach through three case studies involving single-photon signal analysis.
arXiv Detail & Related papers (2025-04-12T00:58:52Z)
Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs. At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads. At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
Embedded Graph Convolutional Networks for Real-Time Event Data Processing on SoC FPGAs [0.815557531820863]
We introduce a custom EFGCN (Event-based FPGA-accelerated Graph Convolutional Network) designed with a series of hardware-aware optimisations tailored for PointNetConv. The proposed techniques result in up to 100-fold reduction in model size compared to Asynchronous Event-based GNN (AEGNN). Our approach achieves state-of-the-art performance across multiple event-based classification benchmarks while remaining highly scalable, customisable and resource-efficient.
arXiv Detail & Related papers (2024-06-11T14:47:36Z)
Implementing Neural Network-Based Equalizers in a Coherent Optical Transmission System Using Field-Programmable Gate Arrays [3.1543509940301946]
We show the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware.
arXiv Detail & Related papers (2022-12-09T07:28:45Z)
Signal Detection in MIMO Systems with Hardware Imperfections: Message Passing on Neural Networks [101.59367762974371]
In this paper, we investigate signal detection in multiple-input-multiple-output (MIMO) communication systems with hardware impairments. It is difficult to train a deep neural network (DNN) with limited pilot signals, hindering its practical applications. We design an efficient message passing based Bayesian signal detector, leveraging the unitary approximate message passing (UAMP) algorithm.
arXiv Detail & Related papers (2022-10-08T04:32:58Z)
LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics [45.666822327616046]
This work presents a novel reconfigurable architecture for Low Latency Graph Neural Network (LL-GNN) designs for particle detectors. The LL-GNN design advances the next generation of trigger systems by enabling sophisticated algorithms to process experimental data efficiently.
arXiv Detail & Related papers (2022-09-28T12:55:35Z)
MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process. We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
Two-Timescale End-to-End Learning for Channel Acquisition and Hybrid Precoding [94.40747235081466]
We propose an end-to-end deep learning-based joint transceiver design algorithm for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems. We develop a DNN architecture that maps the received pilots into feedback bits at the receiver, and then further maps the feedback bits into the hybrid precoder at the transmitter.
arXiv Detail & Related papers (2021-10-22T20:49:02Z)
NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic [4.119948826527649]
Field-programmable gate array (FPGA)-based accelerators are gaining traction as a serious contender to replace graphics processing unit/central processing unit-based platforms. This paper presents NullaNet Tiny, a framework for constructing resource and energy-efficient, ultra-low-latency FPGA-based neural network accelerators.
arXiv Detail & Related papers (2021-04-07T00:16:39Z)
Interleaving: Modular architectures for fault-tolerant photonic quantum computing [50.591267188664666]
Photonic fusion-based quantum computing (FBQC) uses low-loss photonic delays. We present a modular architecture for FBQC in which these components are combined to form "interleaving modules". Exploiting the multiplicative power of delays, each module can add thousands of physical qubits to the computational Hilbert space.
arXiv Detail & Related papers (2021-03-15T18:00:06Z)
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks. We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimization for multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)