Related papers: End-to-end workflow for machine learning-based qubit readout with QICK and hls4ml

End-to-end workflow for machine learning-based qubit readout with QICK and hls4ml

URL: http://arxiv.org/abs/2501.14663v1
Date: Fri, 24 Jan 2025 17:35:18 GMT
Title: End-to-end workflow for machine learning-based qubit readout with QICK and hls4ml
Authors: Giuseppe Di Guglielmo, Botao Du, Javier Campos, Alexandra Boltasseva, Akash V. Dixit, Farah Fahim, Zhaxylyk Kudyshev, Santiago Lopez, Ruichao Ma, Gabriel N. Perdue, Nhan Tran, Omer Yesilyurt, Daniel Bowring,
Abstract summary: We present an end-to-end workflow for superconducting qubit readout that embeds co-designed Neural Networks (NNs) into the Quantum Instrumentation Control Kit (QICK)<n>We aim to leverage machine learning (ML) to address critical challenges in qubit readout accuracy and scalability.<n>We experimentally demonstrate the design, optimization, and integration of an ML algorithm for single transmon qubit readout, achieving 96% single-shot fidelity with a latency of 32ns and less than 16% FPGA look-up table resource utilization.
Score: 31.339944736585327
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present an end-to-end workflow for superconducting qubit readout that embeds co-designed Neural Networks (NNs) into the Quantum Instrumentation Control Kit (QICK). Capitalizing on the custom firmware and software of the QICK platform, which is built on Xilinx RFSoC FPGAs, we aim to leverage machine learning (ML) to address critical challenges in qubit readout accuracy and scalability. The workflow utilizes the hls4ml package and employs quantization-aware training to translate ML models into hardware-efficient FPGA implementations via user-friendly Python APIs. We experimentally demonstrate the design, optimization, and integration of an ML algorithm for single transmon qubit readout, achieving 96% single-shot fidelity with a latency of 32ns and less than 16% FPGA look-up table resource utilization. Our results offer the community an accessible workflow to advance ML-driven readout and adaptive control in quantum information processing applications.

Related papers

FINN-GL: Generalized Mixed-Precision Extensions for FPGA-Accelerated LSTMs [10.064394911426422]
Recurrent neural networks (RNNs) are effective for time-series tasks like sentiment analysis and short-term stock prediction.<n>Their computational complexity poses challenges for real-time deployment in resource constrained environments.<n>FPGAs offer a promising platform for energy-efficient AI acceleration.
arXiv Detail & Related papers (2025-06-25T20:07:46Z)
RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [53.571195477043496]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE)<n>RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers.<n>Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z)
Demonstrating the Potential of Adaptive LMS Filtering on FPGA-Based Qubit Control Platforms for Improved Qubit Readout in 2D and 3D Quantum Processing Units [3.348076908667385]
This abstract presents our research intended for optimizing readout pulse fidelity for 2D and 3D Quantum Processing Units (QPUs) We focus on the application of the Least Mean Squares (LMS) adaptive filtering algorithm to enhance the accuracy and efficiency of qubit state detection. Our preliminary results demonstrate the LMS filter's capability to maintain high readout accuracy while efficiently managing FPGA resources.
arXiv Detail & Related papers (2024-08-01T20:42:49Z)
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer [54.713778961605115]
Vision Transformer (ViT) has become one of the most prevailing fundamental backbone networks in the computer vision community. We propose a novel non-uniform quantizer, dubbed the Adaptive Logarithm AdaLog (AdaLog) quantizer.
arXiv Detail & Related papers (2024-07-17T18:38:48Z)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate these demands by compressing and accelerating large language models. We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization. Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z)
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving [52.31791050376249]
Quantization can accelerate large language model (LLM) inference. We introduce QoQ, a W4A8KV4 quantization algorithm with 4-bit weight, 8-bit activation, and 4-bit KV cache. QServe improves the maximum achievable serving of Llama-3-8B by 1.2x on A100, 1.4x on L40S; and Qwen-721.5B by 2.4x on A100, 3.5x on L40S.
arXiv Detail & Related papers (2024-05-07T17:59:30Z)
Investigating Resource-efficient Neutron/Gamma Classification ML Models Targeting eFPGAs [0.0]
Open-source embedded FPGA (eFPGA) frameworks provide an alternate, more flexible pathway for implementing machine learning models in hardware. We explore the parameter space for eFPGA implementations of fully-connected neural network (fcNN) and boosted decision tree (BDT) models. The results of the study will be used to aid the specification of an eFPGA fabric, which will be integrated as part of a test chip.
arXiv Detail & Related papers (2024-04-19T20:03:30Z)
Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs [30.325651150798915]
Quantization is a crucial technique for deploying deep learning models on resource-constrained devices, such as embedded FPGAs. We present QFX, a trainable fixed-point quantization approach that automatically learns the binary-point position during model training. QFX is implemented as a PyTorch-based library that efficiently emulates fixed-point arithmetic, supported by FPGA HLS.
arXiv Detail & Related papers (2024-01-31T02:18:27Z)
On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks [52.97107229149988]
We propose an On-Chip Hardware-Aware Quantization framework, performing hardware-aware mixed-precision quantization on deployed edge devices. For efficiency metrics, we built an On-Chip Quantization Aware pipeline, which allows the quantization process to perceive the actual hardware efficiency of the quantization operator. For accuracy metrics, we propose Mask-Guided Quantization Estimation technology to effectively estimate the accuracy impact of operators in the on-chip scenario.
arXiv Detail & Related papers (2023-09-05T04:39:34Z)
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance [49.1574468325115]
accumulator-aware quantization (A2Q) is a novel weight quantization method designed to train quantized neural networks (QNNs) to avoid overflow during inference. A2Q introduces a unique formulation inspired by weight normalization that constrains the L1-norm of model weights according to accumulator bit width bounds. We show A2Q can train QNNs for low-precision accumulators while maintaining model accuracy competitive with a floating-point baseline.
arXiv Detail & Related papers (2023-08-25T17:28:58Z)
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models [57.27101446992148]
Large language models (LLMs) have revolutionized natural language processing tasks. Recent post-training quantization (PTQ) methods are effective in reducing memory footprint and improving the computational efficiency of LLM. We introduce an Omnidirectionally calibrated Quantization technique for LLMs, which achieves good performance in diverse quantization settings.
arXiv Detail & Related papers (2023-08-25T02:28:35Z)
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs [49.358119307844035]
We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs) This makes efficient NN implementations in hardware accessible to nonexperts, in a single open-sourced workflow. We demonstrate the workflow in a particle physics application involving trigger decisions that must operate at the 40 MHz collision rate of the Large Hadron Collider (LHC) We implement an optimized mixed-precision NN for high-momentum particle jets in simulated LHC proton-proton collisions.
arXiv Detail & Related papers (2023-04-13T18:00:01Z)
Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark [11.575901540758574]
We present our development experience for the Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN perJ, which aim to democratize AI- hardware codesign of optimized neural networks on FPGAs. The solutions are deployed on system-on-chip (Pynq-Z2) and pure FPGA (Arty A7-100T) platforms.
arXiv Detail & Related papers (2022-06-23T15:57:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.