Even More Efficient Soft-Output Decoding with Extra-Cluster Growth and Early Stopping
- URL: http://arxiv.org/abs/2602.03336v1
- Date: Tue, 03 Feb 2026 10:00:40 GMT
- Title: Even More Efficient Soft-Output Decoding with Extra-Cluster Growth and Early Stopping
- Authors: Kaito Kishi, Riki Toshio, Jun Fujisaki, Hirotaka Oshima, Shintaro Sato, Keisuke Fujii,
- Abstract summary: We introduce two types of novel soft-outputs: the bounded cluster gap and the extra-cluster gap.<n>The latter, the extra-cluster gap, quantifies decoder reliability by performing a small, additional growth of the clusters obtained by the decoder.<n>These techniques offer lower computational complexity and higher hardware compatibility.
- Score: 2.370310454459195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In fault-tolerant quantum computing, soft outputs from real-time decoders play a crucial role in improving decoding accuracy, post-selecting magic states, and accelerating lattice surgery. A recent paper by Meister et al. [arXiv:2405.07433 (2024)] proposed an efficient method to evaluate soft outputs for cluster-based decoders, including the Union-Find (UF) decoder. However, in parallel computing environments, its computational complexity is comparable to or even surpasses that of the UF decoder itself, resulting in a substantial overhead. Furthermore, this method requires global information about the decoding graph, making it poorly suited for existing hardware implementations of the UF decoder on Field-Programmable Gate Arrays (FPGAs). In this paper, to alleviate these issues, we develop more efficient methods for evaluating high-quality soft outputs in cluster-based decoders by introducing several early-stopping techniques. Our central idea is that the precise value of a large soft output is often unnecessary in practice. Based on this insight, we introduce two types of novel soft-outputs: the bounded cluster gap and the extra-cluster gap. The former reduces the computational complexity of Meister's method by terminating the calculation at an early stage. Our numerical simulations show that this method achieves improved scaling with code distance $d$ compared to the original proposal. The latter, the extra-cluster gap, quantifies decoder reliability by performing a small, additional growth of the clusters obtained by the decoder. This approach offers the significant advantage of enabling soft-output computation without modifying the existing architecture of FPGA-implemented UF decoders. These techniques offer lower computational complexity and higher hardware compatibility, laying a crucial foundation for future real-time decoders with soft outputs.
Related papers
- FPGA-tailored algorithms for real-time decoding of quantum LDPC codes [1.213715600410032]
We analyze FPGA-tailored versions of three decoder classes for quantum low-density parity-check (qLDPC) codes.<n>For message passing, we analyze the recently introduced Relay decoder and its FPGA implementation.<n>For ordered statistics decoding, we introduce a filtered variant that concentrates on high-likelihood fault locations.<n>We design an FPGA-adapted generalized union-find decoder.
arXiv Detail & Related papers (2025-11-26T18:33:47Z) - Fast correlated decoding of transversal logical algorithms [67.01652927671279]
Quantum error correction (QEC) is required for large-scale computation, but incurs a significant resource overhead.<n>Recent advances have shown that by jointly decoding logical qubits in algorithms composed of logical gates, the number of syndrome extraction rounds can be reduced.<n>Here, we reform the problem of decoding circuits by directly decoding relevant logical operator products as they propagate through the circuit.
arXiv Detail & Related papers (2025-05-19T18:00:00Z) - Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by at least 224 times on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z) - Highly Efficient Parallel Row-Layered Min-Sum MDPC Decoder for McEliece Cryptosystem [6.583725235299022]
The medium-density parity-check (MDPC) code-based McEliece cryptosystem remains a finalist of the post-quantum cryptography standard.
The Min-sum decoding algorithm achieves better performance-complexity tradeoff than other algorithms for MDPC codes.
For the first time, the row-layered scheduling scheme is exploited to substantially reduce the memory requirement of MDPC decoders.
arXiv Detail & Related papers (2024-07-17T16:19:42Z) - Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks [53.550782959908524]
We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks.
Our method, prompt-in-decoder (PiD), encodes the input once and decodes the output in parallel, boosting both training and inference efficiency.
arXiv Detail & Related papers (2024-03-19T19:27:23Z) - Extreme Encoder Output Frame Rate Reduction: Improving Computational
Latencies of Large End-to-End Models [59.57732929473519]
We apply multiple frame reduction layers in the encoder to compress encoder outputs into a small number of output frames.
We demonstrate that we can generate one encoder output frame for every 2.56 sec of input speech, without significantly affecting word error rate on a large-scale voice search task.
arXiv Detail & Related papers (2024-02-27T03:40:44Z) - You Need Multiple Exiting: Dynamic Early Exiting for Accelerating
Unified Vision Language Model [37.24203191658052]
Large-scale Transformer models bring significant improvements for various downstream vision language tasks with a unified architecture.
Performance improvements come with increasing model size, resulting in slow inference speed and increased cost for severing.
We propose a novel early exiting strategy for unified visual language models, which allows dynamically skip the layers in encoder and decoder simultaneously.
arXiv Detail & Related papers (2022-11-21T02:32:25Z) - Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix
Factorization [60.91600465922932]
We present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder.
Our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods.
arXiv Detail & Related papers (2022-10-23T00:32:04Z) - EfficientFCN: Holistically-guided Decoding for Semantic Segmentation [49.27021844132522]
State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN)
We propose the EfficientFCN, whose backbone is a common ImageNet pre-trained network without any dilated convolution.
Such a framework achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost.
arXiv Detail & Related papers (2020-08-24T14:48:23Z) - Communication-Efficient Gradient Coding for Straggler Mitigation in
Distributed Learning [17.454251607446555]
Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, need to overcome two limitations.
Ye and Abbe [ICML 2018] proposed a coding-theoretic paradigm to characterize a fundamental trade-off between computation load per worker, communication overhead per worker, and straggler tolerance.
We develop a communication-efficient gradient coding framework to overcome these drawbacks.
arXiv Detail & Related papers (2020-05-14T17:57:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.