Generalization Bounds for Transformer Channel Decoders
- URL: http://arxiv.org/abs/2601.06969v1
- Date: Sun, 11 Jan 2026 15:56:37 GMT
- Title: Generalization Bounds for Transformer Channel Decoders
- Authors: Qinshan Zhang, Bin Chen, Yong Jiang, Shu-Tao Xia
- Abstract summary: This paper studies the generalization performance of ECCT from a learning-theoretic perspective. To the best of our knowledge, this work provides the first theoretical generalization guarantees for this class of decoders.
- Score: 61.55280736553095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer channel decoders, such as the Error Correction Code Transformer (ECCT), have shown strong empirical performance in channel decoding, yet their generalization behavior remains theoretically unclear. This paper studies the generalization performance of ECCT from a learning-theoretic perspective. By establishing a connection between multiplicative noise estimation errors and bit-error-rate (BER), we derive an upper bound on the generalization gap via bit-wise Rademacher complexity. The resulting bound characterizes the dependence on code length, model parameters, and training set size, and applies to both single-layer and multi-layer ECCTs. We further show that parity-check-based masked attention induces sparsity that reduces the covering number, leading to a tighter generalization bound. To the best of our knowledge, this work provides the first theoretical generalization guarantees for this class of decoders.
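The parity-check-based masked attention the abstract refers to can be illustrated with a minimal sketch. In the simplified construction below (assuming a binary parity-check matrix H of shape (n-k, n), and omitting the syndrome positions ECCT also attends over), two bit positions may attend to each other only if they share a parity check; `build_pc_mask` is a hypothetical helper name, not code from the paper.

```python
import numpy as np

def build_pc_mask(H: np.ndarray) -> np.ndarray:
    """Build a boolean self-attention mask from a parity-check matrix.

    H has shape (n - k, n); entry H[c, i] = 1 means bit i participates
    in check c. Positions i and j may attend to each other iff they
    share at least one parity check (each position attends to itself).
    """
    n = H.shape[1]
    # adjacency[i, j] is True iff bits i and j appear in a common check
    adjacency = (H.T @ H) > 0
    mask = adjacency | np.eye(n, dtype=bool)
    return mask  # True = attention allowed, False = masked out

# Example: (7, 4) Hamming code parity-check matrix
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
mask = build_pc_mask(H)
print(mask.sum(), "of", mask.size, "attention entries kept")
```

The fraction of unmasked entries tracks the density of H; this is the sparsity the abstract says reduces the covering number and tightens the generalization bound.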
Related papers
- The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs [20.468614667204093]
We propose to introduce shape-gain decomposition, widely used in classical speech/audio coding, into the NAC framework. The proposed Equalizer methodology decomposes the input signal, before the NAC encoder, into a gain and a normalized shape vector on a short-term basis. Our experiments on speech signals show that this general methodology, easily applicable to any NAC, enables a substantial gain in rate-distortion performance.
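As a rough illustration of the decomposition described above, the sketch below splits a signal into per-frame gains and unit-norm shape vectors before any encoder is applied; the frame length, epsilon guard, and function names are assumptions for illustration, not the Equalizer's exact configuration.

```python
import numpy as np

def shape_gain_decompose(x: np.ndarray, frame_len: int = 160):
    """Split a 1-D signal into per-frame gains and unit-norm shapes."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    gains = np.linalg.norm(frames, axis=1)               # one gain per frame
    shapes = frames / np.maximum(gains[:, None], 1e-12)  # unit-norm shapes
    return gains, shapes

def shape_gain_reconstruct(gains, shapes):
    """Invert the decomposition: scale each shape back by its gain."""
    return (shapes * gains[:, None]).ravel()

x = np.random.randn(16000)          # 1 s of audio at 16 kHz (toy example)
g, s = shape_gain_decompose(x)
x_hat = shape_gain_reconstruct(g, s)
assert np.allclose(x[: len(x_hat)], x_hat)
```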
arXiv Detail & Related papers (2026-02-17T10:59:33Z)
- Joint Source-Channel-Generation Coding: From Distortion-oriented Reconstruction to Semantic-consistent Generation [58.67925548779465]
We propose Joint Source-Channel-Generation Coding (JSCGC), a novel paradigm that shifts the focus from perceptual reconstruction to probabilistic generation. JSCGC substantially improves semantic quality and semantic fidelity, significantly outperforming conventional distortion-oriented JSCC methods.
arXiv Detail & Related papers (2026-01-19T08:12:47Z)
- Fast correlated decoding of transversal logical algorithms [67.01652927671279]
Quantum error correction (QEC) is required for large-scale computation, but incurs a significant resource overhead. Recent advances have shown that by jointly decoding logical qubits in algorithms composed of logical gates, the number of syndrome extraction rounds can be reduced. Here, we reformulate the problem of decoding these circuits by directly decoding relevant logical operator products as they propagate through the circuit.
arXiv Detail & Related papers (2025-05-19T18:00:00Z)
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them.
This paper tackles this gap by conceptualizing the decoding problem via a Markov chain abstraction and studying its key properties, output quality and inference acceleration, from a theoretical perspective.
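The draft-and-validate loop underlying speculative decoding can be sketched as follows. The acceptance rule is the standard rejection-sampling test from the speculative decoding literature, which preserves the target model's output distribution; the function and variable names are stand-ins, not this paper's notation.

```python
import numpy as np

def speculative_step(draft_probs, target_probs, rng):
    """Accept or reject one draft token via the standard rejection test.

    draft_probs / target_probs: distributions over the vocabulary from
    the small (draft) and large (target) models at the same position.
    Returns the emitted token id.
    """
    token = rng.choice(len(draft_probs), p=draft_probs)   # sample a draft token
    accept_prob = min(1.0, target_probs[token] / draft_probs[token])
    if rng.random() < accept_prob:
        return token                                       # keep the draft token
    # On rejection, resample from the normalized residual distribution,
    # which makes the overall output match the target model exactly.
    residual = np.maximum(target_probs - draft_probs, 0.0)
    residual /= residual.sum()
    return rng.choice(len(residual), p=residual)

rng = np.random.default_rng(0)
p_draft = np.array([0.6, 0.3, 0.1])
p_target = np.array([0.2, 0.5, 0.3])
print(speculative_step(p_draft, p_target, rng))
```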
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
- Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic-operation energy consumption by a factor of at least 224 on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z)
- Generalization Bounds for Neural Belief Propagation Decoders [10.96453955114324]
In this paper, we investigate the generalization capabilities of neural-network-based decoders.
Specifically, the generalization gap of a decoder is the difference between its empirical and expected bit-error rates.
Results are presented for both regular and irregular parity-check matrices.
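In symbols, the generalization gap studied here (and in the ECCT paper above) can be written as below, where f is the decoder, m the number of training samples, and BER(f; z_i) the bit-error rate on training sample z_i; the notation is assumed for illustration.

```latex
% Generalization gap of a decoder f trained on m samples:
% expected bit-error rate minus its empirical average.
\[
  \operatorname{gen}(f)
  \;=\;
  \underbrace{\mathbb{E}\big[\mathrm{BER}(f)\big]}_{\text{expected BER}}
  \;-\;
  \underbrace{\frac{1}{m}\sum_{i=1}^{m}\mathrm{BER}(f; z_i)}_{\text{empirical BER on the training set}}
\]
```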
arXiv Detail & Related papers (2023-05-17T19:56:04Z)
- Error Correction Code Transformer [92.10654749898927]
We propose, for the first time, to extend the Transformer architecture to the soft decoding of linear codes at arbitrary block lengths.
We encode each channel output dimension to a high dimension for a better representation of the bit information to be processed separately.
The proposed approach demonstrates the extreme power and flexibility of Transformers and outperforms existing state-of-the-art neural decoders by large margins at a fraction of their time complexity.
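One minimal way to realize the per-dimension high-dimensional encoding described above is to let each scalar channel output scale its own learned embedding vector, yielding one token per bit; this is a sketch under that assumption, not necessarily the paper's exact lifting.

```python
import numpy as np

def embed_channel_outputs(y: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Lift each scalar channel output to a learned high-dim vector.

    y: length-n vector of channel outputs (one scalar per bit).
    W: (n, d) matrix of learned per-position embedding vectors.
    Returns an (n, d) sequence: token i represents bit i, so the
    transformer layers can process each bit's information separately.
    """
    return y[:, None] * W  # each scalar scales its own embedding row

n, d = 7, 32
rng = np.random.default_rng(0)
y = rng.standard_normal(n)           # noisy channel outputs
W = rng.standard_normal((n, d))      # stand-in for learned embeddings
tokens = embed_channel_outputs(y, W)
print(tokens.shape)                   # (7, 32): one token per bit
```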
arXiv Detail & Related papers (2022-03-27T15:25:58Z)
- Infomax Neural Joint Source-Channel Coding via Adversarial Bit Flip [41.28049430114734]
We propose a novel regularization method called Infomax Adversarial-Bit-Flip (IABF) to improve the stability and robustness of the neural joint source-channel coding scheme.
Our IABF achieves state-of-the-art performance on both compression and error-correction benchmarks and outperforms the baselines by a significant margin.
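The abstract names adversarial bit flipping as the regularizer but gives no mechanism here, so the sketch below shows one generic form such a regularizer can take: flip the single coded bit whose flip most increases the decoder's loss, then train on the perturbed codeword. This illustrates the general idea only, not IABF's exact scheme.

```python
import numpy as np

def worst_case_bit_flip(bits: np.ndarray, loss_fn) -> np.ndarray:
    """Return the codeword with the single most damaging bit flipped.

    bits: binary integer vector (a coded message); loss_fn maps a
    codeword to the decoder's scalar loss. A brute-force search over
    single flips stands in for a gradient-based inner step.
    """
    worst_loss, worst_bits = -np.inf, bits
    for i in range(len(bits)):
        flipped = bits.copy()
        flipped[i] ^= 1                      # flip bit i
        l = loss_fn(flipped)
        if l > worst_loss:
            worst_loss, worst_bits = l, flipped
    return worst_bits

# Toy loss: distance from a fixed target codeword
target = np.array([1, 0, 1, 1, 0])
loss = lambda b: np.abs(b - target).sum()
print(worst_case_bit_flip(target.copy(), loss))  # one bit flipped away
```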
arXiv Detail & Related papers (2020-04-03T10:00:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.