CrossMPT: Cross-attention Message-Passing Transformer for Error Correcting Codes
- URL: http://arxiv.org/abs/2405.01033v2
- Date: Fri, 11 Oct 2024 04:42:49 GMT
- Title: CrossMPT: Cross-attention Message-Passing Transformer for Error Correcting Codes
- Authors: Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim, Yongjune Kim, Jong-Seon No
- Abstract summary: We propose a novel Cross-attention Message-Passing Transformer (CrossMPT).
We show that CrossMPT significantly outperforms existing neural network-based decoders for various code classes.
Notably, CrossMPT achieves this decoding performance improvement while significantly reducing memory usage, complexity, inference time, and training time.
- Abstract: Error correcting codes (ECCs) are indispensable for reliable transmission in communication systems. Recent advancements in deep learning have catalyzed the exploration of ECC decoders based on neural networks. Among these, transformer-based neural decoders have achieved state-of-the-art decoding performance. In this paper, we propose a novel Cross-attention Message-Passing Transformer (CrossMPT), which shares key operational principles with conventional message-passing decoders. While conventional transformer-based decoders employ a self-attention mechanism without distinguishing between the types of input vectors (i.e., magnitude and syndrome vectors), CrossMPT updates the two types of input vectors separately and iteratively using two masked cross-attention blocks. The mask matrices are determined by the code's parity-check matrix, which explicitly captures the irrelevant relationships between the two input vectors. Our experimental results show that CrossMPT significantly outperforms existing neural network-based decoders for various code classes. Notably, CrossMPT achieves this decoding performance improvement while significantly reducing memory usage, complexity, inference time, and training time.
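The two-block architecture described in the abstract can be illustrated concretely. Below is a minimal PyTorch sketch of one CrossMPT-style iteration, assuming a binary parity-check matrix H of shape (n - k, n); the names (pcm_cross_masks, CrossMPTBlock, d_model) and details such as the feed-forward sizes are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

def pcm_cross_masks(H: torch.Tensor):
    """Cross-attention masks from H: magnitude token i may attend to
    syndrome token j only if bit i participates in check j (H[j, i] == 1),
    and vice versa; all other pairs are blocked with -inf."""
    m, n = H.shape  # m = n - k checks, n code bits
    mag_to_syn = torch.full((n, m), float("-inf"))
    mag_to_syn[H.t() > 0] = 0.0
    syn_to_mag = torch.full((m, n), float("-inf"))
    syn_to_mag[H > 0] = 0.0
    return mag_to_syn, syn_to_mag

class CrossMPTBlock(nn.Module):
    """One decoding iteration: magnitude embeddings attend to syndrome
    embeddings, then the syndrome embeddings attend back."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.mag_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.syn_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mag_ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
        self.syn_ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))

    def forward(self, mag, syn, mag_to_syn, syn_to_mag):
        # Update magnitude tokens from syndrome tokens (mask derived from H^T).
        mag = mag + self.mag_attn(mag, syn, syn, attn_mask=mag_to_syn)[0]
        mag = mag + self.mag_ffn(mag)
        # Update syndrome tokens from the refreshed magnitude tokens (mask from H).
        syn = syn + self.syn_attn(syn, mag, mag, attn_mask=syn_to_mag)[0]
        syn = syn + self.syn_ffn(syn)
        return mag, syn

# Toy usage with the (7, 4) Hamming code's parity-check matrix.
H = torch.tensor([[1, 1, 0, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0, 1, 0],
                  [0, 1, 1, 1, 0, 0, 1]], dtype=torch.float32)
m2s, s2m = pcm_cross_masks(H)
block = CrossMPTBlock(d_model=64)
mag = torch.randn(1, 7, 64)  # embeddings of the received magnitudes |y|
syn = torch.randn(1, 3, 64)  # embeddings of the syndrome bits
mag, syn = block(mag, syn, m2s, s2m)
```

Because every row and column of a valid parity-check matrix has at least one nonzero entry, each masked softmax row keeps at least one unmasked position; stacking several such blocks and reading out bit estimates from the magnitude tokens mirrors the iterative message-passing structure the abstract describes.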
Related papers
- On the Design and Performance of Machine Learning Based Error Correcting Decoders [3.8289109929360245]
We first consider the so-called single-label neural network (SLNN) and multi-label neural network (MLNN) decoders, which have been reported to achieve near-maximum-likelihood (ML) performance.
We then turn our attention to two transformer-based decoders: the error correction code transformer (ECCT) and the cross-attention message-passing transformer (CrossMPT).
arXiv Detail & Related papers (2024-10-21T11:23:23Z) - Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by at least 224 times on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z) - Learning Linear Block Error Correction Codes [62.25533750469467]
We propose for the first time a unified encoder-decoder training of binary linear block codes.
We also propose a novel Transformer model in which the self-attention masking is performed in a differentiable fashion for the efficient backpropagation of the code gradient.
arXiv Detail & Related papers (2024-05-07T06:47:12Z) - How to Mask in Error Correction Code Transformer: Systematic and Double Masking [16.90917067964835]
In communication and storage systems, error correction codes (ECCs) are pivotal in ensuring data reliability.
The Error Correction Code Transformer (ECCT) has achieved state-of-the-art performance, outperforming other methods by large margins.
We propose a new masking matrix for ECCT, aiming to improve decoding performance and reduce computational complexity (a simplified sketch of parity-check-based mask construction appears after this list).
arXiv Detail & Related papers (2023-08-16T03:35:52Z) - DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction [40.447092963041236]
We present a novel MTL model that combines the merits of deformable CNNs and query-based Transformers.
Our method, named DeMT, is based on a simple and effective encoder-decoder architecture.
Our model uses fewer GFLOPs and significantly outperforms current Transformer- and CNN-based competitive models.
arXiv Detail & Related papers (2023-01-09T16:00:15Z) - Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization [60.91600465922932]
We present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder.
Our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods.
arXiv Detail & Related papers (2022-10-23T00:32:04Z) - Error Correction Code Transformer [92.10654749898927]
We propose, for the first time, extending the Transformer architecture to the soft decoding of linear codes at arbitrary block lengths.
We encode each channel output into a high-dimensional representation so that the per-bit information can be processed separately.
The proposed approach demonstrates the extreme power and flexibility of Transformers and outperforms existing state-of-the-art neural decoders by large margins at a fraction of their time complexity.
arXiv Detail & Related papers (2022-03-27T15:25:58Z) - Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates [59.678108707409606]
We propose Fast-MD, a fast MD model that generates hidden intermediates (HI) by non-autoregressive decoding based on connectionist temporal classification (CTC) outputs, followed by an ASR decoder.
Fast-MD achieved about 2x and 4x faster decoding than the naïve MD model on GPU and CPU, respectively, with comparable translation quality.
arXiv Detail & Related papers (2021-09-27T05:21:30Z) - Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is a Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into a Transformer-based U-shaped encoder-decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
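As referenced in the "How to Mask in Error Correction Code Transformer" entry above, ECCT and its variants derive a self-attention mask from the code's parity-check matrix. Below is a simplified sketch of the original ECCT-style mask construction; the proposed systematic/double masking is not specified in the summary above and is not reproduced here, and the function name and boolean-mask convention are assumptions for illustration.

```python
import torch

def ecct_style_mask(H: torch.Tensor) -> torch.Tensor:
    """Illustrative self-attention mask over the length-(2n - k) sequence
    [magnitude tokens | syndrome tokens]:
      - a bit token and a syndrome token may interact iff H[j, i] == 1;
      - two bit tokens may interact iff they share at least one check;
      - every token attends to itself.
    Returns a boolean matrix where True marks *blocked* positions,
    matching torch.nn.MultiheadAttention's attn_mask convention."""
    m, n = H.shape                       # m = n - k checks, n code bits
    allowed = torch.zeros(n + m, n + m, dtype=torch.bool)
    allowed[:n, :n] = (H.t() @ H) > 0    # bit pairs sharing a check
    allowed[:n, n:] = H.t() > 0          # bit -> check
    allowed[n:, :n] = H > 0              # check -> bit
    allowed |= torch.eye(n + m, dtype=torch.bool)
    return ~allowed                      # True = masked out
```

Note the contrast with the CrossMPT sketch earlier: here one square mask governs a single concatenated sequence under self-attention, whereas CrossMPT splits the same parity-check structure into two rectangular cross-attention masks.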