Turbo-ICL: In-Context Learning-Based Turbo Equalization
- URL: http://arxiv.org/abs/2505.06175v1
- Date: Fri, 09 May 2025 16:29:29 GMT
- Title: Turbo-ICL: In-Context Learning-Based Turbo Equalization
- Authors: Zihang Song, Matteo Zecchin, Bipin Rajendran, Osvaldo Simeone
- Abstract summary: This paper introduces a novel in-context learning framework, inspired by large language models (LLMs). It learns to infer posterior symbol distributions directly from a prompt of pilot signals and decoder feedback. Two model variants, based on Transformer and state-space architectures, are developed and evaluated.
- Score: 30.918855018647385
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper introduces a novel in-context learning (ICL) framework, inspired by large language models (LLMs), for soft-input soft-output channel equalization in coded multiple-input multiple-output (MIMO) systems. The proposed approach learns to infer posterior symbol distributions directly from a prompt of pilot signals and decoder feedback. A key innovation is the use of prompt augmentation to incorporate extrinsic information from the decoder output as additional context, enabling the ICL model to refine its symbol estimates iteratively across turbo decoding iterations. Two model variants, based on Transformer and state-space architectures, are developed and evaluated. Extensive simulations demonstrate that, when traditional linear assumptions break down, e.g., in the presence of low-resolution quantization, ICL equalizers consistently outperform conventional model-based baselines, even when the latter are provided with perfect channel state information. Results also highlight the advantage of Transformer-based models under limited training diversity, as well as the efficiency of state-space models in resource-constrained scenarios.
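As a rough, non-authoritative illustration of the prompt-augmentation idea in the abstract, the sketch below builds a prompt from channel observations plus per-symbol priors (standing in for decoder extrinsic information) and lets a small Transformer emit posterior symbol distributions. Every layer size, the token embedding scheme, and all names (e.g., `ICLEqualizer`) are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of ICL-based equalization with prompt augmentation.
# All architecture choices and names are illustrative assumptions.
import torch
import torch.nn as nn

class ICLEqualizer(nn.Module):
    def __init__(self, obs_dim: int, num_symbols: int, d_model: int = 64):
        super().__init__()
        # One token per observation; a per-token symbol prior (from decoder
        # extrinsics) is appended to the features -- the "prompt augmentation".
        self.embed = nn.Linear(obs_dim + num_symbols, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_symbols)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq, obs_dim + num_symbols) -> posterior log-probs
        h = self.backbone(self.embed(tokens))
        return torch.log_softmax(self.head(h), dim=-1)

# Toy turbo loop: posteriors feed back into the prompt as refined priors.
batch, n_pilot, n_data, obs_dim, num_symbols = 1, 8, 4, 2, 4
eq = ICLEqualizer(obs_dim, num_symbols).eval()
obs = torch.randn(batch, n_pilot + n_data, obs_dim)
prior = torch.full((batch, n_pilot + n_data, num_symbols), 1.0 / num_symbols)
for _ in range(3):  # turbo iterations
    post = eq(torch.cat([obs, prior], dim=-1))[:, n_pilot:]
    prior[:, n_pilot:] = post.exp().detach()  # stand-in for decoder extrinsics
```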
Related papers
- Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models [50.19188692497892]
Traditional alignment methods often require retraining large pretrained models. We propose a novel Residual Alignment Model (RAM) that formalizes the alignment process as a type of importance sampling. We develop a resampling algorithm with iterative token-level decoding to address the common first-token latency issue in comparable methods.
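As a loose illustration of resampling by importance weights at decoding time (the weight definition, names, and candidates here are assumptions based only on the summary above, not RAM's actual algorithm):

```python
# Hedged sketch: resample candidate continuations with importance weights
# given by the ratio of alignment-module to base-model likelihoods.
import math
import random

def resample(candidates, base_logp, align_logp, k=1):
    # Importance weight per candidate: exp(align_logp - base_logp).
    weights = [math.exp(a - b) for a, b in zip(align_logp, base_logp)]
    return random.choices(candidates, weights=weights, k=k)

print(resample(["good", "bad", "ok"], [-1.0, -2.0, -0.5], [-0.5, -2.5, -1.5]))
```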
arXiv Detail & Related papers (2025-05-26T08:53:02Z) - Scalable Language Models with Posterior Inference of Latent Thought Vectors [52.63299874322121]
Latent-Thought Language Models (LTMs) incorporate explicit latent thought vectors that follow an explicit prior model in latent space. LTMs possess additional scaling dimensions beyond traditional LLMs, yielding a structured design space. LTMs significantly outperform conventional autoregressive models and discrete diffusion models in validation perplexity and zero-shot language modeling.
arXiv Detail & Related papers (2025-02-03T17:50:34Z) - Emergence of Abstractions: Concept Encoding and Decoding Mechanism for In-Context Learning in Transformers [18.077009146950473]
Autoregressive transformers exhibit adaptive learning through in-context learning (ICL). We propose a concept encoding-decoding mechanism to explain ICL by studying how transformers form and use internal abstractions in their representations. Our empirical insights shed light on the success and failure modes of large language models via their representations.
arXiv Detail & Related papers (2024-12-16T19:00:18Z) - The Graph's Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation [34.37154877681809]
This work proposes augmenting large language models (LLMs) with predictor networks trained to estimate circuit quality directly from HDL code. To enhance performance, the model is regularized using embeddings from graph neural networks (GNNs) trained on Look-Up Table (LUT) graphs. The proposed method demonstrates superior performance compared to existing graph-based RTL-level estimation techniques on the established benchmark OpenABCD.
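A minimal sketch of the regularization idea in that summary: the predictor's loss on a circuit-quality target is augmented with a term that pulls the LLM-side embedding toward a (precomputed) GNN embedding of the same design. Function and variable names are illustrative assumptions.

```python
# Hedged sketch of GNN-embedding regularization for a quality predictor.
import torch
import torch.nn.functional as F

def regularized_loss(pred, target, llm_emb, gnn_emb, lam=0.1):
    # Quality-prediction loss plus embedding-alignment regularizer.
    return F.mse_loss(pred, target) + lam * F.mse_loss(llm_emb, gnn_emb)

loss = regularized_loss(torch.randn(8, 1), torch.randn(8, 1),
                        torch.randn(8, 32), torch.randn(8, 32))
```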
arXiv Detail & Related papers (2024-10-30T04:20:10Z) - SOLO: A Single Transformer for Scalable Vision-Language Modeling [74.05173379908703]
We present SOLO, a single transformer for visiOn-Language mOdeling. A unified single-Transformer architecture such as SOLO effectively addresses the scalability concerns of large vision-language models (LVLMs). In this paper, we introduce the first open-source training recipe for developing SOLO, an open-source 7B LVLM.
arXiv Detail & Related papers (2024-07-08T22:40:15Z) - Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z) - Streaming Multi-Talker ASR with Token-Level Serialized Output Training [53.11450530896623]
t-SOT is a novel framework for streaming multi-talker automatic speech recognition.
The t-SOT model has the advantages of lower inference cost and a simpler model architecture.
For non-overlapping speech, the t-SOT model is on par with a single-talker ASR model in terms of both accuracy and computational cost.
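A minimal sketch of token-level serialization for overlapping speakers, assuming time-stamped tokens and an illustrative channel-change marker (the exact token inventory is an assumption):

```python
# Hedged sketch: merge per-speaker token streams by start time, inserting a
# channel-change marker whenever the active speaker flips.
CC = "<cc>"

def serialize(streams):
    # streams: one [(start_time, token), ...] list per speaker
    tagged = sorted(
        (t, spk, tok) for spk, s in enumerate(streams) for t, tok in s
    )
    out, cur = [], None
    for _, spk, tok in tagged:
        if cur is not None and spk != cur:
            out.append(CC)
        out.append(tok)
        cur = spk
    return out

print(serialize([[(0.0, "hi"), (1.0, "there")], [(0.5, "hello")]]))
# -> ['hi', '<cc>', 'hello', '<cc>', 'there']
```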
arXiv Detail & Related papers (2022-02-02T01:27:21Z) - Deliberation of Streaming RNN-Transducer by Non-autoregressive Decoding [21.978994865937786]
The method performs a few refinement steps that share a single transformer decoder attending to both text features and audio features.
We show that, conditioned on hypothesis alignments of a streaming RNN-T model, our method obtains significantly more accurate recognition results than the first-pass RNN-T.
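A hedged sketch of that refinement loop, with one shared transformer decoder attending to concatenated audio and hypothesis features (all dimensions and the feature construction are illustrative assumptions):

```python
# Hedged sketch of non-autoregressive deliberation: a shared decoder refines
# hypothesis features over a few steps, attending to audio and text features.
import torch
import torch.nn as nn

d_model = 32
layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=1)

audio = torch.randn(1, 50, d_model)  # encoded audio frames
text = torch.randn(1, 10, d_model)   # first-pass hypothesis features
with torch.no_grad():
    for _ in range(3):  # refinement steps share the same decoder weights
        text = decoder(text, torch.cat([audio, text], dim=1))
```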
arXiv Detail & Related papers (2021-12-01T01:34:28Z) - Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition [31.2558640840697]
We propose a cross-modal transformer-based neural correction model that refines the output of an automatic speech recognition system.
Experiments on Japanese natural language ASR tasks demonstrated that our proposed models achieve better ASR performance than conventional neural correction models.
arXiv Detail & Related papers (2021-07-04T07:58:31Z) - Non-autoregressive Transformer-based End-to-end ASR using BERT [13.07939371864781]
This paper presents a transformer-based end-to-end automatic speech recognition (ASR) model based on BERT.
A series of experiments conducted on the AISHELL-1 dataset demonstrates competitive or superior results.
arXiv Detail & Related papers (2021-04-10T16:22:17Z) - Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation [11.52842516726486]
We propose a Transformer-based ASR model that incorporates a time-reduction layer inside the transformer encoder layers.
We also introduce a fine-tuning approach for pre-trained ASR models using self-knowledge distillation (S-KD), which further improves the performance of our ASR model.
With language model (LM) fusion, we achieve new state-of-the-art word error rate (WER) results for Transformer-based ASR models.
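As a small illustration of the time-reduction idea, stacking every r consecutive frames into one shortens the sequence seen by later encoder layers (a sketch under assumed shapes, not the paper's exact layer):

```python
# Hedged sketch of a time-reduction layer: stack every r consecutive frames
# into one, shortening the sequence processed by subsequent encoder layers.
import torch

def time_reduce(x: torch.Tensor, r: int = 2) -> torch.Tensor:
    # x: (batch, time, feat) -> (batch, time // r, feat * r)
    b, t, f = x.shape
    t = (t // r) * r  # drop trailing frames that do not fill a group
    return x[:, :t].reshape(b, t // r, f * r)

print(time_reduce(torch.randn(1, 10, 4)).shape)  # torch.Size([1, 5, 8])
```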
arXiv Detail & Related papers (2021-03-17T21:02:36Z) - Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
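One common way to realize a discrete latent bottleneck is vector quantization; the sketch below (codebook size and dimensions are assumptions, not necessarily this paper's construction) snaps each encoder output to its nearest codebook entry:

```python
# Hedged sketch of a discrete latent bottleneck via vector quantization.
import torch

codebook = torch.randn(16, 8)  # 16 discrete codes of dimension 8
z = torch.randn(4, 8)          # continuous encoder outputs
idx = torch.cdist(z, codebook).argmin(dim=1)
z_q = codebook[idx]            # quantized latents passed to the decoder
```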
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.