CSI-4CAST: A Hybrid Deep Learning Model for CSI Prediction with Comprehensive Robustness and Generalization Testing
- URL: http://arxiv.org/abs/2510.12996v1
- Date: Tue, 14 Oct 2025 21:19:52 GMT
- Title: CSI-4CAST: A Hybrid Deep Learning Model for CSI Prediction with Comprehensive Robustness and Generalization Testing
- Authors: Sikai Cheng, Reza Zandehshahvar, Haoruo Zhao, Daniel A. Garcia-Ulloa, Alejandro Villena-Rodriguez, Carles Navarro Manchón, Pascal Van Hentenryck
- Abstract summary: This paper introduces CSI-4CAST, a hybrid deep learning architecture that integrates 4 key components, i.e., Convolutional neural network residuals, Adaptive correction layers, ShuffleNet blocks, and Transformers. The dataset spans multiple channel models, a wide range of delay spreads and user velocities, and diverse noise types and intensity degrees. Experimental results show that CSI-4CAST achieves superior prediction accuracy with substantially lower computational cost.
- Score: 44.045995554758385
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Channel state information (CSI) prediction is a promising strategy for ensuring reliable and efficient operation of massive multiple-input multiple-output (mMIMO) systems by providing timely downlink (DL) CSI. While deep learning-based methods have advanced beyond conventional model-driven and statistical approaches, they remain limited in robustness to practical non-Gaussian noise, generalization across diverse channel conditions, and computational efficiency. This paper introduces CSI-4CAST, a hybrid deep learning architecture that integrates 4 key components, i.e., Convolutional neural network residuals, Adaptive correction layers, ShuffleNet blocks, and Transformers, to efficiently capture both local and long-range dependencies in CSI prediction. To enable rigorous evaluation, this work further presents a comprehensive benchmark, CSI-RRG for Regular, Robustness and Generalization testing, which includes more than 300,000 samples across 3,060 realistic scenarios for both TDD and FDD systems. The dataset spans multiple channel models, a wide range of delay spreads and user velocities, and diverse noise types and intensity degrees. Experimental results show that CSI-4CAST achieves superior prediction accuracy with substantially lower computational cost, outperforming baselines in 88.9% of TDD scenarios and 43.8% of FDD scenarios, the best performance among all evaluated models, while reducing FLOPs by 5x and 3x compared to LLM4CP, the strongest baseline. In addition, evaluation over CSI-RRG provides valuable insights into how different channel factors affect the performance and generalization capability of deep learning models. Both the dataset (https://huggingface.co/CSI-4CAST) and evaluation protocols (https://github.com/AI4OPT/CSI-4CAST) are publicly released to establish a standardized benchmark and to encourage further research on robust and efficient CSI prediction.
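A minimal sketch of how a per-scenario comparison like the one reported in the abstract could be tallied. It assumes normalized mean squared error (NMSE) in dB as the accuracy metric and counts a scenario as a win only when the model beats every baseline; both choices are assumptions for illustration, not the paper's confirmed protocol, and the function names `nmse_db` and `win_rate` are hypothetical.

```python
import math


def nmse_db(pred, target):
    """NMSE in dB: 10 * log10(sum|pred - target|^2 / sum|target|^2).

    Works on flat sequences of real or complex channel coefficients.
    """
    err = sum(abs(p - t) ** 2 for p, t in zip(pred, target))
    ref = sum(abs(t) ** 2 for t in target)
    return 10.0 * math.log10(err / ref)


def win_rate(model_nmse, baseline_nmse):
    """Fraction of scenarios where the model's NMSE is strictly lower
    than that of every baseline (assumed definition of "outperforming").

    model_nmse:    one NMSE value per scenario
    baseline_nmse: one list of baseline NMSE values per scenario
    """
    wins = sum(1 for m, row in zip(model_nmse, baseline_nmse) if m < min(row))
    return wins / len(model_nmse)


# Toy example: a prediction that is 10% too large in amplitude.
target = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
pred = [1.1 * t for t in target]
print(round(nmse_db(pred, target), 3))

# Toy example: three scenarios, two baselines each (values in dB).
print(win_rate([-20.0, -10.0, -15.0],
               [[-18.0, -19.0], [-12.0, -9.0], [-14.0, -15.0]]))
```

Lower NMSE is better, so a tie with the best baseline is not counted as a win here; a benchmark could reasonably choose a different tie rule.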
Related papers
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis [50.20709408241935]
This paper proposes inspecting the fully data-driven DeepSIC detection within a Network-of-MLPs architecture. Within such an architecture, DeepSIC can be upgraded as a graph-based message-passing process using Graph Neural Networks (GNNs). GNNSIC achieves excellent expressivity comparable to DeepSIC with substantially fewer trainable parameters.
arXiv Detail & Related papers (2026-02-13T04:38:51Z) - HeterCSI: Channel-Adaptive Heterogeneous CSI Pretraining Framework for Generalized Wireless Foundation Models [24.285127409979342]
HeterCSI is a channel-adaptive pretraining framework that reconciles training efficiency with robust cross-scenario generalization. HeterCSI achieves superior average performance over full-shot baselines. Compared to the state-of-the-art benchmark WiFo, it reduces NMSE by 7.19 dB, 4.08 dB, and 5.27 dB for CSI reconstruction, time-domain, and frequency-domain prediction, respectively.
arXiv Detail & Related papers (2026-01-26T06:35:48Z) - Generative MIMO Beam Map Construction for Location Recovery and Beam Tracking [67.65578956523403]
This paper proposes a generative framework to recover location labels directly from sparse channel state information (CSI) measurements. Instead of directly storing raw CSI, we learn a compact low-dimensional radio map embedding and leverage a generative model to reconstruct the high-dimensional CSI. Numerical experiments demonstrate that the proposed model can improve localization accuracy by over 30% and achieve a 20% capacity gain in non-line-of-sight (NLOS) scenarios.
arXiv Detail & Related papers (2025-11-21T07:25:49Z) - Green Learning for STAR-RIS mmWave Systems with Implicit CSI [53.03358325565645]
A green learning (GL)-based precoding framework is proposed for simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided millimeter-wave (mmWave) broadcasting systems. Motivated by the emphasis on environmental sustainability in future 6G networks, this work adopts a transmission framework for scenarios where multiple users share identical information, improving spectral efficiency and reducing redundant transmissions and power consumption.
arXiv Detail & Related papers (2025-09-08T15:56:06Z) - Benchmarking Federated Learning for Throughput Prediction in 5G Live Streaming Applications [5.026196568145574]
This paper presents the first comprehensive benchmarking of federated learning strategies for throughput prediction in realistic 5G edge scenarios. It is found that FedBN consistently delivers robust performance under non-IID conditions. LSTM and Transformer models outperform CNN-based baselines by up to 80% in R² scores.
arXiv Detail & Related papers (2025-08-11T21:27:40Z) - Standards-Compliant DM-RS Allocation via Temporal Channel Prediction for Massive MIMO Systems [4.251030047034567]
We introduce the concept of channel prediction-based reference signal allocation (CPRS). CPRS jointly optimizes channel prediction and DM-RS allocation to improve data throughput without requiring CSI feedback. We show up to 36.60% throughput improvement over benchmark strategies.
arXiv Detail & Related papers (2025-07-15T07:56:37Z) - LVM4CSI: Enabling Direct Application of Pre-Trained Large Vision Models for Wireless Channel Tasks [47.223747747750394]
LVM4CSI is a framework that maps complex-valued channel state information to visual formats compatible with computer vision (CV) models. It achieves comparable or superior performance to task-specific neural networks (NNs). It significantly reduces the number of trainable parameters and eliminates the need for task-specific NN design.
arXiv Detail & Related papers (2025-07-07T15:33:55Z) - CSI-BERT2: A BERT-inspired Framework for Efficient CSI Prediction and Classification in Wireless Communication and Sensing [15.607497819907227]
We propose a unified framework named CSI-BERT2 for CSI prediction and classification tasks. We introduce a two-stage training method that first uses a mask language model (MLM) to enable the model to learn general feature extraction from scarce datasets. We also introduce an adaptive re-weighting layer (ARL) to enhance subcarrier representation and a multi-layer perceptron (MLP) based temporal embedding module.
arXiv Detail & Related papers (2024-12-09T06:44:04Z) - GAQAT: gradient-adaptive quantization-aware training for domain generalization [54.31450550793485]
We propose a novel Gradient-Adaptive Quantization-Aware Training (GAQAT) framework for DG. Our approach begins by identifying the scale-gradient conflict problem in low-precision quantization. Extensive experiments validate the effectiveness of the proposed GAQAT framework.
arXiv Detail & Related papers (2024-12-07T06:07:21Z) - Quantize Once, Train Fast: Allreduce-Compatible Compression with Provable Guarantees [53.950234267704]
We introduce Global-QSGD, an All-reduce gradient-compatible quantization method. We show that it accelerates distributed training by up to 3.51% over baseline quantization methods.
arXiv Detail & Related papers (2023-05-29T21:32:15Z) - Semantic Perturbations with Normalizing Flows for Improved Generalization [62.998818375912506]
We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that our latent adversarial perturbations, which adapt to the classifier throughout its training, are most effective.
arXiv Detail & Related papers (2021-08-18T03:20:00Z) - CLNet: Complex Input Lightweight Neural Network designed for Massive MIMO CSI Feedback [7.63185216082836]
This paper presents a novel neural network, CLNet, tailored for the CSI feedback problem based on the intrinsic properties of CSI.
Experimental results show that CLNet outperforms the state-of-the-art method by an average accuracy improvement of 5.41% in both outdoor and indoor scenarios.
arXiv Detail & Related papers (2021-02-15T12:16:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.