Related papers: VQ-DeepISC: Vector Quantized-Enabled Digital Semantic Communication with Channel Adaptive Image Transmission

URL: http://arxiv.org/abs/2508.03740v1
Date: Fri, 01 Aug 2025 02:35:34 GMT
Title: VQ-DeepISC: Vector Quantized-Enabled Digital Semantic Communication with Channel Adaptive Image Transmission
Authors: Jianqiao Chen, Tingting Zhu, Huishi Song, Nan Ma, Xiaodong Xu,
Abstract summary: Discretization of semantic features enables interoperability between semantic and digital communication systems. We propose a vector quantized (VQ)-enabled digital semantic communication system with channel adaptive image transmission.
Score: 8.858565507331395
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Discretization of semantic features enables interoperability between semantic and digital communication systems, showing significant potential for practical applications. The fundamental difficulty in digitizing semantic features stems from the need to preserve continuity and context in inherently analog representations during their compression into discrete symbols while ensuring robustness to channel degradation. In this paper, we propose a vector quantized (VQ)-enabled digital semantic communication system with channel adaptive image transmission, named VQ-DeepISC. Guided by deep joint source-channel coding (DJSCC), we first design a Swin Transformer backbone for hierarchical semantic feature extraction, followed by VQ modules projecting features into discrete latent spaces. Consequently, it enables efficient index-based transmission instead of raw feature transmission. To further optimize this process, we develop an attention mechanism-driven channel adaptation module to dynamically optimize index transmission. Secondly, to counteract codebook collapse during the training process, we impose a distributional regularization by minimizing the Kullback-Leibler divergence (KLD) between codeword usage frequencies and a uniform prior. Meanwhile, exponential moving average (EMA) is employed to stabilize training and ensure balanced feature coverage during codebook updates. Finally, digital communication is implemented using quadrature phase shift keying (QPSK) modulation alongside orthogonal frequency division multiplexing (OFDM), adhering to the IEEE 802.11a standard. Experimental results demonstrate superior reconstruction fidelity of the proposed system over benchmark methods.
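The codebook regularization described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`nearest_index`, `usage_frequencies`, `kl_to_uniform`, `ema_update`), the toy codebook, the hard-count usage estimate, the KL direction (usage relative to uniform), and the decay value are all assumptions made for illustration. A trainable system would likely need a differentiable (soft) usage estimate instead of hard counts.

```python
import math

def nearest_index(vec, codebook):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vec, codebook[k]))

def usage_frequencies(indices, num_codewords):
    """Empirical frequency with which each codeword index was selected."""
    counts = [0] * num_codewords
    for i in indices:
        counts[i] += 1
    return [c / len(indices) for c in counts]

def kl_to_uniform(freqs):
    """KL(p || u) with u uniform over len(freqs) entries; terms with p = 0 contribute 0."""
    k = len(freqs)
    return sum(p * math.log(p * k) for p in freqs if p > 0)

def ema_update(old, batch_mean, decay=0.99):
    """Exponential moving average step for one codeword (decay is an assumed value)."""
    return [decay * o + (1 - decay) * m for o, m in zip(old, batch_mean)]

# Toy data (hypothetical): four 2-D codewords, four feature vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0], [-3.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.2], [0.2, 0.1], [1.1, 0.8]]

# Only these indices would be transmitted, not the raw features.
indices = [nearest_index(f, codebook) for f in features]
freqs = usage_frequencies(indices, len(codebook))
penalty = kl_to_uniform(freqs)
```

In this toy case only two of the four codewords are ever selected, so the penalty is positive; it is zero only when every codeword is used equally often, which is the behavior a collapse-countering regularizer is meant to encourage.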

Related papers

Scenario-Adaptive MU-MIMO OFDM Semantic Communication With Asymmetric Neural Network [1.8534178102035817]
We propose a scenario-adaptive MU-MIMO SemCom framework featuring an asymmetric architecture tailored for downlink transmission. At the transmitter, we introduce a scenario-aware semantic encoder that dynamically adjusts feature extraction based on Channel State Information (CSI) and Signal-to-Noise Ratio (SNR). At the receiver, a lightweight decoder equipped with a novel pilot-guided attention mechanism is employed to implicitly perform channel equalization and feature calibration.
arXiv Detail & Related papers (2026-02-14T02:15:25Z)
VQ-DSC-R: Robust Vector Quantized-Enabled Digital Semantic Communication With OFDM Transmission [24.90644167978418]
We develop a robust vector quantized-enabled digital semantic communication (VQ-DSC-R) system built upon orthogonal frequency division multiplexing (OFDM) transmission. Our work encompasses the framework design of VQ-DSC-R, followed by a comprehensive optimization study. Experiments demonstrate the superiority of VQ-DSC-R over benchmark schemes, achieving high compression ratios and robust performance in practical scenarios.
arXiv Detail & Related papers (2026-02-05T02:53:28Z)
Context Video Semantic Transmission with Variable Length and Rate Coding over MIMO Channels [49.624608869195065]
We propose the context video semantic transmission (CVST) framework for wireless video transmission. We learn a context-channel correlation map to explicitly formulate the relationships between feature groups and multiple input multiple output (MIMO) subchannels. We demonstrate substantial performance gains over various standardized separated coding methods and recent wireless video semantic communication approaches.
arXiv Detail & Related papers (2025-12-23T10:48:43Z)
Channel-Aware Vector Quantization for Robust Semantic Communication on Discrete Channels [5.680520767606761]
We propose a channel-aware vector quantization (CAVQ) algorithm within a joint source-channel coding framework, termed VQJSCC. In this framework, semantic features are discretized and directly mapped to modulation constellation symbols, while CAVQ integrates channel transition probabilities into the quantization process. A multi-codebook alignment mechanism is also introduced to handle mismatches between codebook order and modulation order by decomposing the transmission stream into subchannels.
arXiv Detail & Related papers (2025-10-21T13:02:35Z)
Adaptive Source-Channel Coding for Semantic Communications [48.13990936094994]
Semantic communications (SemComs) have emerged as a promising paradigm for joint data and task-oriented transmissions. Current joint source-channel coding (JSCC) in SemComs is not compatible with the existing communication systems. We propose an adaptive source-channel coding scheme for SemComs over parallel Gaussian channels.
arXiv Detail & Related papers (2025-08-11T13:09:54Z)
Modeling and Performance Analysis for Semantic Communications Based on Empirical Results [53.805458017074294]
We propose an Alpha-Beta-Gamma (ABG) formula to model the relationship between the end-to-end measurement and SNR. For image reconstruction tasks, the proposed ABG formula fits commonly used DL networks well, such as SCUNet and Vision Transformer. To the best of our knowledge, this is the first theoretical expression between end-to-end performance metrics and SNR for semantic communications.
arXiv Detail & Related papers (2025-04-29T06:07:50Z)
Vision Transformer-based Semantic Communications With Importance-Aware Quantization [13.328970689723096]
This paper presents a vision transformer (ViT)-based semantic communication system with importance-aware quantization (IAQ) for wireless image transmission. We show that our IAQ framework outperforms conventional image compression methods in both error-free and realistic communication scenarios.
arXiv Detail & Related papers (2024-12-08T19:24:47Z)
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing [81.32613443072441]
For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired. We propose a method called Quantized Contrastive Token-Acoustic Pre-training (VQ-CTAP), which uses the cross-modal sequence transcoder to bring text and speech into a joint space.
arXiv Detail & Related papers (2024-08-11T12:24:23Z)
Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints [66.63250537475973]
This paper introduces a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative models. Our experimental results demonstrate significant improvements in pixel-level metrics like peak signal to noise ratio (PSNR) and semantic metrics like learned perceptual image patch similarity (LPIPS).
arXiv Detail & Related papers (2024-07-26T02:34:25Z)
MOC-RVQ: Multilevel Codebook-Assisted Digital Generative Semantic Communication [43.17888320268593]
We propose a multilevel generative semantic communication system with a two-stage training framework. In the first stage, we train a high-quality codebook, using a multi-head octonary codebook to compress the index range. In the second stage, a noise reduction block (NRB) based on Swin Transformer is introduced, serving as a high-quality semantic knowledge base.
arXiv Detail & Related papers (2024-01-02T16:17:43Z)
Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission [68.69108124451263]
We propose a federated learning-based semantic communication (FLSC) framework for multi-task distributed image transmission with IoT devices. Each link is composed of a hierarchical vision transformer (HVT)-based extractor and a task-adaptive translator. A channel state information-based multiple-input multiple-output transmission module is designed to combat channel fading and noise.
arXiv Detail & Related papers (2023-08-07T16:32:14Z)
Joint Channel Estimation and Feedback with Masked Token Transformers in Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix. The entire encoder-decoder network is utilized for channel compression. Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
Vector Quantized Semantic Communication System [22.579525825992416]
We develop a deep learning-enabled vector quantized (VQ) semantic communication system for image transmission, named VQ-DeepSC. Specifically, we propose a CNN-based transceiver to extract multi-scale semantic features of images and introduce multi-scale semantic embedding spaces. We employ adversarial training to improve the quality of received images by introducing a PatchGAN discriminator.
arXiv Detail & Related papers (2022-09-23T10:58:23Z)
DeepJSCC-Q: Constellation Constrained Deep Joint Source-Channel Coding [6.55705721360334]
We show that DeepJSCC-Q can achieve similar performance to prior works that allow any complex valued channel input. DeepJSCC-Q preserves the graceful degradation of image quality in unpredictable channel conditions.
arXiv Detail & Related papers (2022-06-16T11:43:50Z)
Nonlinear Transform Source-Channel Coding for Semantic Communications [7.81628437543759]
We propose a new class of highly efficient deep joint source-channel coding methods that can closely adapt to the source distribution under the nonlinear transform. Our model incorporates the nonlinear transform as a strong prior to effectively extract the source semantic features. Notably, the proposed NTSCC method can potentially support future semantic communications due to its vigorous content-aware ability.
arXiv Detail & Related papers (2021-12-21T03:30:46Z)
Volumetric Transformer Networks [88.85542905676712]
We introduce a learnable module, the volumetric transformer network (VTN). VTN predicts channel-wise warping fields so as to reconfigure intermediate CNN features spatially and channel-wisely. Our experiments show that VTN consistently boosts the features' representation power and consequently the networks' accuracy on fine-grained image recognition and instance-level image retrieval.
arXiv Detail & Related papers (2020-07-18T14:00:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.