VQ-DeepISC: Vector Quantized-Enabled Digital Semantic Communication with Channel Adaptive Image Transmission
- URL: http://arxiv.org/abs/2508.03740v1
- Date: Fri, 01 Aug 2025 02:35:34 GMT
- Title: VQ-DeepISC: Vector Quantized-Enabled Digital Semantic Communication with Channel Adaptive Image Transmission
- Authors: Jianqiao Chen, Tingting Zhu, Huishi Song, Nan Ma, Xiaodong Xu,
- Abstract summary: Discretization of semantic features enables interoperability between semantic and digital communication systems.<n>We propose a vector quantized (VQ)-enabled digital semantic communication system with channel adaptive image transmission.
- Score: 8.858565507331395
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Discretization of semantic features enables interoperability between semantic and digital communication systems, showing significant potential for practical applications. The fundamental difficulty in digitizing semantic features stems from the need to preserve continuity and context in inherently analog representations during their compression into discrete symbols while ensuring robustness to channel degradation. In this paper, we propose a vector quantized (VQ)-enabled digital semantic communication system with channel adaptive image transmission, named VQ-DeepISC. Guided by deep joint source-channel coding (DJSCC), we first design a Swin Transformer backbone for hierarchical semantic feature extraction, followed by VQ modules projecting features into discrete latent spaces. Consequently, it enables efficient index-based transmission instead of raw feature transmission. To further optimize this process, we develop an attention mechanism-driven channel adaptation module to dynamically optimize index transmission. Secondly, to counteract codebook collapse during training process, we impose a distributional regularization by minimizing the Kullback-Leibler divergence (KLD) between codeword usage frequencies and a uniform prior. Meanwhile, exponential moving average (EMA) is employed to stabilize training and ensure balanced feature coverage during codebook updates. Finally, digital communication is implemented using quadrature phase shift keying (QPSK) modulation alongside orthogonal frequency division multiplexing (OFDM), adhering to the IEEE 802.11a standard. Experimental results demonstrate superior reconstruction fidelity of the proposed system over benchmark methods.
Related papers
- Modeling and Performance Analysis for Semantic Communications Based on Empirical Results [53.805458017074294]
We propose an Alpha-Beta-Gamma (ABG) formula to model the relationship between the end-to-end measurement and SNR.<n>For image reconstruction tasks, the proposed ABG formula can well fit the commonly used DL networks, such as SCUNet, and Vision Transformer.<n>To the best of our knowledge, this is the first theoretical expression between end-to-end performance metrics and SNR for semantic communications.
arXiv Detail & Related papers (2025-04-29T06:07:50Z) - Vision Transformer-based Semantic Communications With Importance-Aware Quantization [13.328970689723096]
This paper presents a vision transformer (ViT)-based semantic communication system with importance-aware quantization (IAQ) for wireless image transmission.<n>We show that our IAQ framework outperforms conventional image compression methods in both error-free and realistic communication scenarios.
arXiv Detail & Related papers (2024-12-08T19:24:47Z) - VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing [81.32613443072441]
For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired.<n>We propose a method called Quantized Contrastive Token-Acoustic Pre-training (VQ-CTAP), which uses the cross-modal sequence transcoder to bring text and speech into a joint space.
arXiv Detail & Related papers (2024-08-11T12:24:23Z) - Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints [66.63250537475973]
This paper introduces a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative model.<n>Our experimental results demonstrate significant improvements in pixel-level metrics like peak signal to noise ratio (PSNR) and semantic metrics like learned perceptual image patch similarity (LPIPS)
arXiv Detail & Related papers (2024-07-26T02:34:25Z) - MOC-RVQ: Multilevel Codebook-Assisted Digital Generative Semantic Communication [43.17888320268593]
We propose a multilevel generative semantic communication system with a two-stage training framework.
In the first stage, we train a high-quality codebook, using a multi-head octonary codebook to compress the index range.
In the second stage, a noise reduction block (NRB) based on Swin Transformer is introduced, serving as a high-quality semantic knowledge base.
arXiv Detail & Related papers (2024-01-02T16:17:43Z) - Communication-Efficient Framework for Distributed Image Semantic
Wireless Transmission [68.69108124451263]
Federated learning-based semantic communication (FLSC) framework for multi-task distributed image transmission with IoT devices.
Each link is composed of a hierarchical vision transformer (HVT)-based extractor and a task-adaptive translator.
Channel state information-based multiple-input multiple-output transmission module designed to combat channel fading and noise.
arXiv Detail & Related papers (2023-08-07T16:32:14Z) - Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z) - Vector Quantized Semantic Communication System [22.579525825992416]
We develop a deep learning-enabled vector quantized (VQ) semantic communication system for image transmission, named VQ-DeepSC.
Specifically, we propose a CNN-based transceiver to extract multi-scale semantic features of images and introduce multi-scale semantic embedding spaces.
We employ adversarial training to improve the quality of received images by introducing a PatchGAN discriminator.
arXiv Detail & Related papers (2022-09-23T10:58:23Z) - DeepJSCC-Q: Constellation Constrained Deep Joint Source-Channel Coding [6.55705721360334]
We show that DeepJSCC-Q can achieve similar performance to prior works that allow any complex valued channel input.
DeepJSCC-Q preserves the graceful degradation of image quality in unpredictable channel conditions.
arXiv Detail & Related papers (2022-06-16T11:43:50Z) - Nonlinear Transform Source-Channel Coding for Semantic Communications [7.81628437543759]
We propose a new class of high-efficient deep joint source-channel coding methods that can closely adapt to the source distribution under the nonlinear transform.
Our model incorporates the nonlinear transform as a strong prior to effectively extract the source semantic features.
Notably, the proposed NTSCC method can potentially support future semantic communications due to its vigorous content-aware ability.
arXiv Detail & Related papers (2021-12-21T03:30:46Z) - Volumetric Transformer Networks [88.85542905676712]
We introduce a learnable module, the volumetric transformer network (VTN)
VTN predicts channel-wise warping fields so as to reconfigure intermediate CNN features spatially and channel-wisely.
Our experiments show that VTN consistently boosts the features' representation power and consequently the networks' accuracy on fine-grained image recognition and instance-level image retrieval.
arXiv Detail & Related papers (2020-07-18T14:00:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.