MOC-RVQ: Multilevel Codebook-assisted Digital Generative Semantic
Communication
- URL: http://arxiv.org/abs/2401.01272v1
- Date: Tue, 2 Jan 2024 16:17:43 GMT
- Title: MOC-RVQ: Multilevel Codebook-assisted Digital Generative Semantic
Communication
- Authors: Yingbin Zhou, Yaping Sun, Guanying Chen, Xiaodong Xu, Hao Chen,
Binhong Huang, Shuguang Cui, Ping Zhang
- Abstract summary: We propose a multilevel generative semantic communication system with a two-stage training framework.
In the first stage, we train a high-quality codebook, using a multi-head octonary codebook (MOC) to compress the index range.
In the second stage, a noise reduction block (NRB) based on Swin Transformer is introduced, coupled with the multilevel codebook.
- Score: 45.038606603738586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vector quantization-based image semantic communication systems have
successfully boosted transmission efficiency, but face a challenge with
conflicting requirements between codebook design and digital constellation
modulation. Traditional codebooks need a wide index range, while modulation
favors few discrete states. To address this, we propose a multilevel generative
semantic communication system with a two-stage training framework. In the first
stage, we train a high-quality codebook, using a multi-head octonary codebook
(MOC) to compress the index range. We also integrate a residual vector
quantization (RVQ) mechanism for effective multilevel communication. In the
second stage, a noise reduction block (NRB) based on Swin Transformer is
introduced, coupled with the multilevel codebook from the first stage, serving
as a high-quality semantic knowledge base (SKB) for generative feature
restoration. Experimental results highlight MOC-RVQ's superior performance over
methods like BPG or JPEG, even without channel error correction coding.
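The abstract's two key mechanisms can be sketched compactly. The following is a minimal NumPy illustration, not the paper's implementation: it assumes each feature vector is split across H heads, each head quantized against its own 8-entry sub-codebook (so every transmitted index is an octal symbol that maps directly onto an 8-ary constellation), with residual quantization stacked over several levels. All names and shapes are illustrative.

```python
import numpy as np

def moc_quantize(x, codebooks):
    """Quantize one vector with a multi-head octonary codebook (illustrative).

    x         : (H * d,) feature vector, split into H sub-vectors of size d
    codebooks : (H, 8, d) one 8-entry sub-codebook per head, so each
                transmitted index lies in {0..7} and maps directly onto an
                8-ary constellation symbol
    """
    H, K, d = codebooks.shape
    subs = x.reshape(H, d)
    dists = ((subs[:, None, :] - codebooks) ** 2).sum(-1)  # (H, 8)
    idx = dists.argmin(1)                                  # nearest codeword per head
    recon = codebooks[np.arange(H), idx]                   # (H, d)
    return idx, recon.reshape(-1)

def rvq_encode(x, level_codebooks):
    """Residual VQ: each level quantizes what the previous levels missed."""
    residual, all_idx, approx = x.copy(), [], np.zeros_like(x)
    for cb in level_codebooks:          # one multi-head codebook per RVQ level
        idx, recon = moc_quantize(residual, cb)
        all_idx.append(idx)
        approx += recon
        residual -= recon
    return np.stack(all_idx), approx    # (L, H) octal symbols, reconstruction

# toy usage: 4 heads x 16 dims, 3 RVQ levels
rng = np.random.default_rng(0)
levels = [rng.normal(size=(4, 8, 16)) * 0.5 ** l for l in range(3)]
x = rng.normal(size=64)
symbols, x_hat = rvq_encode(x, levels)
print(symbols.shape, np.linalg.norm(x - x_hat))
```

Each added RVQ level shrinks the residual, so reconstruction quality degrades gracefully if later levels are lost to channel noise.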
Related papers
- Visual Language Model based Cross-modal Semantic Communication Systems [42.321208020228894]
We propose a novel Vision-Language Model-based Cross-modal Semantic Communication system.
The VLM-CSC comprises three novel components.
The experimental simulations validate the effectiveness, adaptability, and robustness of the VLM-CSC system.
arXiv Detail & Related papers (2024-05-06T08:59:16Z)
- Analog information decoding of bosonic quantum LDPC codes [3.34006871348377]
We propose novel decoding methods that explicitly exploit the syndrome information obtained from a bosonic qubit readout.
Our results lay the foundation for general decoding algorithms using analog information and show promise for fault-tolerant quantum computation.
arXiv Detail & Related papers (2023-11-02T15:41:03Z)
- Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization [73.52943587514386]
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm.
We propose a novel two-stage framework: (1) Dynamic-Quantization VAE (DQ-VAE), which encodes image regions into variable-length codes based on their information densities for accurate representation.
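DQ-VAE's architecture is not reproduced here; the toy sketch below only illustrates the idea of spending more code budget on information-dense regions, using per-region variance as a stand-in density proxy. The function name and allocation rule are assumptions.

```python
import numpy as np

def allocate_code_lengths(image, grid=8, budget=256):
    """Toy density-based code allocation (illustrative, not DQ-VAE itself).

    Splits the image into grid x grid regions, proxies information density
    by per-region variance, and gives each region a share of the total code
    budget proportional to its density.
    """
    H, W = image.shape
    rh, rw = H // grid, W // grid
    regions = image[:rh * grid, :rw * grid].reshape(grid, rh, grid, rw)
    density = regions.var(axis=(1, 3)).ravel()
    share = density / (density.sum() + 1e-12)
    lengths = np.maximum(1, np.round(share * budget)).astype(int)
    return lengths.reshape(grid, grid)   # codes per region, >= 1 each

img = np.random.default_rng(1).random((64, 64))
print(allocate_code_lengths(img))
```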
arXiv Detail & Related papers (2023-05-19T14:56:05Z)
- A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS [52.51848317549301]
We propose a Multi-Stage, Multi-Codebook (MSMC) approach to high-performance neural TTS synthesis.
A vector-quantized variational autoencoder (VQ-VAE) based feature analyzer is used to encode Mel spectrograms of speech training data into multi-stage multi-codebook representations (MSMCRs).
In synthesis, the neural vocoder converts the predicted MSMCRs into final speech waveforms.
arXiv Detail & Related papers (2022-09-22T09:43:17Z)
- Learning Representations for CSI Adaptive Quantization and Feedback [51.14360605938647]
We propose an efficient method for adaptive quantization and feedback in frequency division duplexing systems.
Existing works mainly focus on the implementation of autoencoder (AE) neural networks for CSI compression.
We propose two methods: one based on post-training quantization, and one in which the codebook is learned during training of the AE.
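A minimal sketch of the first method only, assuming a trained AE latent vector `z` and a simple uniform quantizer fitted after training; the function name and bit width are illustrative, and the second method would instead learn the codebook jointly with the AE.

```python
import numpy as np

def post_training_quantize(z, bits=4):
    """Uniform post-training quantization of a trained AE's latent z.

    The quantizer grid is fitted to the latent's observed range after
    training; the integer indices form the feedback payload.
    """
    levels = 2 ** bits
    lo, hi = z.min(), z.max()
    step = (hi - lo) / (levels - 1)
    idx = np.round((z - lo) / step).astype(int)   # feedback payload
    z_hat = lo + idx * step                       # dequantized latent
    return idx, z_hat

z = np.random.default_rng(2).normal(size=32)      # stand-in CSI latent
idx, z_hat = post_training_quantize(z, bits=4)
print(idx.min(), idx.max(), np.abs(z - z_hat).max())
```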
arXiv Detail & Related papers (2022-07-13T08:52:13Z)
- Tensor Learning-based Precoder Codebooks for FD-MIMO Systems [47.562560779723334]
This paper develops an efficient procedure for designing low-complexity codebooks for precoding in a full-dimension (FD) multiple-input multiple-output (MIMO) system.
We utilize a model-free data-driven approach with foundations in machine learning to generate codebooks that adapt to the surrounding propagation conditions.
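The tensor-learning procedure itself is not reproduced here; purely as a hedged illustration of model-free, data-driven codebook generation, the sketch below clusters observed channel vectors with k-means and normalizes the centroids into unit-norm precoders. Every name and parameter is an assumption for illustration.

```python
import numpy as np

def learn_codebook(channels, K=16, iters=50, seed=3):
    """Generic data-driven codebook via k-means on channel samples
    (a stand-in for the paper's tensor-learning procedure)."""
    rng = np.random.default_rng(seed)
    cb = channels[rng.choice(len(channels), K, replace=False)]
    for _ in range(iters):
        d = ((channels[:, None, :] - cb[None]) ** 2).sum(-1)
        assign = d.argmin(1)                     # nearest centroid per sample
        for k in range(K):
            members = channels[assign == k]
            if len(members):
                cb[k] = members.mean(0)
    # unit-norm centroids serve as precoder codewords adapted to the data
    return cb / np.linalg.norm(cb, axis=1, keepdims=True)

samples = np.random.default_rng(4).normal(size=(500, 8))   # toy channel vectors
print(learn_codebook(samples).shape)
```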
arXiv Detail & Related papers (2021-06-21T19:18:39Z)
- Quantum repeaters based on concatenated bosonic and discrete-variable quantum codes [7.022007590511487]
We propose to encode transmitted qubits in a concatenated code consisting of two levels.
On the first level we use a continuous-variable GKP code encoding the qubit in a single bosonic mode.
On the second level we use a small discrete-variable code.
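For concreteness, a standard idealized form of the square-lattice GKP codewords used on the first level (a textbook expression, not taken from this paper's abstract): the logical states are combs of position eigenstates spaced by \(2\sqrt{\pi}\).

```latex
% Idealized square-lattice GKP codewords in the position representation:
% logical |mu> (mu = 0, 1) is a comb of position eigenstates spaced 2*sqrt(pi)
\[
  |\mu\rangle_{\mathrm{GKP}} \;\propto\; \sum_{n \in \mathbb{Z}}
  \bigl| \hat{q} = (2n + \mu)\sqrt{\pi} \bigr\rangle,
  \qquad \mu \in \{0, 1\}.
\]
```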
arXiv Detail & Related papers (2020-11-30T18:14:39Z)
- Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training [99.42912552638168]
Communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications.
In this work, we deduce the optimal condition of both binary and multi-level gradient quantization for any gradient distribution.
Based on the optimal condition, we develop two novel quantization schemes: biased BinGrad and unbiased ORQ for binary and multi-level gradient quantization, respectively.
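BinGrad and ORQ themselves are not reproduced here; as a generic illustration of unbiased multi-level gradient quantization, the sketch below uses QSGD-style stochastic rounding to uniform levels, without the paper's optimized level placement.

```python
import numpy as np

def stochastic_multilevel_quantize(g, levels=4, rng=None):
    """Unbiased multi-level gradient quantization (generic sketch).

    Each coordinate is stochastically rounded to one of `levels` uniform
    levels in [-s, s], with probabilities chosen so that E[q] = g.
    """
    rng = rng or np.random.default_rng()
    s = np.abs(g).max()                      # per-tensor scale
    t = (g / s + 1) / 2 * (levels - 1)       # map to [0, levels - 1]
    low = np.floor(t)
    q = low + (rng.random(g.shape) < (t - low))   # stochastic rounding
    return s * (2 * q / (levels - 1) - 1)         # map back to [-s, s]

# averaging many independent quantizations recovers g: the scheme is unbiased
g = np.random.default_rng(5).normal(size=100_000)
q = np.mean([stochastic_multilevel_quantize(g, rng=np.random.default_rng(i))
             for i in range(200)], axis=0)
print(np.abs(q - g).mean())                  # small mean deviation
```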
arXiv Detail & Related papers (2020-02-25T18:28:39Z)