Generative Semantic Communication for Text-to-Speech Synthesis
- URL: http://arxiv.org/abs/2410.03459v1
- Date: Fri, 4 Oct 2024 14:18:31 GMT
- Title: Generative Semantic Communication for Text-to-Speech Synthesis
- Authors: Jiahao Zheng, Jinke Ren, Peng Xu, Zhihao Yuan, Jie Xu, Fangxin Wang, Gui Gui, Shuguang Cui
- Abstract summary: This paper develops a novel generative semantic communication framework for text-to-speech synthesis.
We employ a transformer encoder and a diffusion model to achieve efficient semantic coding without introducing significant communication overhead.
- Score: 39.8799066368712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic communication is a promising technology to improve communication efficiency by transmitting only the semantic information of the source data. However, traditional semantic communication methods primarily focus on data reconstruction tasks, which may not be efficient for emerging generative tasks such as text-to-speech (TTS) synthesis. To address this limitation, this paper develops a novel generative semantic communication framework for TTS synthesis, leveraging generative artificial intelligence technologies. Firstly, we utilize a pre-trained large speech model called WavLM and the residual vector quantization method to construct two semantic knowledge bases (KBs) at the transmitter and receiver, respectively. The KB at the transmitter enables effective semantic extraction, while the KB at the receiver facilitates lifelike speech synthesis. Then, we employ a transformer encoder and a diffusion model to achieve efficient semantic coding without introducing significant communication overhead. Finally, numerical results demonstrate that our framework achieves much higher fidelity for the generated speech than four baselines, in both cases with additive white Gaussian noise channel and Rayleigh fading channel.
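As a rough illustration of the residual vector quantization idea mentioned in the abstract, the Python sketch below quantizes a vector in stages, with each stage coding the residual left by the previous one. This is a minimal sketch under stated assumptions: the codebooks are hand-picked toy values, the names `CODEBOOKS`, `nearest`, `rvq_encode`, and `rvq_decode` are hypothetical, and nothing here reproduces how the paper builds its knowledge bases from WavLM features.

```python
# Toy residual vector quantization (RVQ). All names and values are
# illustrative assumptions, not taken from the paper.

# Two hypothetical codebooks: a coarse first stage and a finer second stage.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, 0.0], [0.0, -0.25]],
]


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(codebooks, vec):
    """Quantize vec stage by stage; each stage codes the residual of the previous one."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen codeword so the next stage sees what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(codebooks, indices):
    """Reconstruct the vector by summing the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


if __name__ == "__main__":
    indices = rvq_encode(CODEBOOKS, [1.2, 0.9])
    print(indices)                         # [1, 1]
    print(rvq_decode(CODEBOOKS, indices))  # [1.25, 1.0]
```

Only the per-stage indices would need to be transmitted in such a scheme, which is why this family of quantizers is often used when communication overhead matters; the receiver needs the same codebooks to decode.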
Related papers
- Large Generative Model-assisted Talking-face Semantic Communication System [55.42631520122753]
This study introduces a Large Generative Model-assisted Talking-face Semantic Communication (LGM-TSC) system.
A Generative Semantic Extractor (GSE) at the transmitter converts semantically sparse talking-face videos into texts with high information density.
A Private Knowledge Base (KB) based on a Large Language Model (LLM) provides semantic disambiguation and correction.
A Generative Semantic Reconstructor (GSR) utilizes BERT-VITS2 and SadTalker models to transform text back into a high-QoE talking-face video.
arXiv Detail & Related papers (2024-11-06T12:45:46Z) - Agent-driven Generative Semantic Communication with Cross-Modality and Prediction [57.335922373309074]
We propose a novel agent-driven generative semantic communication framework based on reinforcement learning.
In this work, we develop an agent-assisted semantic encoder with cross-modality capability, which can track semantic changes and channel conditions to perform adaptive semantic extraction and sampling.
The effectiveness of the designed models has been verified using the UA-DETRAC dataset, demonstrating the performance gains of the overall A-GSC framework.
arXiv Detail & Related papers (2024-04-10T13:24:27Z) - Knowledge Base Enabled Semantic Communication: A Generative Perspective [47.49283348253937]
This article explores how a semantic knowledge base (KB) can be exploited to usher in a new era of generative semantic communication.
Via semantic KB, source messages can be characterized in low-dimensional subspaces without compromising their desired meanings.
arXiv Detail & Related papers (2023-11-21T08:54:49Z) - Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation [53.97155730116369]
We put forward a novel framework of language-oriented semantic communication (LSC).
In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques for SC efficiency.
We introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words capturing the prompt's syntactic essence; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning the listener's
arXiv Detail & Related papers (2023-09-20T08:19:05Z) - Transformer-based Joint Source Channel Coding for Textual Semantic Communication [23.431590618978948]
The Space-Air-Ground-Sea integrated network calls for more robust and secure transmission techniques against jamming.
We propose a textual semantic transmission framework for robust transmission, which utilizes advanced natural language processing techniques to model and encode sentences.
arXiv Detail & Related papers (2023-07-23T08:42:05Z) - Knowledge Enhanced Semantic Communication Receiver [7.171974845607281]
We propose a knowledge enhanced semantic communication framework in which the receiver can more actively utilize the facts in the knowledge base for semantic reasoning and decoding.
Specifically, we design a transformer-based knowledge extractor to find relevant factual triples for the received noisy signal.
Extensive simulation results on the WebNLG dataset demonstrate that the proposed receiver yields superior performance on top of the knowledge graph enhanced decoding.
arXiv Detail & Related papers (2023-02-13T01:49:51Z) - Semantic-Native Communication: A Simplicial Complex Perspective [50.099494681671224]
We study semantic communication from a topological space perspective.
A transmitter first maps its data into a $k$-order simplicial complex and then learns its high-order correlations.
The receiver decodes the structure and infers the missing or distorted data.
arXiv Detail & Related papers (2022-10-30T22:33:44Z) - Communication Beyond Transmitting Bits: Semantics-Guided Source and Channel Coding [7.080957878208516]
"Semantic communications" offers a promising research direction.
Injecting semantic guidance into the coded transmission design to achieve semantics-aware communications shows great potential for breakthroughs in effectiveness and reliability.
This article sheds light on semantics-guided source and channel coding as a transmission paradigm of semantic communications.
arXiv Detail & Related papers (2022-08-04T06:12:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.