Language-Oriented Communication with Semantic Coding and Knowledge
Distillation for Text-to-Image Generation
- URL: http://arxiv.org/abs/2309.11127v1
- Date: Wed, 20 Sep 2023 08:19:05 GMT
- Title: Language-Oriented Communication with Semantic Coding and Knowledge
Distillation for Text-to-Image Generation
- Authors: Hyelin Nam, Jihong Park, Jinho Choi, Mehdi Bennis, and Seong-Lyun Kim
- Abstract summary: We put forward a novel framework of language-oriented semantic communication (LSC).
In LSC, machines communicate using human language messages that can be interpreted and manipulated via natural language processing (NLP) techniques for SC efficiency.
We introduce three innovative algorithms: 1) semantic source coding (SSC), which compresses a text prompt into its key head words capturing the prompt's syntactic essence; 2) semantic channel coding (SCC), which improves robustness against errors by substituting head words with their lengthier synonyms; and 3) semantic knowledge distillation (SKD), which produces listener-customized prompts via in-context learning of the listener's language style.
- Score: 53.97155730116369
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: By integrating recent advances in large language models (LLMs) and generative
models into the emerging semantic communication (SC) paradigm, in this article
we put forward a novel framework of language-oriented semantic communication
(LSC). In LSC, machines communicate using human language messages that can be
interpreted and manipulated via natural language processing (NLP) techniques
for SC efficiency. To demonstrate LSC's potential, we introduce three
innovative algorithms: 1) semantic source coding (SSC) which compresses a text
prompt into its key head words capturing the prompt's syntactic essence while
maintaining their appearance order to keep the prompt's context; 2) semantic
channel coding (SCC) that improves robustness against errors by substituting
head words with their lengthier synonyms; and 3) semantic knowledge
distillation (SKD) that produces listener-customized prompts via in-context
learning the listener's language style. In a communication task for progressive
text-to-image generation, the proposed methods achieve higher perceptual
similarities with fewer transmissions while enhancing robustness in noisy
communication channels.
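The first two steps named in the abstract (SSC and SCC) can be illustrated with a toy sketch. This is not the authors' implementation: the stop-word filter is a crude stand-in for head-word extraction, and the `STOP_WORDS` set, the `SYNONYMS` table, and both function names are hypothetical, chosen only to show order-preserving compression followed by substitution with longer synonyms.

```python
# Toy sketch only: the stop-word list and synonym table are hypothetical
# stand-ins, not the NLP pipeline used in the paper.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "at", "with", "and", "is", "are"}

SYNONYMS = {
    "dog": ["hound", "canine"],
    "big": ["large", "enormous"],
    "park": ["garden", "parkland"],
}


def semantic_source_coding(prompt: str) -> list[str]:
    """Keep only content words, preserving their order of appearance."""
    words = [w.strip(".,!?").lower() for w in prompt.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def semantic_channel_coding(head_words: list[str]) -> list[str]:
    """Swap each word for its longest listed synonym, if one is longer."""
    coded = []
    for w in head_words:
        candidates = [w] + SYNONYMS.get(w, [])
        coded.append(max(candidates, key=len))
    return coded


prompt = "A big dog is running in the park."
heads = semantic_source_coding(prompt)
coded = semantic_channel_coding(heads)
print(heads)
print(coded)
```

A real system would presumably obtain head words from a syntactic parser and synonyms from a lexical database; the hand-written tables here only keep the example self-contained.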
Related papers
- Visual Language Model based Cross-modal Semantic Communication Systems [42.321208020228894]
We propose a novel Vision-Language Model-based Cross-modal Semantic Communication system.
The VLM-CSC comprises three novel components.
The experimental simulations validate the effectiveness, adaptability, and robustness of the CSC system.
arXiv Detail & Related papers (2024-05-06T08:59:16Z)
- Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model [11.160802635050866]
Cued Speech (CS) is an advanced visual phonetic encoding system that integrates lip reading with hand codings.
Existing CS generation methods are fragile and prone to poor performance due to template-based statistical models.
We propose a novel Gloss-prompted Diffusion-based CS Gesture generation framework (called GlossDiff).
arXiv Detail & Related papers (2024-04-30T05:54:40Z)
- Leveraging Language ID to Calculate Intermediate CTC Loss for Enhanced
Code-Switching Speech Recognition [5.3545957730615905]
We introduce language identification information into the middle layer of the ASR model's encoder.
We aim to generate acoustic features that imply language distinctions in a more implicit way, reducing the model's confusion when dealing with language switching.
arXiv Detail & Related papers (2023-12-15T07:46:35Z)
- Generative AI-aided Joint Training-free Secure Semantic Communications
via Multi-modal Prompts [89.04751776308656]
This paper proposes a GAI-aided SemCom system with multi-modal prompts for accurate content decoding.
In response to security concerns, we introduce the application of covert communications aided by a friendly jammer.
arXiv Detail & Related papers (2023-09-05T23:24:56Z)
- On decoder-only architecture for speech-to-text and large language model
integration [59.49886892602309]
Speech-LLaMA is a novel approach that effectively incorporates acoustic information into text-based large language models.
We conduct experiments on multilingual speech-to-text translation tasks and demonstrate a significant improvement over strong baselines.
arXiv Detail & Related papers (2023-07-08T06:47:58Z)
- Causal Semantic Communication for Digital Twins: A Generalizable
Imitation Learning Approach [74.25870052841226]
A digital twin (DT) leverages a virtual representation of the physical world, along with communication (e.g., 6G), computing, and artificial intelligence (AI) technologies to enable many connected intelligence services.
Wireless systems can exploit the paradigm of semantic communication (SC) for facilitating informed decision-making under strict communication constraints.
A novel framework called causal semantic communication (CSC) is proposed for DT-based wireless systems.
arXiv Detail & Related papers (2023-04-25T00:15:00Z)
- A Vector Quantized Approach for Text to Speech Synthesis on Real-World
Spontaneous Speech [94.64927912924087]
We train TTS systems using real-world speech from YouTube and podcasts.
A recent Text-to-Speech architecture is designed for multiple code generation and monotonic alignment.
We show that this recent Text-to-Speech architecture outperforms existing TTS systems in several objective and subjective measures.
arXiv Detail & Related papers (2023-02-08T17:34:32Z)
- Neuro-Symbolic Causal Reasoning Meets Signaling Game for Emergent
Semantic Communications [71.63189900803623]
A novel emergent SC (ESC) system framework is proposed, composed of a signaling game for emergent language design and a neuro-symbolic (NeSy) artificial intelligence (AI) approach for causal reasoning.
The ESC system is designed to enhance the novel metrics of semantic information, reliability, distortion and similarity.
arXiv Detail & Related papers (2022-10-21T15:33:37Z)
- Improving Code-switching Language Modeling with Artificially Generated
Texts using Cycle-consistent Adversarial Networks [41.88097793717185]
We investigate methods to augment Code-switching training text data by generating it artificially.
We propose a framework based on cycle-consistent adversarial networks to transfer monolingual text into Code-switching text.
arXiv Detail & Related papers (2021-12-12T21:27:32Z)
- Semantics-Native Communication with Contextual Reasoning [46.2484183677342]
We propose a novel model of System 1 semantics-native communication (SNC) for generic tasks.
We infuse contextual reasoning into SNC such that the speaker locally and iteratively self-communicates with a virtual agent built on the listener's unique way of coding its semantics.
It is also shown that System 2 SNC significantly reduces the SR length without compromising communication reliability.
arXiv Detail & Related papers (2021-08-12T12:04:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.