Towards Semantic Communications: Deep Learning-Based Image Semantic
Coding
- URL: http://arxiv.org/abs/2208.04094v1
- Date: Mon, 8 Aug 2022 12:29:55 GMT
- Title: Towards Semantic Communications: Deep Learning-Based Image Semantic
Coding
- Authors: Danlan Huang, Feifei Gao, Xiaoming Tao, Qiyuan Du, and Jianhua Lu
- Abstract summary: We conceive semantic communications for image data, which is much richer in semantics and bandwidth sensitive.
We propose a reinforcement learning based adaptive semantic coding (RL-ASC) approach that encodes images beyond the pixel level.
Experimental results demonstrate that the proposed RL-ASC is noise robust and can reconstruct visually pleasant and semantically consistent images.
- Score: 42.453963827153856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic communications have received growing interest since they can remarkably
reduce the amount of data to be transmitted without missing critical
information. Most existing works explore the semantic encoding and transmission
for text and apply techniques in Natural Language Processing (NLP) to interpret
the meaning of the text. In this paper, we conceive semantic communications
for image data, which is much richer in semantics and bandwidth sensitive.
We propose a reinforcement learning based adaptive semantic coding (RL-ASC)
approach that encodes images beyond the pixel level. Firstly, we define the
semantic concept of image data that includes the category, spatial arrangement,
and visual feature as the representation unit, and propose a convolutional
semantic encoder to extract semantic concepts. Secondly, we propose the image
reconstruction criterion that evolves from the traditional pixel similarity to
semantic similarity and perceptual performance. Thirdly, we design a novel
RL-based semantic bit allocation model, whose reward is the increase in
rate-semantic-perceptual performance after encoding a certain semantic concept
with an adaptive quantization level. Thus, the task-related information is
preserved and reconstructed properly while less important data is discarded.
Finally, we propose a Generative Adversarial Nets (GANs) based semantic
decoder that fuses both local and global features via an attention module.
Experimental results demonstrate that the proposed RL-ASC is noise robust,
can reconstruct visually pleasant and semantically consistent images, and reduces
bit cost several-fold compared to standard codecs and other deep learning-based
image codecs.
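The bit allocation idea in the abstract (a reward equal to the gain in a rate-quality objective after encoding one semantic concept at some quantization level) can be illustrated with a small sketch. This is not the paper's RL-ASC model: a greedy search stands in for the reinforcement learning agent, a weighted mean squared error stands in for the semantic and perceptual terms, and `Concept`, `LEVELS`, `importance`, `lam`, and `budget` are hypothetical names and parameters chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical bit depths per feature value; 0 means the concept is discarded.
LEVELS = [0, 2, 4, 8]


@dataclass
class Concept:
    name: str
    features: List[float]  # assumed to lie in [0, 1]
    importance: float      # hypothetical task-relevance weight


def quantize(values: List[float], bits: int) -> List[float]:
    """Uniformly quantize values in [0, 1]; with 0 bits every value decodes to 0.5."""
    if bits == 0:
        return [0.5 for _ in values]
    steps = (1 << bits) - 1
    return [round(v * steps) / steps for v in values]


def distortion(values: List[float], bits: int) -> float:
    """Mean squared error between the values and their quantized version."""
    decoded = quantize(values, bits)
    return sum((a - b) ** 2 for a, b in zip(values, decoded)) / len(values)


def total_bits(concepts: List[Concept], alloc: Dict[str, int]) -> int:
    return sum(alloc[c.name] * len(c.features) for c in concepts)


def score(concepts: List[Concept], alloc: Dict[str, int], lam: float) -> float:
    """Toy objective: negative weighted distortion minus lam times the bit cost."""
    quality = -sum(c.importance * distortion(c.features, alloc[c.name])
                   for c in concepts)
    return quality - lam * total_bits(concepts, alloc)


def allocate(concepts: List[Concept], lam: float = 1e-3,
             budget: int = 64) -> Dict[str, int]:
    """Greedy stand-in for the RL agent: repeatedly raise the quantization
    level of the concept whose upgrade yields the largest positive reward."""
    alloc = {c.name: 0 for c in concepts}
    while True:
        base = score(concepts, alloc, lam)
        best = None  # (reward, concept name, new bit depth)
        for c in concepts:
            idx = LEVELS.index(alloc[c.name])
            if idx + 1 == len(LEVELS):
                continue  # already at the highest level
            trial = dict(alloc)
            trial[c.name] = LEVELS[idx + 1]
            if total_bits(concepts, trial) > budget:
                continue  # upgrade would exceed the bit budget
            reward = score(concepts, trial, lam) - base
            if reward > 0 and (best is None or reward > best[0]):
                best = (reward, c.name, LEVELS[idx + 1])
        if best is None:
            return alloc
        alloc[best[1]] = best[2]


if __name__ == "__main__":
    demo = [
        Concept("person", [0.1, 0.9, 0.3, 0.7], importance=10.0),
        Concept("sky", [0.5, 0.52, 0.48, 0.5], importance=0.1),
    ]
    print(allocate(demo))
```

In this toy setup the heavily weighted concept receives bits while the low-weight, nearly constant one is left at level 0, which mirrors the abstract's point that task-related information is preserved and less important data is discarded.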
Related papers
- Language-Oriented Semantic Latent Representation for Image Transmission [38.62941652189033]
A new paradigm of semantic communication (SC) focuses on delivering meanings behind bits.
Recent advances in data-to-text models facilitate language-oriented SC.
We propose a novel SC framework that communicates both text and a compressed image embedding.
arXiv Detail & Related papers (2024-05-16T10:41:31Z)
- Deep Image Semantic Communication Model for Artificial Intelligent Internet of Things [16.505798124923224]
A novel deep image semantic communication model is proposed for efficient image communication in AIoT.
At the transmitter side, a high-precision image semantic segmentation algorithm is proposed to extract the semantic information of the image.
At the receiver side, a semantic image restoration algorithm is proposed to convert the semantic image to a real scene image with detailed information.
arXiv Detail & Related papers (2023-11-06T07:43:42Z)
- Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
arXiv Detail & Related papers (2023-07-22T14:17:19Z)
- Learning Semantic-Aware Knowledge Guidance for Low-Light Image Enhancement [69.47143451986067]
Low-light image enhancement (LLIE) investigates how to improve illumination and produce normal-light images.
The majority of existing methods improve low-light images in a global and uniform manner, without taking into account the semantic information of different regions.
We propose a novel semantic-aware knowledge-guided framework that can assist a low-light enhancement model in learning rich and diverse priors encapsulated in a semantic segmentation model.
arXiv Detail & Related papers (2023-04-14T10:22:28Z)
- Towards Better Text-Image Consistency in Text-to-Image Generation [15.735515302139335]
We develop a novel CLIP-based metric termed Semantic Similarity Distance (SSD).
We further design the Parallel Deep Fusion Generative Adversarial Networks (PDF-GAN), which can fuse semantic information at different granularities.
Our PDF-GAN can lead to significantly better text-image consistency while maintaining decent image quality on the CUB and COCO datasets.
arXiv Detail & Related papers (2022-10-27T07:47:47Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs)
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Wireless Transmission of Images With The Assistance of Multi-level Semantic Information [16.640928669609934]
MLSC-image is a multi-level semantic aware communication system for wireless image transmission.
We employ a pretrained image captioning model to capture the text semantics and a pretrained image segmentation model to obtain the segmentation semantics.
The numerical results validate the effectiveness and efficiency of the proposed semantic communication system.
arXiv Detail & Related papers (2022-02-08T16:25:26Z)
- CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
CRIS resorts to vision-language decoding and contrastive learning for achieving the text-to-pixel alignment.
Our proposed framework significantly outperforms the state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)
- Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis [194.1452124186117]
We propose a novel ECGAN for the challenging semantic image synthesis task.
Our ECGAN achieves significantly better results than state-of-the-art methods.
arXiv Detail & Related papers (2020-03-31T01:23:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.