Image Generative Semantic Communication with Multi-Modal Similarity Estimation for Resource-Limited Networks
- URL: http://arxiv.org/abs/2404.11280v2
- Date: Sat, 3 Aug 2024 04:49:55 GMT
- Title: Image Generative Semantic Communication with Multi-Modal Similarity Estimation for Resource-Limited Networks
- Authors: Eri Hosonuma, Taku Yamazaki, Takumi Miyoshi, Akihito Taya, Yuuki Nishiyama, Kaoru Sezaki
- Abstract summary: This study proposes a multi-modal image transmission method that leverages various types of semantic information for efficient semantic communication.
The proposed method extracts multi-modal semantic information from an original image and transmits only that to a receiver.
The receiver generates multiple images using an image-generation model and selects an output image based on semantic similarity.
- Score: 2.2997117992292764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To reduce network traffic and support environments with limited resources, a method for transmitting images with minimal transmission data is required. Several machine learning-based image compression methods, which compress the data size of images while maintaining their features, have been proposed. However, in certain situations, reconstructing only the semantic information of images at the receiver end may be sufficient. To realize this concept, semantic-information-based communication, called semantic communication, has been proposed, along with an image transmission method using semantic communication. This method transmits only the semantic information of an image, and the receiver reconstructs it using an image-generation model. This method utilizes a single type of semantic information for image reconstruction, but reconstructing images similar to the original image using only this information is challenging. This study proposes a multi-modal image transmission method that leverages various types of semantic information for efficient semantic communication. The proposed method extracts multi-modal semantic information from an original image and transmits only that to a receiver. Subsequently, the receiver generates multiple images using an image-generation model and selects an output image based on semantic similarity. The receiver must select the result based only on the received features; however, evaluating the similarity using conventional metrics is challenging. Therefore, this study explores new metrics to evaluate the similarity between semantic features of images and proposes two scoring procedures for evaluating semantic similarity between images based on multiple semantic features. The results indicate that the proposed procedures can compare semantic similarities, such as position and composition, between the semantic features of the original and generated images.
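The receiver-side selection step described in the abstract (generate several candidate images, then keep the one whose semantic features best match the received ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's scoring procedure: the abstract does not specify the metrics, so plain cosine similarity averaged over modalities stands in for them, and the names `Features`, `semantic_score`, and `select_output`, as well as the modality keys `"text"` and `"layout"`, are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical layout: modality name -> feature vector extracted from an image.
Features = Dict[str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_score(received: Features, candidate: Features) -> float:
    """Mean cosine similarity over every modality in the received features.

    Assumption: each candidate carries a vector for every received modality
    (a missing modality raises KeyError).
    """
    if not received:
        raise ValueError("received features must not be empty")
    sims = [cosine_similarity(vec, candidate[name]) for name, vec in received.items()]
    return sum(sims) / len(sims)


def select_output(received: Features, candidates: List[Features]) -> int:
    """Return the index of the candidate whose features best match the received ones."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    scores = [semantic_score(received, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy features only; a real system would obtain them from pretrained models.
    received = {"text": [1.0, 0.0], "layout": [0.0, 1.0]}
    candidates = [
        {"text": [0.0, 1.0], "layout": [1.0, 0.0]},
        {"text": [1.0, 0.0], "layout": [0.0, 1.0]},
        {"text": [1.0, 0.0], "layout": [1.0, 0.0]},
    ]
    print(select_output(received, candidates))
```

Note that the receiver never sees the original image in this sketch: the score compares only the received features with features re-extracted from each generated candidate, which matches the constraint stated in the abstract.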
Related papers
- Semantic Similarity Score for Measuring Visual Similarity at Semantic Level [5.867765921443141]
We propose a semantic evaluation metric -- SeSS (Semantic Similarity Score) based on Scene Graph Generation and graph matching.
The metric can measure semantic-level differences between images and can be used for evaluation in visual semantic communication systems.
arXiv Detail & Related papers (2024-06-06T08:51:26Z)
- Conditional Diffusion on Web-Scale Image Pairs leads to Diverse Image Variations [32.892042877725125]
Current image variation techniques involve adapting a text-to-image model to reconstruct an input image conditioned on the same image.
We show that a diffusion model trained to reconstruct an input image from frozen embeddings can reconstruct the image with minor variations.
We propose a new pretraining strategy to generate image variations using a large collection of image pairs.
arXiv Detail & Related papers (2024-05-23T17:58:03Z)
- Deep Image Semantic Communication Model for Artificial Intelligent Internet of Things [16.505798124923224]
A novel deep image semantic communication model is proposed for efficient image communication in AIoT.
At the transmitter side, a high-precision image semantic segmentation algorithm is proposed to extract the semantic information of the image.
At the receiver side, a semantic image restoration algorithm is proposed to convert the semantic image to a real scene image with detailed information.
arXiv Detail & Related papers (2023-11-06T07:43:42Z)
- Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
arXiv Detail & Related papers (2023-07-22T14:17:19Z)
- Memory-Driven Text-to-Image Generation [126.58244124144827]
We introduce a memory-driven semi-parametric approach to text-to-image generation.
The non-parametric component is a memory bank of image features constructed from a training set of images.
The parametric component is a generative adversarial network.
arXiv Detail & Related papers (2022-08-15T06:32:57Z)
- Towards Semantic Communications: Deep Learning-Based Image Semantic Coding [42.453963827153856]
We conceive semantic communications for image data, which is much richer in semantics and more bandwidth-sensitive.
We propose a reinforcement learning-based adaptive semantic coding (RL-ASC) approach that encodes images beyond the pixel level.
Experimental results demonstrate that the proposed RL-ASC is noise-robust and can reconstruct visually pleasant and semantically consistent images.
arXiv Detail & Related papers (2022-08-08T12:29:55Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Wireless Transmission of Images With The Assistance of Multi-level Semantic Information [16.640928669609934]
MLSC-image is a multi-level semantic-aware communication system for wireless image transmission.
We employ a pretrained image captioning model to capture the text semantics and a pretrained image segmentation model to obtain the segmentation semantics.
The numerical results validate the effectiveness and efficiency of the proposed semantic communication system.
arXiv Detail & Related papers (2022-02-08T16:25:26Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Cross-domain Correspondence Learning for Exemplar-based Image Translation [59.35767271091425]
We present a framework for exemplar-based image translation, which synthesizes a photo-realistic image from the input in a distinct domain.
The output has a style (e.g., color, texture) consistent with the semantically corresponding objects in the exemplar.
We show that our method is significantly superior to state-of-the-art methods in terms of image quality.
arXiv Detail & Related papers (2020-04-12T09:10:57Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)