Related papers: Semantic Communication based on Large Language Model for Underwater Image Transmission

URL: http://arxiv.org/abs/2408.12616v2
Date: Mon, 26 Aug 2024 03:47:06 GMT
Title: Semantic Communication based on Large Language Model for Underwater Image Transmission
Authors: Weilong Chen, Wenxuan Xu, Haoran Chen, Xinran Zhang, Zhijin Qin, Yanru Zhang, Zhu Han
Abstract summary: Traditional underwater communication faces limitations like low bandwidth, high latency, and susceptibility to noise. We propose a novel Semantic Communication framework based on Large Language Models (LLMs). Our framework reduces the overall data size to 0.8% of the original.
Score: 36.56805696235768
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Underwater communication is essential for environmental monitoring, marine biology research, and underwater exploration. Traditional underwater communication faces limitations like low bandwidth, high latency, and susceptibility to noise, while semantic communication (SC) offers a promising solution by focusing on the exchange of semantics rather than symbols or bits. However, SC encounters challenges in underwater environments, including semantic information mismatch and difficulties in accurately identifying and transmitting critical information that aligns with the diverse requirements of underwater applications. To address these challenges, we propose a novel Semantic Communication (SC) framework based on Large Language Models (LLMs). Our framework leverages visual LLMs to perform semantic compression and prioritization of underwater image data according to the query from users. By identifying and encoding key semantic elements within the images, the system selectively transmits high-priority information while applying higher compression rates to less critical regions. On the receiver side, an LLM-based recovery mechanism, along with Global Vision ControlNet and Key Region ControlNet networks, aids in reconstructing the images, thereby enhancing communication efficiency and robustness. Our framework reduces the overall data size to 0.8% of the original. Experimental results demonstrate that our method significantly outperforms existing approaches, ensuring high-quality, semantically accurate image reconstruction.

Related papers

Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks [55.32199894495722]
We investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA). To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness.
arXiv Detail & Related papers (2025-05-05T07:18:47Z)
PIGUIQA: A Physical Imaging Guided Perceptual Framework for Underwater Image Quality Assessment [59.9103803198087]
We propose a Physical Imaging Guided perceptual framework for Underwater Image Quality Assessment (UIQA). By leveraging underwater radiative transfer theory, we integrate physics-based imaging estimations to establish quantitative metrics for these distortions. The proposed model accurately predicts image quality scores and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-12-20T03:31:45Z)
Vision Transformer-based Semantic Communications With Importance-Aware Quantization [13.328970689723096]
This paper presents a vision transformer (ViT)-based semantic communication system with importance-aware quantization (IAQ) for wireless image transmission. We show that our IAQ framework outperforms conventional image compression methods in both error-free and realistic communication scenarios.
arXiv Detail & Related papers (2024-12-08T19:24:47Z)
HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior [62.04939047885834]
We present HoliSDiP, a framework that leverages semantic segmentation to provide both precise textual and spatial guidance for Real-ISR. Our method employs semantic labels as concise text prompts while introducing dense semantic guidance through segmentation masks and our proposed spatial-CLIP Map.
arXiv Detail & Related papers (2024-11-27T15:22:44Z)
Advanced Underwater Image Quality Enhancement via Hybrid Super-Resolution Convolutional Neural Networks and Multi-Scale Retinex-Based Defogging Techniques [0.0]
The research conducts extensive experiments on real-world underwater datasets to further illustrate the efficacy of the suggested approach. In real-time underwater applications like marine exploration, underwater robotics, and autonomous underwater vehicles, the combination of deep learning and conventional image processing techniques offers a computationally efficient framework with superior results.
arXiv Detail & Related papers (2024-10-18T08:40:26Z)
Trustworthy Image Semantic Communication with GenAI: Explainablity, Controllability, and Efficiency [59.15544887307901]
Image semantic communication (ISC) has garnered significant attention for its potential to achieve high efficiency in visual content transmission. Existing ISC systems based on joint source-channel coding face challenges in interpretability, operability, and compatibility. We propose a novel trustworthy ISC framework that employs Generative Artificial Intelligence (GenAI) for multiple downstream inference tasks.
arXiv Detail & Related papers (2024-08-07T14:32:36Z)
WaterMamba: Visual State Space Model for Underwater Image Enhancement [17.172623370407155]
Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water. To improve image quality, some underwater image enhancement (UIE) methods based on convolutional neural networks (CNN) and Transformer have been proposed. Considering computational complexity and severe underwater image degradation, a state space model (SSM) with linear computational complexity for UIE, named WaterMamba, is proposed.
arXiv Detail & Related papers (2024-05-14T08:26:29Z)
Transformer-Aided Semantic Communications [28.63893944806149]
We employ vision transformers specifically for the purpose of compression and compact representation of the input image. Through the use of the attention mechanism inherent in transformers, we create an attention mask. We evaluate the effectiveness of our proposed framework using the TinyImageNet dataset.
arXiv Detail & Related papers (2024-05-02T17:50:53Z)
DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments. Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features. Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
How to obtain clear and visually pleasant images has become a common concern of people. The task of underwater image enhancement (UIE) has also emerged as the times require. In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN. Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z)
Semantic-aware Texture-Structure Feature Collaboration for Underwater Image Enhancement [58.075720488942125]
Underwater image enhancement has become an attractive topic as a significant technology in marine engineering and aquatic robotics. We develop an efficient and compact enhancement network in collaboration with a high-level semantic-aware pretrained model. We also apply the proposed algorithm to the underwater salient object detection task to reveal the favorable semantic-aware ability for high-level vision tasks.
arXiv Detail & Related papers (2022-11-19T07:50:34Z)
Towards Semantic Communications: Deep Learning-Based Image Semantic Coding [42.453963827153856]
We conceive semantic communications for image data, which is much richer in semantics and more bandwidth-sensitive. We propose a reinforcement learning-based adaptive semantic coding (RL-ASC) approach that encodes images beyond the pixel level. Experimental results demonstrate that the proposed RL-ASC is noise-robust and can reconstruct visually pleasant and semantically consistent images.
arXiv Detail & Related papers (2022-08-08T12:29:55Z)
Perceptual Learned Source-Channel Coding for High-Fidelity Image Semantic Transmission [7.692038874196345]
In this paper, we introduce adversarial losses to optimize deep JSCC. Our new deep JSCC architecture combines encoder, wireless channel, decoder/generator, and discriminator. A user study confirms that, while achieving perceptually similar end-to-end image transmission quality, the proposed method can save about 50% of the wireless channel bandwidth cost.
arXiv Detail & Related papers (2022-05-26T03:05:13Z)
Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs). A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships. We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.