Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks
- URL: http://arxiv.org/abs/2505.02413v1
- Date: Mon, 05 May 2025 07:18:47 GMT
- Title: Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks
- Authors: Baoxia Du, Hongyang Du, Dusit Niyato, Ruidong Li
- Abstract summary: We investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA). To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness.
- Score: 55.32199894495722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task-oriented semantic communication has emerged as a fundamental approach for enhancing performance in various communication scenarios. While recent advances in Generative Artificial Intelligence (GenAI), such as Large Language Models (LLMs), have been applied to semantic communication designs, the potential of Large Multimodal Models (LMMs) remains largely unexplored. In this paper, we investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA) and propose a task-oriented semantic communication framework to facilitate efficient interaction between users and cloud servers. To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. Additionally, we assess the importance of image patches by combining objective and subjective user attention, adjusting energy usage for transmitting semantic information. This strategy optimizes resource utilization, ensuring precise transmission of critical information. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness. Experimental results show that our semantic communication framework significantly increases accuracy in answering questions under the same channel conditions, performing particularly well in environments with poor Signal-to-Noise Ratios (SNR). Accuracy can be improved by 13.4% at an SNR of 12dB and 33.1% at 10dB, respectively.
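The abstract describes scoring image patches by combining objective and subjective user attention, then adjusting transmission energy accordingly. The following is a minimal sketch of that general idea only, not the authors' method: the function names, the linear blending weight `alpha`, and the proportional split of a fixed power budget are all assumptions made for illustration.

```python
def normalize(scores):
    """Scale non-negative scores so they sum to 1 (uniform if all are zero)."""
    if not scores:
        raise ValueError("scores must not be empty")
    total = sum(scores)
    if total <= 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def allocate_power(objective, subjective, total_power, alpha=0.5):
    """Split total_power across patches in proportion to blended importance.

    objective, subjective: per-patch attention scores (hypothetical inputs).
    alpha: assumed blending weight; 1.0 uses only the objective scores.
    """
    if len(objective) != len(subjective):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    obj = normalize(objective)
    sub = normalize(subjective)
    # Each normalized list sums to 1, so the blend also sums to 1.
    importance = [alpha * o + (1.0 - alpha) * s for o, s in zip(obj, sub)]
    return [total_power * w for w in importance]


if __name__ == "__main__":
    # Three patches; the third attracts most of the subjective attention.
    print(allocate_power([1, 1, 2], [0, 0, 4], total_power=3.0))
```

With these illustrative inputs the third patch receives three quarters of the budget, which mirrors the stated goal of spending more energy on the regions the user cares about.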
Related papers
- Diffusion-based Task-oriented Semantic Communications with Model Inversion Attack [6.115539523178243]
Task-oriented semantic communication is a promising neural network-based system design for 6G networks. We propose a diffusion-based semantic communication framework, named DiffSem, to optimize semantic information reconstruction. Our results show that DiffSem improves the classification accuracy by 10.03% and maintains stable performance under dynamic channels.
arXiv Detail & Related papers (2025-06-24T05:21:27Z)
- Underlying Semantic Diffusion for Effective and Efficient In-Context Learning [113.4003355229632]
Underlying Semantic Diffusion (US-Diffusion) is an enhanced diffusion model that boosts underlying semantics learning, computational efficiency, and in-context learning capabilities. We present a Feedback-Aided Learning (FAL) framework, which leverages feedback signals to guide the model in capturing semantic details. We also propose a plug-and-play Efficient Sampling Strategy (ESS) for dense sampling at time steps with high-noise levels.
arXiv Detail & Related papers (2025-03-06T03:06:22Z)
- Joint Adaptive OFDM and Reinforcement Learning Design for Autonomous Vehicles: Leveraging Age of Updates [2.607046313483251]
Millimeter wave (mmWave)-based orthogonal frequency-division multiplexing (OFDM) stands out as a suitable alternative for high-resolution sensing and high-speed data transmission. In this work, we consider an autonomous vehicle network where an AV utilizes its queue state information (QSI) and channel state information (CSI) in conjunction with reinforcement learning techniques to manage communication and sensing.
arXiv Detail & Related papers (2024-12-24T15:32:58Z)
- Goal-Oriented Semantic Communication for Wireless Visual Question Answering [68.75814200517854]
We propose a goal-oriented semantic communication (GSC) framework to improve Visual Question Answering (VQA) performance. We propose a bounding box (BBox)-based image semantic extraction and ranking approach to prioritize the semantic information based on the goal of questions. Experimental results demonstrate that our GSC framework improves answering accuracy by up to 49% under AWGN channels and 59% under Rayleigh channels.
arXiv Detail & Related papers (2024-11-03T12:01:18Z)
- Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework.
To counter the challenges posed by time-varying multipath fading, our approach incorporates orthogonal frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies.
We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeat request (HARQ).
arXiv Detail & Related papers (2024-08-29T08:53:26Z)
- Adaptive Resource Allocation for Semantic Communication Networks [34.189531352110386]
This paper investigates the quality of service for semantic communication networks, including the semantic quantization efficiency (SQE) and transmission latency.
A problem of maximizing the overall effective SC-QoS is formulated by jointly optimizing the transmit beamforming of the base station, the bits for semantic representation, the subchannel assignment, and the semantic resource allocation.
Our design can effectively combat semantic noise and achieve superior performance in wireless communications compared to several benchmark schemes.
arXiv Detail & Related papers (2023-12-02T09:12:12Z)
- Semantic Communication Enabling Robust Edge Intelligence for Time-Critical IoT Applications [87.05763097471487]
This paper aims to design robust Edge Intelligence using semantic communication for time-critical IoT applications.
We analyze the effect of image DCT coefficients on inference accuracy and propose the channel-agnostic effectiveness encoding for offloading.
arXiv Detail & Related papers (2022-11-24T20:13:17Z)
- Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach [187.4094332217186]
A semantic communication framework is proposed for textual data transmission.
A metric of semantic similarity (MSS) that jointly captures the semantic accuracy and completeness of the recovered text is proposed.
arXiv Detail & Related papers (2022-08-17T11:39:16Z)
- Towards Human-Agent Communication via the Information Bottleneck Principle [19.121541894577298]
We study how trading off these three factors -- utility, informativeness, and complexity -- shapes emergent communication.
We propose Vector-Quantized Variational Information Bottleneck (VQ-VIB), a method for training neural agents to compress inputs into discrete signals embedded in a continuous space.
arXiv Detail & Related papers (2022-06-30T20:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.