Related papers: M2BeamLLM: Multimodal Sensing-empowered mmWave Beam Prediction with Large Language Models

URL: http://arxiv.org/abs/2506.14532v1
Date: Tue, 17 Jun 2025 13:58:36 GMT
Title: M2BeamLLM: Multimodal Sensing-empowered mmWave Beam Prediction with Large Language Models
Authors: Can Zheng, Jiguang He, Chung G. Kang, Guofa Cai, Zitong Yu, Merouane Debbah
Abstract summary: M2BeamLLM integrates multi-modal sensor data, including images, radar, LiDAR, and GPS. Its prediction performance consistently improves with increased diversity in sensing modalities. Our study provides an efficient and intelligent beam prediction solution for vehicle-to-infrastructure (V2I) mmWave communication systems.
Score: 22.009889991924453
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper introduces a novel neural network framework called M2BeamLLM for beam prediction in millimeter-wave (mmWave) massive multi-input multi-output (mMIMO) communication systems. M2BeamLLM integrates multi-modal sensor data, including images, radar, LiDAR, and GPS, leveraging the powerful reasoning capabilities of large language models (LLMs) such as GPT-2 for beam prediction. By combining sensing data encoding, multimodal alignment and fusion, and supervised fine-tuning (SFT), M2BeamLLM achieves significantly higher beam prediction accuracy and robustness, demonstrably outperforming traditional deep learning (DL) models in both standard and few-shot scenarios. Furthermore, its prediction performance consistently improves with increased diversity in sensing modalities. Our study provides an efficient and intelligent beam prediction solution for vehicle-to-infrastructure (V2I) mmWave communication systems.

Related papers

Large Language Model-Driven Distributed Integrated Multimodal Sensing and Semantic Communications [5.646293779615063]
We propose a novel large language model (LLM)-driven distributed integrated multimodal sensing and semantic communication framework. Specifically, our system consists of multiple collaborative sensing devices equipped with RF and camera modules. Evaluations on a synthetic multi-view RF-visual dataset generated by the Genesis simulation engine show that LLM-DiSAC achieves good performance.
arXiv Detail & Related papers (2025-05-20T08:00:00Z)
Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective [54.91271106816616]
Current RGB-D methods usually leverage large-scale backbones to improve accuracy but sacrifice efficiency. We propose a Speed-Accuracy Tradeoff Network (SATNet) for Lightweight RGB-D SOD from three fundamental perspectives. Concerning depth quality, we introduce the Depth Anything Model to generate high-quality depth maps. For modality fusion, we propose a Decoupled Attention Module (DAM) to explore the consistency within and between modalities. For feature representation, we develop a Dual Information Representation Module (DIRM) with a bi-directional inverted framework.
arXiv Detail & Related papers (2025-05-07T19:37:20Z)
Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z)
Multi-Modal Transformer and Reinforcement Learning-based Beam Management [10.728362890819392]
We propose a two-step beam management method by combining MMT with RL for dynamic beam index prediction. In this work, we divide available beam indices into several groups and leverage MMT to process diverse data modalities to predict the optimal beam group. Our proposed framework is tested on a 6G dataset.
arXiv Detail & Related papers (2024-10-22T21:44:25Z)
Beam Prediction based on Large Language Models [51.45077318268427]
We formulate the millimeter wave (mmWave) beam prediction problem as a time series forecasting task. We transform historical observations into text-based representations using a trainable tokenizer. Our method harnesses the power of LLMs to predict future optimal beams.
arXiv Detail & Related papers (2024-08-16T12:40:01Z)
WDMoE: Wireless Distributed Large Language Models with Mixture of Experts [65.57581050707738]
We propose a wireless distributed Large Language Models (LLMs) paradigm based on Mixture of Experts (MoE). We decompose the MoE layer in LLMs by deploying the gating network and the preceding neural network layer at the base station (BS) and mobile devices. We design an expert selection policy by taking into account both the performance of the model and the end-to-end latency.
arXiv Detail & Related papers (2024-05-06T02:55:50Z)
Reliable Beamforming at Terahertz Bands: Are Causal Representations the Way Forward? [85.06664206117088]
Multi-user wireless systems can meet metaverse requirements by utilizing terahertz bandwidth with a massive number of antennas. Existing solutions lack proper modeling of channel dynamics, resulting in inaccurate beamforming solutions in high-mobility scenarios. Herein, a dynamic, semantically aware beamforming solution is proposed for the first time, utilizing novel artificial intelligence algorithms in variational causal inference.
arXiv Detail & Related papers (2023-03-14T16:02:46Z)
Terahertz-Band Joint Ultra-Massive MIMO Radar-Communications: Model-Based and Model-Free Hybrid Beamforming [45.257328085051974]
Wireless communications and sensing at the terahertz (THz) band are investigated as promising short-range technologies. Ultra-massive multiple-input multiple-output (UM-MIMO) antenna systems have been proposed for THz communications to compensate for propagation losses. We develop THz hybrid beamformers based on both model-based and model-free techniques for a new group-of-subarrays (GoSA) UM-MIMO structure.
arXiv Detail & Related papers (2021-02-27T21:28:34Z)
Federated Dropout Learning for Hybrid Beamforming With Spatial Path Index Modulation In Multi-User mmWave-MIMO Systems [19.10321102094638]
We introduce model-based and model-free frameworks for beamformer design in SPIM-MIMO systems. The proposed framework exhibits higher spectral efficiency than the state-of-the-art SPIM-MIMO methods and mmWave-MIMO.
arXiv Detail & Related papers (2021-02-15T10:49:26Z)
M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients. Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume. We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.