Multi-Modal Sensing Aided mmWave Beamforming for V2V Communications with Transformers
- URL: http://arxiv.org/abs/2509.11112v1
- Date: Sun, 14 Sep 2025 06:03:42 GMT
- Title: Multi-Modal Sensing Aided mmWave Beamforming for V2V Communications with Transformers
- Authors: Muhammad Baqer Mollah, Honggang Wang, Hua Fang
- Abstract summary: We present a multi-modal sensing and fusion learning framework as a potential alternative solution to reduce such overheads. In this framework, we first extract the features individually from the visual and GPS coordinate sensing modalities with modality-specific encoders. We then fuse the multi-modal features to obtain the predicted top-k beams so that the best line-of-sight links can be proactively established.
- Score: 1.9483189922830135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Beamforming techniques are utilized in millimeter wave (mmWave) communication to address the inherent path loss limitation, thereby establishing and maintaining reliable connections. However, adopting the standard-defined beamforming approach in highly dynamic vehicular environments often incurs high beam training overheads and reduces the available airtime for communications, mainly due to the exchange of pilot signals and exhaustive beam measurements. To this end, we present a multi-modal sensing and fusion learning framework as a potential alternative solution to reduce such overheads. In this framework, we first extract the features individually from the visual and GPS coordinate sensing modalities with modality-specific encoders, and subsequently fuse the multi-modal features to obtain the predicted top-k beams so that the best line-of-sight links can be proactively established. To show the generalizability of the proposed framework, we perform a comprehensive experiment on four different vehicle-to-vehicle (V2V) scenarios from a real-world multi-modal sensing and communication dataset. From the experiment, we observe that the proposed framework achieves up to 77.58% accuracy in predicting the top-15 beams correctly, outperforms single modalities, incurs an average power loss as low as roughly 2.32 dB, and considerably reduces the beam searching space overhead by 76.56% for the top-15 beams with respect to the standard-defined approach.
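The pipeline described in the abstract (modality-specific encoders, feature fusion, top-k beam scoring) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the encoder layers, feature dimensions (64-dim visual features, 2-dim GPS coordinates, 64 beams), and random weights are all assumptions standing in for the trained transformer components.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w, b):
    # Modality-specific encoder: one ReLU layer as an illustrative stand-in
    # for the paper's visual/GPS encoder towers (assumed, not the real model).
    return np.maximum(x @ w + b, 0.0)

# Hypothetical dimensions: 64-dim visual features, 2-dim GPS input, 64 beams.
IMG_DIM, GPS_DIM, HID, N_BEAMS = 64, 2, 32, 64

# Randomly initialised weights stand in for trained parameters.
w_img, b_img = rng.normal(size=(IMG_DIM, HID)), np.zeros(HID)
w_gps, b_gps = rng.normal(size=(GPS_DIM, HID)), np.zeros(HID)
w_out, b_out = rng.normal(size=(2 * HID, N_BEAMS)), np.zeros(N_BEAMS)

def predict_top_k(img_feat, gps_coord, k=15):
    # Fuse the two modality embeddings by concatenation, score every beam,
    # and return the indices of the k highest-scoring beams.
    fused = np.concatenate([encoder(img_feat, w_img, b_img),
                            encoder(gps_coord, w_gps, b_gps)])
    logits = fused @ w_out + b_out
    return np.argsort(logits)[::-1][:k]

top15 = predict_top_k(rng.normal(size=IMG_DIM), rng.normal(size=GPS_DIM))
print(len(top15))  # 15 candidate beams to sweep instead of all 64
```

Sweeping only these 15 candidates instead of all 64 beams is what yields the reported 76.56% reduction in beam searching space.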
Related papers
- Multi-Modal Sensing and Fusion in mmWave Beamforming for Connected Vehicles: A Transformer Based Framework [1.7834756213254652]
We present a multi-modal sensing and fusion learning framework as a potential alternative solution to reduce such overheads. In this framework, we first extract the representative features from the sensing modalities with modality-specific encoders, then utilize multi-head cross-modal attention to learn dependencies and correlations between different modalities. The proposed framework (i) achieves up to 96.72% accuracy in predicting the top-15 beams correctly, (ii) incurs roughly 0.77 dB average power loss, and (iii) improves the overall latency and beam searching space overheads by 86.81% and 76.56%, respectively.
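The multi-head cross-modal attention this entry mentions can be sketched as below: one modality's features supply the queries while the other modality's features supply keys and values. The token counts, feature width, head count, and random projection weights are all assumptions for illustration, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(q_feat, kv_feat, n_heads=4):
    # Multi-head cross-attention: queries from one modality, keys/values
    # from the other. Random weights stand in for trained projections.
    d = q_feat.shape[-1]
    dh = d // n_heads
    wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = q_feat @ wq, kv_feat @ wk, kv_feat @ wv
    out = np.empty_like(q)
    for h in range(n_heads):
        s = slice(h * dh, (h + 1) * dh)
        attn = softmax(q[:, s] @ k[:, s].T / np.sqrt(dh))
        out[:, s] = attn @ v[:, s]
    return out

img_tokens = rng.normal(size=(8, 32))   # e.g. 8 visual feature tokens
gps_tokens = rng.normal(size=(4, 32))   # e.g. 4 positional feature tokens
fused = cross_modal_attention(img_tokens, gps_tokens)
print(fused.shape)  # (8, 32): visual tokens re-weighted by GPS context
```

Each visual token is thus re-expressed as an attention-weighted combination of the positional tokens, which is how the framework learns cross-modality dependencies.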
arXiv Detail & Related papers (2026-02-14T05:12:06Z) - Empower Low-Altitude Economy: A Reliability-Aware Dynamic Weighting Allocation for Multi-modal UAV Beam Prediction [57.04985443535312]
Low-altitude economy (LAE) is rapidly expanding, driven by urban air mobility, logistics drones, and aerial sensing. Current research is shifting from single-signal to multi-modal collaborative approaches. We propose a reliability-aware dynamic weighting scheme applied to a semantic-aware multi-modal beam prediction framework, named SaM2B.
arXiv Detail & Related papers (2025-12-30T16:24:34Z) - A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles [74.8162337823142]
MM-UAV is the first large-scale benchmark for Multi-Modal UAV Tracking. The dataset spans over 30 challenging scenarios, with 1,321 synchronised multi-modal sequences and more than 2.8 million annotated frames. Accompanying the dataset, we provide a novel multi-modal multi-UAV tracking framework.
arXiv Detail & Related papers (2025-11-23T08:42:17Z) - Multi-Modality Sensing in mmWave Beamforming for Connected Vehicles Using Deep Learning [2.2879063461015425]
This paper presents a deep learning-based solution for utilizing multi-modality sensing data to predict optimal beams with sufficient mmWave received powers. The results show that it can achieve up to 98.19% accuracy while predicting the top-13 beams.
arXiv Detail & Related papers (2025-04-08T16:18:00Z) - Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z) - Position Aware 60 GHz mmWave Beamforming for V2V Communications Utilizing Deep Learning [2.4993733210446893]
This paper presents a deep learning-based solution for utilizing vehicular position information to predict the optimal beams with sufficient mmWave received powers.
The results show that the solution can achieve, on average, up to 84.58% of the received power of the link status.
arXiv Detail & Related papers (2024-02-02T09:30:27Z) - MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation [55.533789120204055]
We propose an end-to-end beamforming network for direction guided speech separation given merely the mixture signal.
Specifically, we design a multi-channel input and multiple outputs architecture to predict the direction-of-arrival based embeddings and beamforming weights for each source.
arXiv Detail & Related papers (2022-12-07T01:52:40Z) - Fast Beam Alignment via Pure Exploration in Multi-armed Bandits [91.11360914335384]
We develop a bandit-based fast BA algorithm to reduce BA latency for millimeter-wave (mmWave) communications.
Our algorithm is named Two-Phase Heteroscedastic Track-and-Stop (2PHT&S).
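The bandit view of beam alignment treats each beam as an arm and each noisy power measurement as a pull. The toy sketch below uses plain successive elimination as a simplified stand-in for the paper's 2PHT&S algorithm; the SNR values, noise model, and round counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def best_beam_pure_exploration(true_snr, pulls_per_round=5, rounds=20):
    # Pure-exploration bandit for beam alignment: sample every surviving
    # beam each round, then eliminate beams clearly below the leader.
    # (Successive elimination, a simplified stand-in for 2PHT&S.)
    n = len(true_snr)
    counts = np.zeros(n)
    sums = np.zeros(n)
    active = np.arange(n)
    for _ in range(rounds):
        for a in active:
            obs = true_snr[a] + rng.normal(scale=1.0, size=pulls_per_round)
            sums[a] += obs.sum()
            counts[a] += pulls_per_round
        means = sums[active] / counts[active]
        radius = np.sqrt(2.0 / counts[active])  # confidence radius
        keep = means + radius >= (means - radius).max()
        active = active[keep]
        if len(active) == 1:
            break
    return active[np.argmax(sums[active] / counts[active])]

snr = np.array([0.0, 1.0, 5.0, 2.0])   # hypothetical per-beam SNRs
best = best_beam_pure_exploration(snr)
print(best)
```

Because clearly inferior beams are dropped early, the measurement budget concentrates on the contenders, which is the mechanism by which bandit methods cut beam alignment latency.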
arXiv Detail & Related papers (2022-10-23T05:57:39Z) - Inertial Hallucinations -- When Wearable Inertial Devices Start Seeing Things [82.15959827765325]
We propose a novel approach to multimodal sensor fusion for Ambient Assisted Living (AAL).
We address two major shortcomings of standard multimodal approaches, limited area coverage and reduced reliability.
Our new framework fuses the concept of modality hallucination with triplet learning to train a model with different modalities to handle missing sensors at inference time.
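The triplet learning ingredient mentioned above reduces to the standard triplet margin loss: pull an anchor embedding toward a positive (same class) and push it away from a negative (different class) by at least a margin. The embeddings and margin below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet margin loss on Euclidean distances:
    # max(d(a, p) - d(a, n) + margin, 0).
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same class, nearby embedding
n = np.array([3.0, 0.0])   # different class, distant embedding
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```

Training with this loss across real and hallucinated modalities encourages a shared embedding space, so a missing sensor at inference time can be substituted by its hallucinated counterpart.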
arXiv Detail & Related papers (2022-07-14T10:04:18Z) - Deep Learning on Multimodal Sensor Data at the Wireless Edge for Vehicular Network [8.458980329342799]
We propose to expedite beam selection by leveraging multimodal data collected from sensors such as LiDAR, camera images, and GPS.
We propose individual and distributed fusion-based deep learning (F-DL) architectures that can execute locally as well as at a mobile edge computing center.
Results from extensive evaluations conducted on publicly available synthetic and home-grown real-world datasets reveal 95% and 96% improvement in beam selection speed over classical RF-only beam sweeping.
arXiv Detail & Related papers (2022-01-12T21:55:34Z) - A Novel Look at LIDAR-aided Data-driven mmWave Beam Selection [24.711393214172148]
We propose a lightweight neural network (NN) architecture along with the corresponding LIDAR preprocessing.
Our NN-based beam selection scheme can achieve 79.9% throughput without any beam search overhead and 95% by searching among as few as 6 beams.
arXiv Detail & Related papers (2021-04-29T18:07:31Z) - Deep Learning-based Compressive Beam Alignment in mmWave Vehicular Systems [75.77033270838926]
Vehicular channels exhibit structure that can be exploited for beam alignment with fewer channel measurements.
We propose a deep learning-based technique to design a structured compressed sensing (CS) matrix.
arXiv Detail & Related papers (2021-02-27T04:38:12Z) - Codebook-Based Beam Tracking for Conformal Array Enabled UAV MmWave Networks [33.52271582081627]
Millimeter wave (mmWave) communications can potentially meet the high data-rate requirements of unmanned aerial vehicle (UAV) networks.
As the prerequisite of mmWave communications, the narrow directional beam tracking is very challenging because of the three-dimensional (3D) mobility and attitude variation of UAVs.
We propose to integrate the conformal array with the surface of each UAV, which enables the full spatial coverage and the agile beam tracking in highly dynamic UAV mmWave networks.
arXiv Detail & Related papers (2020-05-28T14:57:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.