Empower Low-Altitude Economy: A Reliability-Aware Dynamic Weighting Allocation for Multi-modal UAV Beam Prediction
- URL: http://arxiv.org/abs/2512.24324v1
- Date: Tue, 30 Dec 2025 16:24:34 GMT
- Title: Empower Low-Altitude Economy: A Reliability-Aware Dynamic Weighting Allocation for Multi-modal UAV Beam Prediction
- Authors: Haojin Li, Anbang Zhang, Chen Sun, Chenyuan Feng, Kaiqian Qu, Tony Q. S. Quek, Haijun Zhang,
- Abstract summary: The low-altitude economy (LAE) is rapidly expanding, driven by urban air mobility, logistics drones, and aerial sensing. Current research is shifting from single-signal to multi-modal collaborative approaches. We propose a reliability-aware dynamic weighting scheme applied to a semantic-aware multi-modal beam prediction framework, named SaM2B.
- Score: 57.04985443535312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The low-altitude economy (LAE) is rapidly expanding, driven by urban air mobility, logistics drones, and aerial sensing, and fast, accurate beam prediction in uncrewed aerial vehicle (UAV) communications is crucial for reliable connectivity. Current research is shifting from single-signal to multi-modal collaborative approaches. However, existing multi-modal methods mostly employ fixed or empirical weights, assuming equal reliability across modalities at any given moment. In practice, the importance of different modalities fluctuates dramatically with UAV motion scenarios, and static weighting amplifies the negative impact of degraded modalities. Furthermore, modal mismatch and weak alignment further undermine cross-scenario generalization. To this end, we propose a reliability-aware dynamic weighting scheme applied to a semantic-aware multi-modal beam prediction framework, named SaM2B. Specifically, SaM2B leverages lightweight cues such as environmental visual, flight posture, and geospatial data to adaptively allocate contributions across modalities at different time points through reliability-aware dynamic weight updates. Moreover, by utilizing cross-modal contrastive learning, we align the "multi-source representation beam semantics" associated with specific beam information to a shared semantic space, thereby enhancing discriminative power and robustness under modal noise and distribution shifts. Experiments on real-world low-altitude UAV datasets show that SaM2B outperforms baseline methods.
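The abstract does not give the weighting rule, but the described idea — converting per-modality reliability cues into time-varying fusion weights — can be sketched as a softmax over reliability scores. This is a minimal illustration, not the authors' method; the function names and the three-modality setup (visual, flight posture, geospatial) are assumptions for the example.

```python
import numpy as np

def reliability_weights(reliability_scores, temperature=1.0):
    """Map per-modality reliability scores to fusion weights via a softmax.

    Higher reliability -> larger share of the fused representation; the
    temperature controls how sharply the weights concentrate.
    """
    s = np.asarray(reliability_scores, dtype=float) / temperature
    s = s - s.max()  # subtract max for numerical stability
    w = np.exp(s)
    return w / w.sum()

def fuse(embeddings, weights):
    """Weighted sum of per-modality embeddings (all the same dimension)."""
    return sum(w * e for w, e in zip(weights, embeddings))

# Hypothetical example: three modality embeddings at one time step,
# with the visual cue currently judged most reliable.
emb = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]  # visual, posture, geospatial
w = reliability_weights([2.0, 0.5, 1.0])
fused = fuse(emb, w)
```

At the next time step the reliability scores change (e.g. the camera degrades at night), and the same call re-allocates weight toward the remaining modalities, which is the dynamic behaviour the abstract contrasts with static weighting.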
Related papers
- Learning Multi-Modal Mobility Dynamics for Generalized Next Location Recommendation [51.00494428978262]
We leverage multi-modal spatial-temporal knowledge to characterize mobility dynamics for the location recommendation task. First, we construct a unified spatial-temporal relational graph (STRG) for multi-modal representation. Second, we design a gating mechanism to fuse spatial-temporal graph representations of different modalities.
arXiv Detail & Related papers (2025-12-27T14:23:04Z) - Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach [78.4812458793128]
We propose TACO, a test-time-scaling framework that applies a lightweight pseudo-count estimator as a high-fidelity verifier of action chunks. Our method resembles the classical anti-exploration principle in offline reinforcement learning (RL), and, being gradient-free, it yields significant computational savings.
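The blurb does not specify how the pseudo-count estimator works; one common lightweight construction is a count table over coarsely discretised action chunks, where frequently seen (in-distribution) chunks score higher, matching the anti-exploration intuition. The class and function names below (`PseudoCountVerifier`, `select`) and the binning scheme are illustrative assumptions, not TACO's actual design.

```python
import numpy as np
from collections import Counter

class PseudoCountVerifier:
    """Coarse hash-based pseudo-count over discretised action chunks.

    Chunks are binned per dimension; the count of a chunk's bin key serves
    as an in-distribution score (assumed sketch, not the paper's estimator).
    """
    def __init__(self, bins=10, low=-1.0, high=1.0):
        self.bins, self.low, self.high = bins, low, high
        self.counts = Counter()

    def _key(self, chunk):
        scaled = (np.asarray(chunk, dtype=float) - self.low) / (self.high - self.low)
        idx = np.clip((scaled * self.bins).astype(int), 0, self.bins - 1)
        return tuple(idx.ravel())

    def update(self, chunk):
        self.counts[self._key(chunk)] += 1

    def score(self, chunk):
        # Higher count => more in-distribution => preferred (anti-exploration).
        return self.counts[self._key(chunk)]

def select(verifier, candidate_chunks):
    """Pick the candidate action chunk the verifier rates most in-distribution."""
    return max(candidate_chunks, key=verifier.score)

# Hypothetical usage: the estimator has seen one region of action space often.
verifier = PseudoCountVerifier()
for _ in range(5):
    verifier.update([0.1, 0.2])
verifier.update([0.9, -0.9])
best = select(verifier, [[0.9, -0.9], [0.1, 0.2]])
```

Because scoring is a table lookup rather than a gradient computation, the verifier adds negligible cost at test time, which is consistent with the gradient-free claim above.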
arXiv Detail & Related papers (2025-12-02T14:42:54Z) - Multimodal Mixture-of-Experts for ISAC in Low-Altitude Wireless Networks [33.986844210423996]
Integrated sensing and communication (ISAC) is a key enabler for low-altitude wireless networks (LAWNs). We propose a mixture-of-experts (MoE) framework for multimodal ISAC in LAWNs. To improve scalability under the stringent energy constraints of aerial platforms, we develop a sparse MoE variant that selectively activates only a subset of experts.
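The sparse-MoE idea above — activating only a subset of experts per input — is typically realised with top-k gating. The sketch below shows the routing step under assumed shapes (linear experts, a linear gate); it illustrates the general technique, not this paper's specific architecture.

```python
import numpy as np

def top_k_sparse_moe(x, experts, gate, k=2):
    """Route input x through only the top-k experts by gate score.

    experts: list of (d_out, d_in) weight matrices, one per expert.
    gate:    (n_experts, d_in) matrix producing one score per expert.
    Only k expert forward passes run, which is the energy saving the
    sparse variant targets.
    """
    logits = gate @ x                          # one routing score per expert
    top = np.argsort(logits)[-k:]              # indices of the top-k experts
    g = np.exp(logits[top] - logits[top].max())
    g = g / g.sum()                            # softmax renormalised over top-k
    outputs = np.stack([experts[i] @ x for i in top])
    return g @ outputs                         # gated combination of k outputs

# Hypothetical example: 4 experts, 3-dimensional features, top-2 routing.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
experts = [rng.normal(size=(3, 3)) for _ in range(4)]
gate = rng.normal(size=(4, 3))
y = top_k_sparse_moe(x, experts, gate, k=2)
```

With `k` much smaller than the number of experts, compute scales with `k` rather than the full expert pool, which is why sparse routing suits energy-constrained aerial platforms.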
arXiv Detail & Related papers (2025-12-01T14:53:29Z) - A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles [74.8162337823142]
MM-UAV is the first large-scale benchmark for Multi-Modal UAV Tracking. The dataset spans over 30 challenging scenarios, with 1,321 synchronised multi-modal sequences and more than 2.8 million annotated frames. Accompanying the dataset, we provide a novel multi-modal multi-UAV tracking framework.
arXiv Detail & Related papers (2025-11-23T08:42:17Z) - Multimodal Trajectory Representation Learning for Travel Time Estimation [15.25848441558445]
This paper introduces the Multimodal Dynamic Trajectory Integration framework. It integrates GPS sequences, grid trajectories, and road network constraints to enhance travel time estimation (TTE) accuracy. It consistently outperforms state-of-the-art baselines in experiments on three real-world datasets.
arXiv Detail & Related papers (2025-10-07T12:04:16Z) - MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings [75.0617088717528]
MoCa is a framework for transforming pre-trained VLM backbones into effective bidirectional embedding models. MoCa consistently improves performance across the MMEB and ViDoRe-v2 benchmarks, achieving new state-of-the-art results.
arXiv Detail & Related papers (2025-06-29T06:41:00Z) - Learning to Fuse: Modality-Aware Adaptive Scheduling for Robust Multimodal Foundation Models [0.0]
Modality-Aware Adaptive Fusion Scheduling (MA-AFS) learns to dynamically modulate the contribution of each modality on a per-instance basis. Our work highlights the importance of adaptive fusion and opens a promising direction toward reliable and uncertainty-aware multimodal learning.
arXiv Detail & Related papers (2025-06-15T05:57:45Z) - Asymmetric Reinforcing against Multi-modal Representation Bias [59.685072206359855]
We propose an Asymmetric Reinforcing method against Multimodal representation bias (ARM). ARM dynamically reinforces the weak modalities while maintaining the ability to represent dominant modalities through conditional mutual information. Our approach significantly improves multimodal learning performance, making notable progress in mitigating imbalanced multimodal learning.
arXiv Detail & Related papers (2025-01-02T13:00:06Z) - Improving Autonomous Separation Assurance through Distributed
Reinforcement Learning with Attention Networks [0.0]
We present a reinforcement learning framework to provide autonomous self-separation capabilities within advanced air mobility (AAM) corridors.
The problem is formulated as a Markov Decision Process and solved by developing a novel extension to the sample-efficient, off-policy soft actor-critic (SAC) algorithm.
A comprehensive numerical study shows that the proposed framework can ensure safe and efficient separation of aircraft in high-density, dynamic environments.
arXiv Detail & Related papers (2023-08-09T13:44:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.