6G WavesFM: A Foundation Model for Sensing, Communication, and Localization
- URL: http://arxiv.org/abs/2504.14100v1
- Date: Fri, 18 Apr 2025 22:51:35 GMT
- Title: 6G WavesFM: A Foundation Model for Sensing, Communication, and Localization
- Authors: Ahmed Aboulfotouh, Elsayed Mohammed, Hatem Abou-Zeid
- Abstract summary: This paper introduces a novel Wireless Foundation Model (WFM) framework, capable of supporting a wide array of communication, sensing, and localization tasks. Our proposed architecture combines a shared Vision Transformer (ViT) backbone with task-specific multi-layer perceptron heads and incorporates Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. We show that our unified WFM can support diverse tasks and deliver significant gains in both performance and efficiency.
- Score: 6.70088826174291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces WavesFM, a novel Wireless Foundation Model (WFM) framework, capable of supporting a wide array of communication, sensing, and localization tasks. Our proposed architecture combines a shared Vision Transformer (ViT) backbone with task-specific multi-layer perceptron (MLP) heads and incorporates Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. This design promotes full parameter sharing across tasks, significantly reducing the computational and memory footprint without sacrificing performance. The model processes both image-like wireless modalities, such as spectrograms and channel state information (CSI), and in-phase and quadrature (IQ) signals arranged as orthogonal frequency-division multiplexing (OFDM) resource grids. We demonstrate the strong generalization capabilities of WavesFM through extensive experiments on four downstream tasks: Fifth Generation New Radio (5G NR) positioning; multiple-input multiple-output OFDM (MIMO-OFDM) channel estimation; human activity sensing; and radio-frequency (RF) signal classification. Compared to supervised baselines trained individually, our approach achieves superior performance while sharing 80% of its parameters across tasks. Furthermore, we show that pretraining on domain-relevant data not only boosts performance but also accelerates convergence, reducing training time by up to 5x. These results demonstrate that our unified WFM can support diverse tasks and deliver significant gains in both performance and efficiency, highlighting the transformative potential of foundation models to drive AI-native paradigms in future sixth-generation (6G) networks.
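The architecture described in the abstract (a shared ViT backbone, task-specific MLP heads, and LoRA adapters for parameter-efficient fine-tuning) follows a pattern that can be sketched compactly. The PyTorch snippet below is a minimal illustration of that pattern only; the layer sizes, patch size, LoRA placement, and task names are assumptions chosen for readability, not the paper's actual configuration.

```python
# Minimal sketch of a shared-backbone wireless foundation model in the spirit of
# WavesFM: a ViT-style encoder shared across tasks, a LoRA adapter, and small
# task-specific MLP heads. All dimensions and task names are illustrative.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (W + B @ A)."""

    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)   # base weights stay frozen
        self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)


class ViTBackbone(nn.Module):
    """Patchify an image-like wireless input (spectrogram, CSI, OFDM grid) and
    encode it with a small Transformer."""

    def __init__(self, in_ch=2, img_size=64, patch=8, dim=256, depth=4, heads=8):
        super().__init__()
        self.patch_embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        num_patches = (img_size // patch) ** 2
        self.pos = nn.Parameter(torch.zeros(1, num_patches, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        # LoRA adapter on the pooled representation (illustrative placement only;
        # the paper adapts the backbone itself).
        self.adapter = LoRALinear(dim, dim)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2) + self.pos
        return self.adapter(self.encoder(tokens).mean(dim=1))  # pooled feature


class WirelessFM(nn.Module):
    def __init__(self, tasks):
        super().__init__()
        self.backbone = ViTBackbone()
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(256, 128), nn.GELU(), nn.Linear(128, out_dim))
            for name, out_dim in tasks.items()
        })

    def forward(self, x, task):
        return self.heads[task](self.backbone(x))


# Hypothetical task set: 3-D position regression, activity classes, RF signal classes.
model = WirelessFM({"positioning": 3, "activity": 6, "rf_class": 10})
iq_grid = torch.randn(4, 2, 64, 64)   # batch of IQ data arranged as an OFDM-like grid
print(model(iq_grid, task="positioning").shape)  # -> torch.Size([4, 3])
```

Because the backbone weights are frozen and shared, only the low-rank matrices and the small task heads differ per task, which is the general mechanism that lets most parameters be reused across tasks.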
Related papers
- GDM4MMIMO: Generative Diffusion Models for Massive MIMO Communications [61.56610953012228]
The generative diffusion model (GDM) is one of the state-of-the-art families of generative models. GDM demonstrates an exceptional capability to learn implicit prior knowledge and robust generalization. A case study shows GDM's promising potential for facilitating efficient ultra-dimensional channel state information acquisition.
arXiv Detail & Related papers (2024-12-24T08:42:01Z)
- RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction [42.596399621642234]
A radio map (RM) is a promising technology that can provide pathloss based only on location.
In this paper, a sampling-free RM construction is modeled as a conditional generative problem, where a denoising diffusion-based method, named RadioDiff, is proposed to achieve high-quality RM construction.
Experimental results show that the proposed RadioDiff achieves state-of-the-art performance in all three metrics of accuracy, structural similarity, and peak signal-to-noise ratio.
arXiv Detail & Related papers (2024-08-16T08:02:00Z)
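The RadioDiff entry above casts radio-map construction as conditional denoising diffusion. As background, the sketch below shows a textbook conditional DDPM sampling loop; it is generic diffusion machinery with placeholder names (`eps_model`, a linear beta schedule), not RadioDiff's actual network or schedule.

```python
# Generic conditional denoising-diffusion (DDPM) reverse process, sketched to show
# the kind of sampling used for conditional generation such as radio-map construction.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)


@torch.no_grad()
def ddpm_sample(eps_model, cond, shape):
    """Draw a sample x_0 given conditioning info (e.g., an environment/location grid)."""
    x = torch.randn(shape)                               # start from pure noise x_T
    for t in reversed(range(T)):
        eps = eps_model(x, torch.tensor([t]), cond)      # predicted noise at step t
        coef = (1 - alphas[t]) / torch.sqrt(1 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```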
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances in self-supervised learning (SSL) to pre-train strong multimodal encoders.
We propose a different perspective on the problem and investigate improving multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
The Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find that MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Multi-Flow Transmission in Wireless Interference Networks: A Convergent Graph Learning Approach [9.852567834643292]
We introduce a novel algorithm called Dual-stage Interference-Aware Multi-flow Optimization of Network Data-signals (DIAMOND).
A centralized stage computes the multi-flow transmission strategy using a novel design of a graph neural network (GNN) reinforcement learning (RL) routing agent.
Then, a distributed stage improves the performance based on a novel design of distributed learning updates.
arXiv Detail & Related papers (2023-03-27T18:49:47Z)
- On Neural Architectures for Deep Learning-based Source Separation of Co-Channel OFDM Signals [104.11663769306566]
We study the single-channel source separation problem involving orthogonal frequency-division multiplexing (OFDM) signals.
We propose critical domain-informed modifications to the network parameterization, based on insights from OFDM structures.
arXiv Detail & Related papers (2023-03-11T16:29:13Z)
- Over-the-Air Federated Multi-Task Learning via Model Sparsification and Turbo Compressed Sensing [48.19771515107681]
We propose an over-the-air FMTL framework, where multiple learning tasks deployed on edge devices share a non-orthogonal fading channel under the coordination of an edge server.
In OA-FMTL, the local updates of edge devices are sparsified, compressed, and then sent over the uplink channel in a superimposed fashion.
We analyze the performance of the proposed OA-FMTL framework together with the M-Turbo-CS algorithm.
arXiv Detail & Related papers (2022-05-08T08:03:52Z)
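The OA-FMTL entry above relies on sparsifying and compressing local updates and letting the uplink channel superimpose them. The NumPy sketch below illustrates only that generic over-the-air aggregation idea; the top-k rule, Gaussian compression matrix, fading model, and sizes are illustrative assumptions, and the Turbo-CS recovery step is omitted.

```python
# Illustrative sketch of over-the-air aggregation: each device sparsifies its update,
# projects it with a shared sensing matrix, and the channel superimposes the transmissions.
import numpy as np

rng = np.random.default_rng(0)
d, m, K = 1000, 200, 8            # update size, measurement size, number of devices


def sparsify_topk(update, k=50):
    """Keep the k largest-magnitude entries; zero the rest."""
    out = np.zeros_like(update)
    idx = np.argsort(np.abs(update))[-k:]
    out[idx] = update[idx]
    return out


A = rng.standard_normal((m, d)) / np.sqrt(m)      # shared compression matrix
updates = [rng.standard_normal(d) for _ in range(K)]
channel = rng.standard_normal(K) * 0.1 + 1.0      # per-device fading coefficients (toy model)

# The receiver observes the sum of faded, compressed transmissions plus noise.
y = sum(h * (A @ sparsify_topk(u)) for h, u in zip(channel, updates))
y += 0.01 * rng.standard_normal(m)

# The edge server would then run a compressed-sensing style recovery on y to estimate
# the aggregated sparse update; that step is omitted here.
print(y.shape)   # (200,)
```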
- Multi-task Learning Approach for Modulation and Wireless Signal Classification for 5G and Beyond: Edge Deployment via Model Compression [1.218340575383456]
Future communication networks must address spectrum scarcity to accommodate the growth of heterogeneous wireless devices.
We exploit the potential of a deep neural network-based multi-task learning framework to simultaneously learn modulation and signal classification tasks.
We provide a comprehensive heterogeneous wireless signals dataset for public use.
arXiv Detail & Related papers (2022-02-26T14:51:02Z)
- Learning OFDM Waveforms with PAPR and ACLR Constraints [15.423422040627331]
We propose a learning-based method to design OFDM-based waveforms that satisfy selected constraints while maximizing an achievable information rate.
We show that the end-to-end system is able to satisfy target PAPR and ACLR constraints and allows significant throughput gains.
arXiv Detail & Related papers (2021-10-21T08:58:59Z)
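The waveform-learning entry above constrains the peak-to-average power ratio (PAPR). For reference, the snippet below computes the PAPR of a single OFDM symbol in the standard way; the subcarrier count, QPSK mapping, and oversampling factor are arbitrary illustrative choices.

```python
# Compute the PAPR of one OFDM symbol: map QPSK symbols onto subcarriers, take an
# oversampled IFFT, and compare peak power to average power.
import numpy as np

rng = np.random.default_rng(0)
num_subcarriers, oversample = 64, 4

# Random QPSK symbols on the active subcarriers
qpsk = (rng.choice([-1, 1], num_subcarriers)
        + 1j * rng.choice([-1, 1], num_subcarriers)) / np.sqrt(2)

# Zero-pad in the frequency domain to oversample the time-domain waveform
freq = np.zeros(num_subcarriers * oversample, dtype=complex)
freq[:num_subcarriers // 2] = qpsk[:num_subcarriers // 2]
freq[-(num_subcarriers // 2):] = qpsk[num_subcarriers // 2:]
x = np.fft.ifft(freq)

power = np.abs(x) ** 2
papr_db = 10 * np.log10(power.max() / power.mean())
print(f"PAPR of one OFDM symbol: {papr_db:.2f} dB")
```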
- Multi-task Learning Approach for Automatic Modulation and Wireless Signal Classification [1.827510863075184]
We exploit the potential of deep neural networks in conjunction with a multi-task learning (MTL) framework to simultaneously learn modulation and signal classification tasks.
We release the only known open heterogeneous wireless signals dataset that comprises radar and communication signals with multiple labels.
arXiv Detail & Related papers (2021-01-25T17:43:42Z)
- Modality Compensation Network: Cross-Modal Adaptation for Action Recognition [77.24983234113957]
We propose a Modality Compensation Network (MCN) to explore the relationships of different modalities.
Our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning.
Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
arXiv Detail & Related papers (2020-01-31T04:51:55Z)