Communication-Efficient Multi-Modal Edge Inference via Uncertainty-Aware Distributed Learning
- URL: http://arxiv.org/abs/2601.14942v1
- Date: Wed, 21 Jan 2026 12:38:02 GMT
- Title: Communication-Efficient Multi-Modal Edge Inference via Uncertainty-Aware Distributed Learning
- Authors: Hang Zhao, Hongru Li, Dongfang Xu, Shenghui Song, Khaled B. Letaief
- Abstract summary: We propose a three-stage communication-aware distributed learning framework to improve training and inference efficiency. In Stage I, devices perform local multi-modal self-supervised learning to obtain shared and modality-specific encoders without device--server exchange. In Stage II, distributed fine-tuning with centralized evidential fusion calibrates per-modality uncertainty and reliably aggregates features distorted by noise or channel fading. In Stage III, an uncertainty-guided feedback mechanism selectively requests additional features for uncertain samples, optimizing the communication--accuracy tradeoff in the distributed setting.
- Score: 60.650628083185616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic communication is emerging as a key enabler for distributed edge intelligence due to its capability to convey task-relevant meaning. However, achieving communication-efficient training and robust inference over wireless links remains challenging. This challenge is further exacerbated for multi-modal edge inference (MMEI) by two factors: 1) prohibitive communication overhead for distributed learning over bandwidth-limited wireless links, due to the \emph{multi-modal} nature of the system; and 2) limited robustness under varying channels and noisy multi-modal inputs. In this paper, we propose a three-stage communication-aware distributed learning framework to improve training and inference efficiency while maintaining robustness over wireless channels. In Stage~I, devices perform local multi-modal self-supervised learning to obtain shared and modality-specific encoders without device--server exchange, thereby reducing the communication cost. In Stage~II, distributed fine-tuning with centralized evidential fusion calibrates per-modality uncertainty and reliably aggregates features distorted by noise or channel fading. In Stage~III, an uncertainty-guided feedback mechanism selectively requests additional features for uncertain samples, optimizing the communication--accuracy tradeoff in the distributed setting. Experiments on RGB--depth indoor scene classification show that the proposed framework attains higher accuracy with far fewer training communication rounds and remains robust to modality degradation or channel variation, outperforming existing self-supervised and fully supervised baselines.
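The Stage II/III mechanisms described above can be sketched in code. The following is a minimal, illustrative implementation of Dirichlet-based evidential fusion of two modalities plus an uncertainty-triggered feedback rule; the specific fusion rule (a reduced Dempster combination over Dirichlet "opinions"), all function names, and the class count are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch: evidential fusion of two modalities' evidence vectors,
# followed by a Stage-III-style uncertainty check. Assumed, not taken
# from the paper: the reduced Dempster rule, K=4 classes, threshold.
import numpy as np

K = 4  # number of classes (assumed)

def opinion(evidence):
    """Map non-negative per-class evidence to (belief, uncertainty)."""
    alpha = evidence + 1.0          # Dirichlet parameters
    S = alpha.sum()                 # Dirichlet strength
    return evidence / S, K / S      # belief masses b_k, uncertainty u

def fuse(e1, e2):
    """Combine two modalities' evidence with a reduced Dempster rule."""
    b1, u1 = opinion(e1)
    b2, u2 = opinion(e2)
    # Conflict mass: belief assigned to different classes by the two views.
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)
    norm = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / norm
    u = (u1 * u2) / norm            # fused uncertainty shrinks when views agree
    S = K / u                       # recover fused Dirichlet strength
    return b * S, u                 # fused evidence and uncertainty

def needs_feedback(u, threshold=0.5):
    """Stage-III style rule: request extra features if too uncertain."""
    return u > threshold

# Confident RGB evidence, weak (noisy-channel) depth evidence:
e_rgb = np.array([9.0, 0.5, 0.2, 0.3])
e_depth = np.array([0.4, 0.3, 0.3, 0.2])
e_fused, u = fuse(e_rgb, e_depth)
print(int(np.argmax(e_fused)), needs_feedback(u))
```

Because the confident RGB modality dominates the fusion, the combined uncertainty drops below the threshold and no feedback round is requested; a sample with two weak modalities would instead trigger a request for additional features.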
Related papers
- Send Less, Perceive More: Masked Quantized Point Cloud Communication for Loss-Tolerant Collaborative Perception [38.10779821259225]
We introduce QPoint2Comm, a quantized point-cloud communication framework that dramatically reduces bandwidth. QPoint2Comm directly communicates quantized point-cloud indices using a shared codebook. We employ a masked training strategy that simulates random packet loss, allowing the model to maintain strong performance even under severe transmission failures.
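The codebook-index idea can be illustrated with a short sketch: quantize each point feature to its nearest entry in a shared codebook, transmit only the indices, and simulate packet loss with a random mask. The codebook size, feature dimension, and loss model here are illustrative assumptions, not QPoint2Comm's actual design.

```python
# Hedged sketch of shared-codebook index communication with simulated
# packet loss, loosely following the QPoint2Comm idea. Codebook shape
# (256 x 8) and the loss model are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 8))        # shared K x d codebook

def encode(points):
    """Quantize each d-dim feature to its nearest codebook index."""
    d2 = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1).astype(np.uint8)   # 1 byte per point

def transmit(indices, loss_rate=0.3):
    """Simulate random packet loss: dropped entries become -1."""
    mask = rng.random(indices.shape) < loss_rate
    out = indices.astype(np.int16)
    out[mask] = -1
    return out

def decode(received):
    """Reconstruct the surviving points from the shared codebook."""
    kept = received[received >= 0]
    return codebook[kept]
```

Transmitting one byte per point instead of a float vector is where the bandwidth saving comes from; the masked-training strategy would feed such randomly dropped index streams to the downstream perception model during training.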
arXiv Detail & Related papers (2026-02-25T08:00:48Z) - Event-Triggered Gossip for Distributed Learning [61.70659996356528]
We develop a new event-triggered gossip framework for distributed learning to reduce inter-node communication. The framework reduces communication by 71.61% with only a marginal performance loss, compared with conventional state-of-the-art distributed learning methods.
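The core of an event-triggered scheme is a local test that decides when a node's model has changed enough to be worth re-broadcasting. The trigger below (L2 drift against a fixed threshold) is an illustrative assumption, not this paper's exact rule.

```python
# Hedged sketch of one event-triggered gossip step: a node re-sends its
# parameters only when they have drifted enough from the last copy it
# transmitted. The L2 trigger and threshold are illustrative assumptions.
import numpy as np

def maybe_gossip(x_local, x_last_sent, threshold=0.1):
    """Return (message_or_None, updated_last_sent_copy)."""
    drift = np.linalg.norm(x_local - x_last_sent)
    if drift > threshold:                    # event fires: communicate
        return x_local.copy(), x_local.copy()
    return None, x_last_sent                 # event silent: save bandwidth
```

Neighbors that receive `None` simply reuse the sender's last transmitted model, which is how such schemes trade a bounded staleness for large communication savings.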
arXiv Detail & Related papers (2026-02-22T10:13:43Z) - Robust and Efficient Communication in Multi-Agent Reinforcement Learning [18.405707681765453]
Multi-agent reinforcement learning (MARL) has made significant strides in enabling coordinated behaviors among autonomous agents. Most existing approaches assume that communication is instantaneous, reliable, and has unlimited bandwidth; these conditions are rarely met in real-world deployments. This survey systematically reviews recent advances in robust and efficient communication strategies for MARL under realistic constraints.
arXiv Detail & Related papers (2025-11-14T15:23:11Z) - RIS-aided Latent Space Alignment for Semantic Channel Equalization [10.555901476981923]
We introduce a new paradigm in wireless communications, focusing on transmitting the intended meaning rather than ensuring strict bit-level accuracy. These systems often rely on Deep Neural Networks (DNNs) to learn and encode meaning directly from data, enabling more efficient communication. In this work, we propose a joint physical and semantic channel equalization framework that leverages the presence of Reconfigurable Intelligent Surfaces (RIS). We show that the proposed joint equalization strategies consistently outperform conventional, disjoint approaches to physical and semantic channel equalization across a broad range of scenarios and wireless channel conditions.
arXiv Detail & Related papers (2025-07-22T10:51:35Z) - Distributionally Robust Wireless Semantic Communication with Large AI Models [111.47794569742206]
Current SemCom systems fail to generalize across diverse noise conditions, adversarial attacks, and out-of-distribution data. Wasserstein distributionally robust optimization is employed to provide resilience against semantic misinterpretation and channel perturbations. Experimental results on image and text transmission demonstrate that WaSeCom achieves improved robustness under noise and adversarial perturbations.
arXiv Detail & Related papers (2025-05-28T04:03:57Z) - Multi-Modal Self-Supervised Semantic Communication [52.76990720898666]
We propose a multi-modal semantic communication system that leverages multi-modal self-supervised learning to enhance task-agnostic feature extraction. The proposed approach effectively captures both modality-invariant and modality-specific features while minimizing training-related communication overhead. The findings underscore the advantages of multi-modal self-supervised learning in semantic communication, paving the way for more efficient and scalable edge inference systems.
arXiv Detail & Related papers (2025-03-18T06:13:02Z) - Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework.
To counter the challenges posed by time-varying multipath fading, our approach incorporates the use of orthogonal frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies.
We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeat request (HARQ).
arXiv Detail & Related papers (2024-08-29T08:53:26Z) - Latent Diffusion Model-Enabled Low-Latency Semantic Communication in the Presence of Semantic Ambiguities and Wireless Channel Noises [18.539501941328393]
This paper develops a latent diffusion model-enabled SemCom system to handle outliers in source data. A lightweight single-layer latent space transformation adapter completes one-shot learning at the transmitter. An end-to-end consistency distillation strategy is used to distill the diffusion models trained in latent space.
arXiv Detail & Related papers (2024-06-09T23:39:31Z) - Low-Latency Federated Learning over Wireless Channels with Differential Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.