Real-Time Inference for Distributed Multimodal Systems under Communication Delay Uncertainty
- URL: http://arxiv.org/abs/2511.16225v1
- Date: Thu, 20 Nov 2025 10:48:54 GMT
- Title: Real-Time Inference for Distributed Multimodal Systems under Communication Delay Uncertainty
- Authors: Victor Croisfelt, João Henrique Inacio de Souza, Shashi Raj Pandey, Beatriz Soret, Petar Popovski,
- Abstract summary: Connected cyber-physical systems perform inference based on real-time inputs from multiple data streams.<n>We propose a novel neuro-inspired non-blocking inference paradigm that employs adaptive temporal windows of integration.<n>Our framework achieves robust real-time inference with finer-grained control over the accuracy-latency tradeoff.
- Score: 37.15356899831919
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Connected cyber-physical systems perform inference based on real-time inputs from multiple data streams. Uncertain communication delays across data streams challenge the temporal flow of the inference process. State-of-the-art (SotA) non-blocking inference methods rely on a reference-modality paradigm, requiring one modality input to be fully received before processing, while depending on costly offline profiling. We propose a novel, neuro-inspired non-blocking inference paradigm that primarily employs adaptive temporal windows of integration (TWIs) to dynamically adjust to stochastic delay patterns across heterogeneous streams while relaxing the reference-modality requirement. Our communication-delay-aware framework achieves robust real-time inference with finer-grained control over the accuracy-latency tradeoff. Experiments on the audio-visual event localization (AVEL) task demonstrate superior adaptability to network dynamics compared to SotA approaches.
Related papers
- Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA [61.12136997430116]
Decentralized federated learning (DFL) based on low-rank adaptation (LoRA) enables mobile devices with multi-task datasets to collaboratively fine-tune a large language model (LLM) by exchanging locally updated parameters with a subset of neighboring devices via wireless connections for knowledge integration.<n> directly aggregating parameters fine-tuned on heterogeneous datasets induces three primary issues across the DFL life-cycle: (i) catastrophic knowledge forgetting during fine-tuning process, arising from conflicting update directions caused by data heterogeneity; (ii) textitinefficient communication and convergence during model aggregation process,
arXiv Detail & Related papers (2026-02-24T02:45:32Z) - Communication-Efficient Multi-Modal Edge Inference via Uncertainty-Aware Distributed Learning [60.650628083185616]
We propose a three-stage communication-aware distributed learning framework to improve training and inference efficiency.<n>In StageI, devices perform local multi-modal self-supervised learning to obtain shared and modality-specific encoders without device--server exchange.<n>StageII, distributed fine-tuning with centralized evidential fusion calibrates per-modality uncertainty and reliably aggregates features distorted by noise or channel fading.<n>StageIII, an uncertainty-guided feedback mechanism selectively requests additional features for uncertain samples, optimizing the communication--accuracy tradeoff in the distributed setting.
arXiv Detail & Related papers (2026-01-21T12:38:02Z) - FeedbackSTS-Det: Sparse Frames-Based Spatio-Temporal Semantic Feedback Network for Infrared Small Target Detection [7.648318265124807]
Infrared small detection target (ISTD) under complex backgrounds remains a challenging task.<n>Existing methods still struggle with inefficient long-range dependency modeling.<n>We propose a novel scheme for ISTD detection through a sparse semantic-temporal feedback network.
arXiv Detail & Related papers (2026-01-21T06:06:36Z) - SG-RIFE: Semantic-Guided Real-Time Intermediate Flow Estimation with Diffusion-Competitive Perceptual Quality [0.0]
Real-time Video Frame Interpolation (VFI) has long been dominated by flow-based methods like RIFE.<n>Recent diffusion-based approaches achieve state-of-the-art perceptual quality but suffer from prohibitive latency, rendering them impractical for real-time applications.<n>We propose Semantic-Guided RIFE (SG-RIFE), which augments a pre-trained RIFE backbone with semantic priors from a frozen DINOv3 Vision Transformer.
arXiv Detail & Related papers (2025-12-20T06:50:55Z) - UniDiff: A Unified Diffusion Framework for Multimodal Time Series Forecasting [90.47915032778366]
We propose UniDiff, a unified diffusion framework for multimodal time series forecasting.<n>At its core lies a unified and parallel fusion module, where a single cross-attention mechanism integrates structural information from timestamps and semantic context from texts.<n>Experiments on real-world benchmark datasets across eight domains demonstrate that the proposed UniDiff model achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-12-08T05:36:14Z) - MMEdge: Accelerating On-device Multimodal Inference via Pipelined Sensing and Encoding [1.6572113577265137]
We propose MMEdge, a new on-device multi-modal inference framework based on pipelined sensing and encoding.<n>Instead of waiting for complete sensor inputs, MMEdge decomposes the entire inference process into a sequence of fine-grained sensing and encoding units.<n> MMEdge significantly reduces end-to-end latency while maintaining high task accuracy across various system and data dynamics.
arXiv Detail & Related papers (2025-10-29T09:41:03Z) - CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency.<n>We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z) - Learning Unified System Representations for Microservice Tail Latency Prediction [8.532290784939967]
Microservice architectures have become the de facto standard for building scalable cloud-native applications.<n>Traditional approaches often rely on per-request latency metrics, which are highly sensitive to transient noise.<n>We propose USRFNet, a deep learning network that explicitly separates and models traffic-side and resource-side features.
arXiv Detail & Related papers (2025-08-03T07:46:23Z) - Latent Diffusion Model Based Denoising Receiver for 6G Semantic Communication: From Stochastic Differential Theory to Application [11.385703484113552]
We propose a novel semantic communication framework empowered by generative artificial intelligence (GAI)<n>A latent diffusion model (LDM)-based semantic communication framework is proposed that combines a variational autoencoder for semantic features extraction.<n>The proposed system is a training-free framework that supports zero-shot generalization, and achieves superior performance under low-SNR and out-of-distribution conditions.
arXiv Detail & Related papers (2025-06-06T03:20:32Z) - AdaFlow: Opportunistic Inference on Asynchronous Mobile Data with Generalized Affinity Control [16.944584145880793]
AdaFlow pioneers the formulation of structured cross-modality affinity in mobile contexts using a hierarchical analysis-based normalized matrix.
AdaFlow significantly reduces inference latency by up to 79.9% and enhances accuracy by up to 61.9%, outperforming status quo approaches.
arXiv Detail & Related papers (2024-10-31T15:28:22Z) - Asymmetric Diffusion Based Channel-Adaptive Secure Wireless Semantic
Communications [5.539381022630274]
We propose a secure semantic communication system, DiffuSeC, which leverages the diffusion model and deep reinforcement learning (DRL)
With the diffusing module in the sender end and the asymmetric denoising module in the receiver end, the DiffuSeC mitigates the perturbations added by semantic attacks.
To further improve the robustness under unstable channel conditions caused by semantic attacks, we developed a DRL-based channel-adaptive diffusion step selection scheme.
arXiv Detail & Related papers (2023-10-30T11:00:47Z) - MTD: Multi-Timestep Detector for Delayed Streaming Perception [0.5439020425819]
Streaming perception is a task of reporting the current state of the world, which is used to evaluate the delay and accuracy of autonomous driving systems.
This paper propose the Multi- Timestep Detector (MTD), an end-to-end detector which uses dynamic routing for multi-branch future prediction.
The proposed method has been evaluated on the Argoverse-HD dataset, and the experimental results show that it has achieved state-of-the-art performance across various delay settings.
arXiv Detail & Related papers (2023-09-13T06:23:58Z) - Age of Semantics in Cooperative Communications: To Expedite Simulation
Towards Real via Offline Reinforcement Learning [53.18060442931179]
We propose the age of semantics (AoS) for measuring semantics freshness of status updates in a cooperative relay communication system.
We derive an online deep actor-critic (DAC) learning scheme under the on-policy temporal difference learning framework.
We then put forward a novel offline DAC scheme, which estimates the optimal control policy from a previously collected dataset.
arXiv Detail & Related papers (2022-09-19T11:55:28Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.