Related papers: Orchestrating Multimodal DNN Workloads in Wireless Neural Processing

URL: http://arxiv.org/abs/2603.02109v1
Date: Mon, 02 Mar 2026 17:25:43 GMT
Title: Orchestrating Multimodal DNN Workloads in Wireless Neural Processing
Authors: Sai Xu, Kai-Kit Wong, Yanan Du, Hyundong Shin
Abstract summary: In edge inference, wireless resource allocation and accelerator-level deep neural network (DNN) scheduling have yet to be co-optimized in an end-to-end manner. This paper investigates a paradigm that integrates wireless transmission and multi-core execution into a unified end-to-end pipeline.
Score: 57.510786937781866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In edge inference, wireless resource allocation and accelerator-level deep neural network (DNN) scheduling have yet to be co-optimized in an end-to-end manner. The lack of coordination between wireless transmission and accelerator-level DNN execution prevents efficient overlap, leading to higher end-to-end inference latency. To address this issue, this paper investigates multimodal DNN workload orchestration in wireless neural processing (WNP), a paradigm that integrates wireless transmission and multi-core accelerator execution into a unified end-to-end pipeline. First, we develop a unified communication-computation model for multimodal DNN execution and formulate the corresponding optimization problem. Second, we propose O-WiN, a framework that orchestrates DNN workloads in WNP through two tightly coupled stages: simulation-based optimization and runtime execution. Third, we develop two algorithms, RTFS and PACS. RTFS schedules communication and computation sequentially, whereas PACS interleaves them to enable pipeline parallelism by overlapping wireless data transfer with accelerator-level DNN execution. Simulation results demonstrate that PACS significantly outperforms RTFS under high modality heterogeneity by better masking wireless latency through communication-computation overlap, thereby highlighting the effectiveness of communication-computation pipelining in accelerating multimodal DNN execution in WNP.
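The contrast the abstract draws between sequential scheduling and communication-computation overlap can be illustrated with a toy latency model. The sketch below is not RTFS or PACS, and it is not taken from the paper: it assumes each modality is a (transfer time, compute time) pair, that the wireless link and the accelerator each handle one modality at a time, and that modalities are processed in list order. The function names and the example numbers are hypothetical.

```python
from typing import List, Tuple

# Each job is (transfer_time, compute_time) for one modality; values are made up.
Job = Tuple[float, float]


def sequential_latency(jobs: List[Job]) -> float:
    """No overlap: every transfer and every computation runs back to back."""
    return sum(tx + cp for tx, cp in jobs)


def pipelined_latency(jobs: List[Job]) -> float:
    """Two-stage pipeline: the next transfer overlaps the current computation."""
    tx_done = 0.0  # time at which the link finishes the latest transfer
    cp_done = 0.0  # time at which the accelerator finishes the latest job
    for tx, cp in jobs:
        tx_done += tx  # the link is a single serial resource
        # Compute starts once the data has arrived and the accelerator is free.
        cp_done = max(cp_done, tx_done) + cp
    return cp_done


if __name__ == "__main__":
    jobs = [(4.0, 2.0), (1.0, 3.0), (2.0, 5.0)]
    print("sequential:", sequential_latency(jobs))  # 17.0
    print("pipelined: ", pipelined_latency(jobs))   # 14.0
```

In this toy model the pipelined schedule hides every transfer except the first behind computation, which is the kind of latency masking the abstract attributes to overlap. How the paper's algorithms order modalities and allocate wireless resources is not modeled here.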

Related papers

Joint Optimization of Model Partitioning and Resource Allocation for Anti-Jamming Collaborative Inference Systems [52.842088497389746]
This letter focuses on an anti-jamming collaborative inference system in the presence of a malicious jammer. We first analyze the effects of jamming and DNN partitioning on inference accuracy via data regression. We propose an efficient alternating optimization-based algorithm, which decomposes the problem into three subproblems.
arXiv Detail & Related papers (2026-03-03T03:52:52Z)
Resource Allocation in Hybrid Radio-Optical IoT Networks using GNN with Multi-task Learning [11.833896722352568]
This paper addresses the problem of dual-technology scheduling in hybrid Internet of Things (IoT) networks that integrate Optical Wireless Communication (OWC) and Radio Frequency (RF). We propose a supervised multi-task learning architecture combining a two-stage Graph Embedding with Transformer (DGET) framework. The proposed framework achieves near-optimal scheduling with over 90% classification accuracy, reduces computational complexity, and demonstrates higher robustness under partial channel observability.
arXiv Detail & Related papers (2025-10-29T15:02:28Z)
Intra-DP: A High Performance Collaborative Inference System for Mobile Edge Computing [67.98609858326951]
Intra-DP is a high-performance collaborative inference system optimized for deep neural networks (DNNs) on mobile devices. The evaluation demonstrates that Intra-DP reduces per-inference latency by up to 50% and energy consumption by up to 75% compared to state-of-the-art baselines.
arXiv Detail & Related papers (2025-07-08T09:50:57Z)
Communication-Efficient Federated Learning by Quantized Variance Reduction for Heterogeneous Wireless Edge Networks [55.467288506826755]
Federated learning (FL) has been recognized as a viable solution for local-privacy-aware collaborative model training in wireless edge networks. Most existing communication-efficient FL algorithms fail to reduce the significant inter-device variance. We propose a novel communication-efficient FL algorithm, named FedQVR, which relies on a sophisticated variance-reduced scheme.
arXiv Detail & Related papers (2025-01-20T04:26:21Z)
Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
Neuromorphic computing uses spiking neural networks (SNNs) to perform inference tasks. Embedding a small payload within each spike exchanged between spiking neurons can enhance inference accuracy without increasing energy consumption. Split computing, where an SNN is partitioned across two devices, is a promising solution. This paper presents the first comprehensive study of a neuromorphic wireless split computing architecture that employs multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
Trustworthy DNN Partition for Blockchain-enabled Digital Twin in Wireless IIoT Networks [32.42557641803365]
Digital twin (DT) has emerged as a promising solution to enhance manufacturing efficiency in industrial Internet of Things (IIoT) networks. We propose a blockchain-enabled DT (B-DT) framework that employs deep neural network (DNN) partitioning technique and reputation-based consensus mechanism.
arXiv Detail & Related papers (2024-05-28T07:34:12Z)
Accurate and Efficient Event-based Semantic Segmentation Using Adaptive Spiking Encoder-Decoder Network [20.05283214295881]
Spiking neural networks (SNNs) are emerging as promising solutions for processing dynamic, asynchronous signals from event-based sensors. We develop an efficient spiking encoder-decoder network (SpikingEDN) for large-scale event-based semantic segmentation tasks. We harness the adaptive threshold which improves network accuracy, sparsity and robustness in streaming inference.
arXiv Detail & Related papers (2023-04-24T07:12:50Z)
Multi-Flow Transmission in Wireless Interference Networks: A Convergent Graph Learning Approach [9.852567834643292]
We introduce a novel algorithm called Dual-stage Interference-Aware Multi-flow Optimization of Network Data-signals (DIAMOND). A centralized stage computes the multi-flow transmission strategy using a novel design of graph neural network (GNN) reinforcement learning (RL) routing agent. Then, a distributed stage improves the performance based on a novel design of distributed learning updates.
arXiv Detail & Related papers (2023-03-27T18:49:47Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter models and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations. Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments, such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.