Dynamic Split Computing for Efficient Deep Edge Intelligence
- URL: http://arxiv.org/abs/2205.11269v1
- Date: Mon, 23 May 2022 12:35:18 GMT
- Title: Dynamic Split Computing for Efficient Deep Edge Intelligence
- Authors: Arian Bakhtiarnia, Nemanja Milošević, Qi Zhang, Dragana Bajović, Alexandros Iosifidis
- Abstract summary: We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
- Score: 78.4233915447056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying deep neural networks (DNNs) on IoT and mobile devices is a
challenging task due to their limited computational resources. Thus, demanding
tasks are often offloaded entirely to edge servers, which can accelerate
inference but incurs communication costs and raises privacy concerns. In
addition, this approach leaves the computational capacity of end
devices unused. Split computing is a paradigm where a DNN is split into two
sections; the first section is executed on the end device, and the output is
transmitted to the edge server where the final section is executed. Here, we
introduce dynamic split computing, where the optimal split location is
dynamically selected based on the state of the communication channel. By using
natural bottlenecks that already exist in modern DNN architectures, dynamic
split computing avoids retraining and hyperparameter optimization, and does not
have any negative impact on the final accuracy of DNNs. Through extensive
experiments, we show that dynamic split computing achieves faster inference in
edge computing environments where the data rate and server load vary over time.
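The latency trade-off behind selecting a split point based on the channel state can be sketched as follows. This is a minimal illustrative model, not the paper's implementation: the helper names, layer statistics, and throughput numbers are all assumptions. The idea is that each candidate split (a natural bottleneck in the DNN) implies some on-device compute, some transmitted intermediate data, and some server-side compute, and the best split shifts as the data rate changes.

```python
# Hypothetical sketch of dynamic split-point selection.
# All names and numbers are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class SplitPoint:
    name: str
    device_flops: float   # compute executed on the end device (FLOPs)
    server_flops: float   # compute remaining on the edge server (FLOPs)
    output_bits: float    # size of the intermediate tensor to transmit

def estimate_latency(sp, device_flops_per_s, server_flops_per_s, data_rate_bps):
    """End-to-end latency = device compute + transmission + server compute."""
    return (sp.device_flops / device_flops_per_s
            + sp.output_bits / data_rate_bps
            + sp.server_flops / server_flops_per_s)

def select_split(splits, device_flops_per_s, server_flops_per_s, data_rate_bps):
    """Dynamically pick the split with the lowest estimated latency."""
    return min(splits, key=lambda sp: estimate_latency(
        sp, device_flops_per_s, server_flops_per_s, data_rate_bps))

# Example: two natural bottlenecks plus the "offload everything" option.
splits = [
    SplitPoint("input (full offload)", 0.0, 4e9, 1.2e6),
    SplitPoint("block2 bottleneck", 1e9, 3e9, 2e5),
    SplitPoint("block4 bottleneck", 2.5e9, 1.5e9, 5e4),
]

# Fast channel: transmitting the raw input is cheap, so full offload wins.
fast = select_split(splits, device_flops_per_s=5e9,
                    server_flops_per_s=1e11, data_rate_bps=1e8)
# Slow channel: a deeper split trades device compute for fewer transmitted bits.
slow = select_split(splits, device_flops_per_s=5e9,
                    server_flops_per_s=1e11, data_rate_bps=1e5)
```

Because the split points reuse bottlenecks already present in the architecture, switching between them at runtime requires no retraining; only the latency estimate is recomputed as the channel state changes.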
Related papers
- I-SplitEE: Image classification in Split Computing DNNs with Early Exits [5.402030962296633]
The large size of Deep Neural Networks (DNNs) hinders deploying them on resource-constrained devices such as edge, mobile, and IoT platforms.
Our work presents an innovative unified approach merging early exits and split computing.
I-SplitEE is an online unsupervised algorithm ideal for scenarios lacking ground truths and with sequential data.
arXiv Detail & Related papers (2024-01-19T07:44:32Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate Compression for Split DNN Computing [5.3221129103999125]
Split computing has recently emerged as a paradigm for implementing DNN-based AI workloads.
We present an approach that addresses the challenge of optimizing the rate-accuracy-complexity trade-off.
Our approach is remarkably lightweight during both training and inference, highly effective, and achieves excellent rate-distortion performance.
arXiv Detail & Related papers (2022-08-24T15:02:11Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing approach that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete actions (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
With a latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC services.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Dynamic DNN Decomposition for Lossless Synergistic Inference [0.9549013615433989]
Deep neural networks (DNNs) sustain high performance in today's data processing applications.
We propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss.
D3 outperforms state-of-the-art counterparts by up to 3.4 times in end-to-end DNN inference time and reduces backbone network communication overhead by up to 3.68 times.
arXiv Detail & Related papers (2021-01-15T03:18:53Z)
- Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks [8.291242737118482]
We focus on edge computing supporting remote object detection by means of Deep Neural Networks (DNNs).
We develop a framework to reduce the amount of data transmitted over the wireless link.
The proposed technique represents an effective intermediate option between local and edge computing in a parameter region.
arXiv Detail & Related papers (2020-07-31T03:11:46Z)
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in the design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.