Dynamic Split Computing for Efficient Deep Edge Intelligence
- URL: http://arxiv.org/abs/2205.11269v1
- Date: Mon, 23 May 2022 12:35:18 GMT
- Title: Dynamic Split Computing for Efficient Deep Edge Intelligence
- Authors: Arian Bakhtiarnia, Nemanja Milošević, Qi Zhang, Dragana Bajović, Alexandros Iosifidis
- Abstract summary: We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
- Score: 78.4233915447056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying deep neural networks (DNNs) on IoT and mobile devices is a
challenging task due to their limited computational resources. Thus, demanding
tasks are often offloaded entirely to edge servers, which can accelerate
inference but incurs communication costs and raises privacy concerns. In
addition, this approach leaves the computational capacity of end
devices unused. Split computing is a paradigm where a DNN is split into two
sections; the first section is executed on the end device, and the output is
transmitted to the edge server where the final section is executed. Here, we
introduce dynamic split computing, where the optimal split location is
dynamically selected based on the state of the communication channel. By using
natural bottlenecks that already exist in modern DNN architectures, dynamic
split computing avoids retraining and hyperparameter optimization, and does not
have any negative impact on the final accuracy of DNNs. Through extensive
experiments, we show that dynamic split computing achieves faster inference in
edge computing environments where the data rate and server load vary over time.
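The latency trade-off behind selecting a split point based on the channel state can be sketched as follows. This is a minimal illustrative model, not the paper's implementation: the helper names, layer statistics, and throughput numbers are all assumptions. The idea is that each candidate split (a natural bottleneck in the DNN) implies some on-device compute, some transmitted intermediate data, and some server-side compute, and the best split shifts as the data rate changes.

```python
# Hypothetical sketch of dynamic split-point selection.
# All names and numbers are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class SplitPoint:
    name: str
    device_flops: float   # compute executed on the end device (FLOPs)
    server_flops: float   # compute remaining on the edge server (FLOPs)
    output_bits: float    # size of the intermediate tensor to transmit

def estimate_latency(sp, device_flops_per_s, server_flops_per_s, data_rate_bps):
    """End-to-end latency = device compute + transmission + server compute."""
    return (sp.device_flops / device_flops_per_s
            + sp.output_bits / data_rate_bps
            + sp.server_flops / server_flops_per_s)

def select_split(splits, device_flops_per_s, server_flops_per_s, data_rate_bps):
    """Dynamically pick the split with the lowest estimated latency."""
    return min(splits, key=lambda sp: estimate_latency(
        sp, device_flops_per_s, server_flops_per_s, data_rate_bps))

# Example: two natural bottlenecks plus the "offload everything" option.
splits = [
    SplitPoint("input (full offload)", 0.0, 4e9, 1.2e6),
    SplitPoint("block2 bottleneck", 1e9, 3e9, 2e5),
    SplitPoint("block4 bottleneck", 2.5e9, 1.5e9, 5e4),
]

# Fast channel: transmitting the raw input is cheap, so full offload wins.
fast = select_split(splits, device_flops_per_s=5e9,
                    server_flops_per_s=1e11, data_rate_bps=1e8)
# Slow channel: a deeper split trades device compute for fewer transmitted bits.
slow = select_split(splits, device_flops_per_s=5e9,
                    server_flops_per_s=1e11, data_rate_bps=1e5)
```

Because the split points reuse bottlenecks already present in the architecture, switching between them at runtime requires no retraining; only the latency estimate is recomputed as the channel state changes.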
Related papers
- I-SplitEE: Image classification in Split Computing DNNs with Early Exits [5.402030962296633]
The large size of Deep Neural Networks (DNNs) hinders deploying them on resource-constrained devices such as edge, mobile, and IoT platforms.
Our work presents an innovative unified approach merging early exits and split computing.
I-SplitEE is an online unsupervised algorithm ideal for scenarios lacking ground truths and with sequential data.
arXiv Detail & Related papers (2024-01-19T07:44:32Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate Compression for Split DNN Computing [5.3221129103999125]
Split computing has recently emerged as a paradigm for implementing DNN-based AI workloads.
We present an approach that addresses the challenge of optimizing the rate-accuracy-complexity trade-off.
Our approach is remarkably lightweight during both training and inference, highly effective, and achieves excellent rate-distortion performance.
arXiv Detail & Related papers (2022-08-24T15:02:11Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing approach that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete actions (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
With a latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC services.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Dynamic DNN Decomposition for Lossless Synergistic Inference [0.9549013615433989]
Deep neural networks (DNNs) sustain high performance in today's data processing applications.
We propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss.
D3 outperforms state-of-the-art counterparts by up to 3.4 times in end-to-end DNN inference time and reduces backbone network communication overhead by up to 3.68 times.
arXiv Detail & Related papers (2021-01-15T03:18:53Z)
- Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks [8.291242737118482]
We focus on edge computing supporting remote object detection by means of Deep Neural Networks (DNNs).
We develop a framework to reduce the amount of data transmitted over the wireless link.
The proposed technique represents an effective intermediate option between local and edge computing in a parameter region.
arXiv Detail & Related papers (2020-07-31T03:11:46Z)
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in the design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.