Related papers: Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges

Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges

URL: http://arxiv.org/abs/2103.04505v1
Date: Mon, 8 Mar 2021 01:47:20 GMT
Title: Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
Authors: Yoshitomo Matsubara, Marco Levorato, Francesco Restuccia
Abstract summary: We provide a comprehensive survey of the state of the art in split computing (SC) and early exiting (EE) strategies. Recent approaches have been proposed, where the deep neural network is split into a head and a tail model, executed respectively on the mobile device and on the edge device. EE trains models to present multiple "exits" earlier in the architecture, each providing increasingly higher target accuracy.
Score: 18.103754866476088
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mobile devices such as smartphones and autonomous vehicles increasingly rely on deep neural networks (DNNs) to execute complex inference tasks such as image classification and speech recognition, among others. However, continuously executing the entire DNN on the mobile device can quickly deplete its battery. Although task offloading to edge devices may decrease the mobile device's computational burden, erratic patterns in channel quality, network and edge server load can lead to a significant delay in task execution. Recently, approaches based on split computing (SC) have been proposed, where the DNN is split into a head and a tail model, executed respectively on the mobile device and on the edge device. Ultimately, this may reduce bandwidth usage as well as energy consumption. Another approach, called early exiting (EE), trains models to present multiple "exits" earlier in the architecture, each providing increasingly higher target accuracy. Therefore, the trade-off between accuracy and delay can be tuned according to the current conditions or application demands. In this paper, we provide a comprehensive survey of the state of the art in SC and EE strategies, by presenting a comparison of the most relevant approaches. We conclude the paper by providing a set of compelling research challenges.

Related papers

Enhancing Split Computing and Early Exit Applications through Predefined Sparsity [12.293736644405937]
Deep Neural Networks (DNNs) achieved state-of-the-art performance in a broad range of problems, spanning from object classification and action recognition to smart building and healthcare. The flexibility that makes DNNs such a pervasive technology comes at a price: the computational requirements preclude their deployment on resource-constrained edge devices. This paper introduces a novel approach to address this challenge by combining the concept of predefined sparsity with Split Computing (SC) and Early Exit (EE) Experimental results showcase reductions exceeding 4x in storage and computational complexity without compromising performance.
arXiv Detail & Related papers (2024-07-16T14:24:04Z)
AccEPT: An Acceleration Scheme for Speeding Up Edge Pipeline-parallel Training [22.107070114339038]
We propose AccEPT, an acceleration scheme for accelerating the edge collaborative pipeline-parallel training. In particular, we propose a light-weight adaptive latency predictor to accurately estimate the latency of each layer at different devices. Our numerical results demonstrate that our proposed acceleration approach is able to significantly speed up edge pipeline parallel training up to 3 times faster.
arXiv Detail & Related papers (2023-11-10T02:18:33Z)
Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems [12.427821850039448]
We propose a novel split computing approach based on slimmable ensemble encoders. The key advantage of our design is the ability to adapt computational load and transmitted data size in real-time with minimal overhead and time. Our model outperforms existing solutions in terms of compression efficacy and execution time, especially in the context of weak mobile devices.
arXiv Detail & Related papers (2023-06-22T06:33:12Z)
Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"smart ecosystems" are being formed where sensing happens concurrently rather than standalone. This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge. We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the emphexit point, emphexit point, and emphcompressing bits by soft policy iterations. Based on the latency and accuracy aware reward design, such an computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks. In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework. In particular, the proposed ME-FEEL can achieve an accuracy gain up to 32.7% in the industrial IoT networks with the severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC) We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes. Our proposed solution results in minimal energy consumption and $CO$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
HAPI: Hardware-Aware Progressive Inference [18.214367595727037]
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. This work presents HAPI, a novel methodology for generating high-performance early-exit networks.
arXiv Detail & Related papers (2020-08-10T09:55:18Z)
Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing [71.86955275376604]
We propose an adaptive anomaly detection approach for hierarchical edge computing (HEC) systems to solve this problem. We design an adaptive scheme to select one of the models based on the contextual information extracted from input data, to perform anomaly detection. We evaluate our proposed approach using a real IoT dataset, and demonstrate that it reduces detection delay by 84% while maintaining almost the same accuracy as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-01-10T05:29:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.