SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural
Networks in Mobile Edge Environments
- URL: http://arxiv.org/abs/2205.10635v1
- Date: Sat, 21 May 2022 16:24:47 GMT
- Title: SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural
Networks in Mobile Edge Environments
- Authors: Shreshth Tuli and Giuliano Casale and Nicholas R. Jennings
- Abstract summary: This work proposes an AI-driven online policy, SplitPlace, that uses Multi-Armed-Bandits to intelligently decide between layer and semantic splitting strategies.
SplitPlace places such neural network split fragments on mobile edge devices using decision-aware reinforcement learning.
Our experiments show that SplitPlace can significantly improve the state-of-the-art in terms of average response time, deadline violation rate, inference accuracy, and total reward by up to 46%, 69%, 3% and 12%, respectively.
- Score: 13.864161788250856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, deep learning models have become ubiquitous in industry and
academia alike. Deep neural networks can solve some of the most complex
pattern-recognition problems today, but come with the price of massive compute
and memory requirements. This makes the problem of deploying such large-scale
neural networks challenging in resource-constrained mobile edge computing
platforms, specifically in mission-critical domains like surveillance and
healthcare. To solve this, a promising solution is to split resource-hungry
neural networks into lightweight disjoint smaller components for pipelined
distributed processing. At present, there are two main approaches to do this:
semantic and layer-wise splitting. The former partitions a neural network into
parallel disjoint models that produce a part of the result, whereas the latter
partitions into sequential models that produce intermediate results. However,
there is no intelligent algorithm that decides which splitting strategy to use
and places such modular splits onto edge nodes for optimal performance. To
address this, this work proposes a novel AI-driven online policy, SplitPlace, that uses
Multi-Armed-Bandits to intelligently decide between layer and semantic
splitting strategies based on the input task's service deadline demands.
SplitPlace places such neural network split fragments on mobile edge devices
using decision-aware reinforcement learning for efficient and scalable
computing. Moreover, SplitPlace fine-tunes its placement engine to adapt to
volatile environments. Our experiments on physical mobile-edge environments
with real-world workloads show that SplitPlace can significantly improve the
state-of-the-art in terms of average response time, deadline violation rate,
inference accuracy, and total reward by up to 46%, 69%, 3% and 12%,
respectively.
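SplitPlace's core decision, per the abstract, is a Multi-Armed Bandit choice between layer-wise and semantic splitting driven by each task's deadline. The following is a minimal sketch of that idea using a UCB1-style bandit with per-deadline-class arms; the reward shape, deadline classes, and synthetic feedback loop are illustrative assumptions, not the authors' implementation.

```python
import math
import random
from collections import defaultdict


class SplitStrategyBandit:
    """UCB1-style bandit choosing a splitting strategy per deadline class.

    Arms (illustrative): 'layer'    - sequential fragments, higher accuracy,
                                      higher end-to-end latency;
                         'semantic' - parallel fragments, lower latency,
                                      some accuracy loss.
    Rewards are assumed to combine deadline compliance and accuracy in [0, 1].
    """

    ARMS = ("layer", "semantic")

    def __init__(self, exploration=1.0):
        self.exploration = exploration
        self.counts = defaultdict(lambda: {a: 0 for a in self.ARMS})
        self.values = defaultdict(lambda: {a: 0.0 for a in self.ARMS})

    def select(self, deadline_class):
        counts = self.counts[deadline_class]
        values = self.values[deadline_class]
        # Play each arm once before applying the UCB rule.
        for arm in self.ARMS:
            if counts[arm] == 0:
                return arm
        total = sum(counts.values())

        def ucb(arm):
            bonus = math.sqrt(2.0 * math.log(total) / counts[arm])
            return values[arm] + self.exploration * bonus

        return max(self.ARMS, key=ucb)

    def update(self, deadline_class, arm, reward):
        # Incremental mean of the observed rewards for this (class, arm) pair.
        self.counts[deadline_class][arm] += 1
        n = self.counts[deadline_class][arm]
        self.values[deadline_class][arm] += (
            reward - self.values[deadline_class][arm]) / n


# Usage sketch with synthetic feedback: tight deadlines tend to favour
# semantic splits, relaxed deadlines tend to favour layer splits.
bandit = SplitStrategyBandit()
for _ in range(200):
    deadline_class = random.choice(["tight", "relaxed"])
    arm = bandit.select(deadline_class)
    met_deadline = (arm == "semantic") if deadline_class == "tight" else True
    accuracy_bonus = 0.1 if arm == "layer" else 0.0
    reward = (0.9 if met_deadline else 0.3) + accuracy_bonus
    bandit.update(deadline_class, arm, min(reward, 1.0))

print({c: bandit.values[c] for c in ("tight", "relaxed")})
```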
Related papers
- I-SplitEE: Image classification in Split Computing DNNs with Early Exits [5.402030962296633]
The large size of Deep Neural Networks (DNNs) hinders deploying them on resource-constrained devices like edge, mobile, and IoT platforms.
Our work presents an innovative unified approach merging early exits and split computing.
I-SplitEE is an online unsupervised algorithm, ideal for scenarios with sequential data and no ground-truth labels.
arXiv Detail & Related papers (2024-01-19T07:44:32Z)
- Accelerating Split Federated Learning over Wireless Communication Networks [17.97006656280742]
We consider a split federated learning (SFL) framework that combines the parallel model training mechanism of federated learning (FL) and the model splitting structure of split learning (SL).
We formulate a joint problem of split point selection and bandwidth allocation to minimize the system latency.
Experiment results demonstrate the superiority of our work in latency reduction and accuracy improvement.
arXiv Detail & Related papers (2023-10-24T07:49:56Z)
- Split-Et-Impera: A Framework for the Design of Distributed Deep Learning Applications [8.434224141580758]
Split-Et-Impera determines the set of the best split points of a neural network based on deep network interpretability principles.
It performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements.
It suggests the best match between the quality-of-service requirements of the application and the performance in terms of accuracy and latency.
arXiv Detail & Related papers (2023-03-22T13:00:00Z)
- I-SPLIT: Deep Network Interpretability for Split Computing [11.652957867167098]
This work makes a substantial step in the field of split computing, i.e., how to split a deep neural network to host its early part on an embedded device and the rest on a server.
We show that not only does the architecture of the layers matter, but so does the importance of the neurons contained therein.
arXiv Detail & Related papers (2022-09-23T14:26:56Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing scheme that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time (see the toy latency sketch after this list).
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC services.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting [70.62923754433461]
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem.
We propose a novel method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller subproblems that often have analytical solutions.
arXiv Detail & Related papers (2021-06-16T20:43:49Z)
- Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
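As a companion to the Dynamic Split Computing entry above, the following toy sketch shows how a split point could be chosen from channel state: estimate device compute, activation transfer, and server compute for each candidate split, then pick the minimum. The per-layer profile, latency model, and function names are invented for illustration and are not taken from any of the papers listed.

```python
# Illustrative only: toy latency model for choosing a split layer given the
# current uplink data rate. Per-layer compute times and activation sizes are
# made-up numbers, not measurements from any of the papers above.

# (device_time_s, server_time_s, activation_bytes) per candidate split point
LAYER_PROFILE = [
    (0.005, 0.0010, 1_200_000),  # split after layer 1
    (0.012, 0.0008, 600_000),    # split after layer 2
    (0.030, 0.0005, 150_000),    # split after layer 3
    (0.060, 0.0002, 40_000),     # split after layer 4 (almost all on device)
]


def best_split(uplink_bps: float, server_load: float = 1.0) -> int:
    """Return the split index minimising estimated end-to-end latency.

    server_load scales server-side compute time (> 1.0 means a busier server).
    """
    def latency(profile):
        device_t, server_t, act_bytes = profile
        transfer_t = 8 * act_bytes / uplink_bps  # bits over current uplink
        return device_t + transfer_t + server_load * server_t

    return min(range(len(LAYER_PROFILE)),
               key=lambda i: latency(LAYER_PROFILE[i]))


# Fast uplink (1 Gbps): shipping a large early activation is cheap, so the
# minimiser tends to pick an early split and lean on the server.
print(best_split(uplink_bps=1e9))
# Slow uplink (2 Mbps): transfer dominates, so the minimiser pushes more
# layers onto the device and splits late.
print(best_split(uplink_bps=2e6))
```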
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.