A Survey on Deep Neural Network Partition over Cloud, Edge and End
Devices
- URL: http://arxiv.org/abs/2304.10020v1
- Date: Thu, 20 Apr 2023 00:17:27 GMT
- Title: A Survey on Deep Neural Network Partition over Cloud, Edge and End
Devices
- Authors: Di Xu, Xiang He, Tonghua Su, Zhongjie Wang
- Abstract summary: Deep neural network (DNN) partition is a research problem that involves splitting a DNN into multiple parts and offloading them to specific locations.
This paper provides a comprehensive survey on the recent advances and challenges in DNN partition approaches over the cloud, edge, and end devices.
- Score: 6.248548718574856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural network (DNN) partition is a research problem that involves
splitting a DNN into multiple parts and offloading them to specific locations.
Because of the recent advancement in multi-access edge computing and edge
intelligence, DNN partition has been considered as a powerful tool for
improving DNN inference performance when the computing resources of edge and
end devices are limited and the remote transmission of data from these devices
to clouds is costly. This paper provides a comprehensive survey on the recent
advances and challenges in DNN partition approaches over the cloud, edge, and
end devices based on a detailed literature collection. We review how DNN
partition works in various application scenarios, and provide a unified
mathematical model of the DNN partition problem. We develop a
five-dimensional classification framework for DNN partition approaches,
consisting of deployment locations, partition granularity, partition
constraints, optimization objectives, and optimization algorithms. Each
existing DNN partition approach can be precisely defined in this framework by
instantiating each dimension into specific values. In addition, we suggest a
set of metrics for comparing and evaluating the DNN partition approaches. Based
on this, we identify and discuss research challenges that have not yet been
investigated or fully addressed. We hope that this work helps DNN partition
researchers by highlighting significant future research directions in this
domain.
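The five-dimensional classification framework described in the abstract can be pictured as a record whose fields are the five dimensions. The following is a minimal, hypothetical Python sketch; the class, enum, and field names are illustrative assumptions, not the survey's own notation:

```python
from dataclasses import dataclass
from enum import Enum

class Location(Enum):
    CLOUD = "cloud"
    EDGE = "edge"
    END_DEVICE = "end_device"

class Granularity(Enum):
    LAYER = "layer"          # split between whole layers
    SUB_LAYER = "sub_layer"  # split within a layer (e.g., by channels)

@dataclass
class PartitionApproach:
    """One DNN partition approach, instantiated along five dimensions."""
    deployment_locations: set[Location]
    partition_granularity: Granularity
    partition_constraints: list[str]     # e.g., memory, bandwidth, privacy
    optimization_objectives: list[str]   # e.g., latency, energy
    optimization_algorithm: str          # e.g., dynamic programming, RL

# Example instantiation: a cloud-edge, layer-wise split minimizing latency
example_approach = PartitionApproach(
    deployment_locations={Location.EDGE, Location.CLOUD},
    partition_granularity=Granularity.LAYER,
    partition_constraints=["bandwidth"],
    optimization_objectives=["latency"],
    optimization_algorithm="per-layer profiling + exhaustive search",
)
print(example_approach.partition_granularity.value)
```

Instantiating each field with specific values, as in `example_approach`, mirrors how the survey positions an existing approach inside the framework.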
Related papers
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing [8.915849482780631]
In split inference, a deep neural network (DNN) is partitioned to run the early part of the DNN at the edge and the later part of the DNN in the cloud.
This meets two key requirements for on-device machine learning: input privacy and computation efficiency.
We introduce Salted DNNs: a novel approach that enables clients at the edge, who run the early part of the DNN, to control the semantic interpretation of the DNN's outputs at inference time.
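The split-inference pattern that this line of work builds on can be sketched in a few lines: the device runs the early layers and only the intermediate activation crosses the network. The toy "layers" below are plain arithmetic functions standing in for real DNN layers; this is purely illustrative and not code from the paper:

```python
# Minimal split-inference sketch (illustrative; no real DNN framework).
# A split index decides which layers run on the device and which in the cloud.

def make_layers():
    # Toy stand-ins for DNN layers: each maps a number to a number.
    return [lambda x: x * 2, lambda x: x + 3, lambda x: x * x, lambda x: x - 1]

def run_on_device(layers, split, x):
    for layer in layers[:split]:
        x = layer(x)
    return x  # intermediate activation: the only value sent over the network

def run_in_cloud(layers, split, activation):
    for layer in layers[split:]:
        activation = layer(activation)
    return activation

layers = make_layers()
split = 2  # first two layers on the device, the rest in the cloud
intermediate = run_on_device(layers, split, 5)
result = run_in_cloud(layers, split, intermediate)

# Sanity check: split execution matches running the whole pipeline in one place
full = 5
for layer in layers:
    full = layer(full)
assert result == full
print(result)
```

The input privacy property comes from the fact that only `intermediate`, not the raw input, ever leaves the device.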
arXiv Detail & Related papers (2023-10-20T09:53:55Z)
- Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload on edge devices.
We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for the sparse multi-DNN scheduling.
Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
arXiv Detail & Related papers (2023-10-17T09:25:17Z)
- Number Systems for Deep Neural Network Architectures: A Survey [1.4260605984981944]
Deep neural networks (DNNs) have become an enabling component for a myriad of artificial intelligence applications.
DNNs have shown sometimes superior performance, even compared to humans, in cases such as self-driving, health applications, etc.
arXiv Detail & Related papers (2023-07-11T06:19:25Z)
- The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing scheme that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- Dynamic DNN Decomposition for Lossless Synergistic Inference [0.9549013615433989]
Deep neural networks (DNNs) sustain high performance in today's data processing applications.
We propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss.
D3 outperforms the state-of-the-art counterparts up to 3.4 times in end-to-end DNN inference time and reduces backbone network communication overhead up to 3.68 times.
arXiv Detail & Related papers (2021-01-15T03:18:53Z)
- SyReNN: A Tool for Analyzing Deep Neural Networks [8.55884254206878]
Deep Neural Networks (DNNs) are rapidly gaining popularity in a variety of important domains.
This paper introduces SyReNN, a tool for understanding and analyzing a DNN by computing its symbolic representation.
arXiv Detail & Related papers (2021-01-09T00:27:23Z)
- Scission: Performance-driven and Context-aware Cloud-Edge Distribution of Deep Neural Networks [1.2949520455740093]
This paper presents Scission, a tool for automated benchmarking of deep neural networks (DNNs) across a set of target device, edge, and cloud resources.
The decision-making approach is context-aware by capitalizing on hardware capabilities of the target resources.
The benchmarking overheads of Scission allow for responding to operational changes periodically rather than in real-time.
arXiv Detail & Related papers (2020-08-08T13:39:57Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.