Scission: Performance-driven and Context-aware Cloud-Edge Distribution
of Deep Neural Networks
- URL: http://arxiv.org/abs/2008.03523v2
- Date: Wed, 16 Dec 2020 19:45:55 GMT
- Title: Scission: Performance-driven and Context-aware Cloud-Edge Distribution
of Deep Neural Networks
- Authors: Luke Lockhart and Paul Harvey and Pierre Imai and Peter Willis and
Blesson Varghese
- Abstract summary: This paper presents Scission, a tool for automated benchmarking of deep neural networks (DNNs) on a set of target device, edge and cloud resources.
The decision-making approach is context-aware by capitalizing on hardware capabilities of the target resources.
The benchmarking overheads of Scission allow for responding to operational changes periodically rather than in real-time.
- Score: 1.2949520455740093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Partitioning and distributing deep neural networks (DNNs) across end-devices,
edge resources and the cloud has a potential twofold advantage: preserving
privacy of the input data, and reducing the ingress bandwidth demand beyond the
edge. However, for a given DNN, identifying the optimal partition configuration
for distributing the DNN that maximizes performance is a significant challenge.
This is because the combination of potential target hardware resources that
maximizes performance and the sequence of layers of the DNN that should be
distributed across the target resources needs to be determined, while
accounting for user-defined objectives/constraints for partitioning. This paper
presents Scission, a tool for automated benchmarking of DNNs on a given set of
target device, edge and cloud resources for determining optimal partitions that
maximize DNN performance. The decision-making approach is context-aware by
capitalizing on hardware capabilities of the target resources, their locality,
the characteristics of DNN layers, and the network condition. Experimental
studies are carried out on 18 DNNs. The decisions made by Scission cannot be
manually made by a human given the complexity and the number of dimensions
affecting the search space. The benchmarking overheads of Scission allow for
responding to operational changes periodically rather than in real-time.
Scission is available for public download at
https://github.com/qub-blesson/Scission.
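The core decision Scission automates can be illustrated with a simplified model: given per-layer benchmark times on a device and on a cloud resource, plus the size of each layer's output tensor and the uplink bandwidth, pick the split index that minimizes end-to-end latency. This is a minimal sketch, not the tool's actual implementation: the function name, the single device/cloud pair, and all numbers are invented, and real deployments involve more tiers, constraints, and dimensions.

```python
# Hypothetical sketch of latency-driven DNN split-point selection.
# Layers [0, split) run on the device; layers [split, n) run in the
# cloud; the tensor crossing the boundary is sent over the uplink.

def best_split(device_ms, cloud_ms, out_bytes, uplink_bps):
    """Return (split_index, latency_ms) minimizing end-to-end latency.

    device_ms, cloud_ms: per-layer execution times (ms) on each resource.
    out_bytes: out_bytes[i] is the size of the tensor entering layer i
               (out_bytes[0] is the raw input), so len == n_layers + 1.
    uplink_bps: measured uplink bandwidth in bits per second.
    """
    n = len(device_ms)
    assert len(cloud_ms) == n and len(out_bytes) == n + 1
    best = (None, float("inf"))
    for split in range(n + 1):
        transfer_ms = out_bytes[split] * 8 / uplink_bps * 1000
        latency = sum(device_ms[:split]) + transfer_ms + sum(cloud_ms[split:])
        if latency < best[1]:
            best = (split, latency)
    return best

# Example (illustrative numbers): a 3-layer model over a 1 Mbps uplink.
# Layer 1 produces a large activation, so sending the raw input wins.
split, latency = best_split([10.0, 10.0, 10.0], [1.0, 1.0, 1.0],
                            [1000, 100000, 100, 100], 1_000_000)
# → (0, 11.0)
```

Even this toy search is exhaustive over n+1 candidate splits; with many target resources, user constraints, and 18 DNNs of differing depth, the real search space grows to the point the abstract notes a human cannot navigate manually.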
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z) - Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML [4.2019872499238256]
We propose a novel strategy for deploying Deep Neural Networks on microcontrollers (TinyML) based on Multi-Objective Bayesian optimization (MOBOpt)
Our methodology aims at efficiently finding tradeoffs between a DNN's predictive accuracy, memory consumption on a given target system, and computational complexity.
arXiv Detail & Related papers (2023-05-23T14:31:52Z) - A Survey on Deep Neural Network Partition over Cloud, Edge and End
Devices [6.248548718574856]
Deep neural network (DNN) partition is a research problem that involves splitting a DNN into multiple parts and offloading them to specific locations.
This paper provides a comprehensive survey on the recent advances and challenges in DNN partition approaches over the cloud, edge, and end devices.
arXiv Detail & Related papers (2023-04-20T00:17:27Z) - DeepAxe: A Framework for Exploration of Approximation and Reliability
Trade-offs in DNN Accelerators [0.9556128246747769]
The role of Deep Neural Networks (DNNs) in safety-critical applications is expanding.
The computational demands of DNNs are growing rapidly, raising the need to improve the reliability of DNN accelerators.
arXiv Detail & Related papers (2023-03-14T20:42:38Z) - Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
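The dynamic variant of this idea can be sketched by re-selecting the split point at run time from the currently measured channel rate, rather than fixing it offline. This is an illustrative toy (invented function name, per-layer timings, and tensor sizes), not the paper's method: it only shows why the optimal split shifts toward the device as the data rate drops.

```python
# Hypothetical illustration of dynamic split computing: the split point
# is recomputed from the current channel rate before each inference.

def pick_split(device_ms, cloud_ms, out_bytes, rate_bps):
    """Return the split index minimizing end-to-end latency (ms) for
    the current channel rate. out_bytes[s] is the size of the tensor
    crossing the boundary at split s (out_bytes[0] = raw input)."""
    n = len(device_ms)

    def latency(s):
        return (sum(device_ms[:s])
                + out_bytes[s] * 8 / rate_bps * 1000
                + sum(cloud_ms[s:]))

    return min(range(n + 1), key=latency)

# Made-up 4-layer model whose activations shrink with depth.
dev = [5.0, 5.0, 5.0, 5.0]
cloud = [1.0, 1.0, 1.0, 1.0]
sizes = [4000, 2000, 1000, 500, 16]

fast_link = pick_split(dev, cloud, sizes, 1e8)   # 100 Mbps: offload early
slow_link = pick_split(dev, cloud, sizes, 1e5)   # 100 kbps: compute locally
```

With a fast link the cheapest plan ships the raw input to the faster cloud resource; as the rate drops, the best split moves to the end of the network so that only a tiny activation crosses the link.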
arXiv Detail & Related papers (2022-05-23T12:35:18Z) - Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for
Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z) - Dynamic DNN Decomposition for Lossless Synergistic Inference [0.9549013615433989]
Deep neural networks (DNNs) sustain high performance in today's data processing applications.
We propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss.
D3 outperforms the state-of-the-art counterparts up to 3.4 times in end-to-end DNN inference time and reduces backbone network communication overhead up to 3.68 times.
arXiv Detail & Related papers (2021-01-15T03:18:53Z) - A Case For Adaptive Deep Neural Networks in Edge Computing [1.683310745678261]
This paper investigates whether there is a case for adaptive Deep Neural Networks (DNNs) in edge computing.
The results show that network conditions affect DNN performance more than CPU or memory related operational conditions.
arXiv Detail & Related papers (2020-08-04T20:23:50Z) - Resource Allocation via Graph Neural Networks in Free Space Optical
Fronthaul Networks [119.81868223344173]
This paper investigates the optimal resource allocation in free space optical (FSO) fronthaul networks.
We consider the graph neural network (GNN) for the policy parameterization to exploit the FSO network structure.
The primal-dual learning algorithm is developed to train the GNN in a model-free manner, where the knowledge of system models is not required.
arXiv Detail & Related papers (2020-06-26T14:20:48Z) - PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with
Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.