Auto-Split: A General Framework of Collaborative Edge-Cloud AI
- URL: http://arxiv.org/abs/2108.13041v1
- Date: Mon, 30 Aug 2021 08:03:29 GMT
- Title: Auto-Split: A General Framework of Collaborative Edge-Cloud AI
- Authors: Amin Banitalebi-Dehkordi, Naveen Vedula, Jian Pei, Fei Xia, Lanjun
Wang, Yong Zhang
- Abstract summary: This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
- Score: 49.750972428032355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many industry scale applications, large and resource consuming machine
learning models reside in powerful cloud servers. At the same time, large
amounts of input data are collected at the edge of the cloud. The inference results
are also communicated to users or passed to downstream tasks at the edge. The
edge often consists of a large number of low-power devices. It is a big
challenge to design industry products to support sophisticated deep model
deployment and conduct model inference in an efficient manner so that the model
accuracy remains high and the end-to-end latency is kept low. This paper
describes the techniques and engineering practice behind Auto-Split, an
edge-cloud collaborative prototype of Huawei Cloud. This patented technology is
already validated on selected applications, is on its way for broader
systematic edge-cloud application integration, and is being made available for
public use as an automated pipeline service for end-to-end cloud-edge
collaborative intelligence deployment. To the best of our knowledge, there is
no existing industry product that provides the capability of Deep Neural
Network (DNN) splitting.
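
The split-execution pattern behind Auto-Split can be illustrated with a short sketch. The snippet below is a minimal illustration of running the first part of a DNN on the edge and the rest on the cloud; it is not the Auto-Split algorithm itself, which (per the abstract) automates end-to-end deployment and must choose the partition so that accuracy stays high and latency stays low. The model, split index, and uplink bandwidth are hypothetical placeholders.

```python
# Minimal split-inference sketch (illustration only, not the Auto-Split algorithm).
# Assumptions: a purely sequential PyTorch model, a hypothetical split index,
# and a hypothetical uplink bandwidth between the edge device and the cloud.
import torch
import torch.nn as nn

model = nn.Sequential(                       # stand-in for a real DNN
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

split_idx = 4                                # hypothetical split point
edge_part = model[:split_idx]                # layers executed on the edge device
cloud_part = model[split_idx:]               # layers executed on the cloud server

x = torch.randn(1, 3, 224, 224)              # input collected at the edge

with torch.no_grad():
    z = edge_part(x)                         # edge computes the first layers
    y = cloud_part(z)                        # cloud finishes the inference

# Size of the intermediate tensor that has to cross the network,
# and a rough transfer-time estimate at the assumed bandwidth.
payload_bytes = z.numel() * z.element_size()
uplink_mbps = 10.0
transfer_ms = payload_bytes * 8 / (uplink_mbps * 1e6) * 1e3

print(f"payload: {payload_bytes} B, ~{transfer_ms:.0f} ms at {uplink_mbps} Mbps, "
      f"output shape: {tuple(y.shape)}")
```

The payload estimate makes the key trade-off visible: a split that leaves more work on the edge usually shrinks the tensor that must cross the network, at the cost of more computation on the low-power device.
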
Related papers
- Integrating Homomorphic Encryption and Trusted Execution Technology for Autonomous and Confidential Model Refining in Cloud [4.21388107490327]
Homomorphic encryption and trusted execution environment technology can protect confidentiality for autonomous computation.
We propose to integrate these two techniques in the design of the model refining scheme.
arXiv Detail & Related papers (2023-08-02T06:31:41Z)
- Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of the key metrics commonly used to quantify models in terms of effectiveness, lightness, and computational cost.
The survey then covers three categories of state-of-the-art techniques for deep model design automation.
arXiv Detail & Related papers (2022-08-22T12:12:43Z)
- EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review of both cloud and edge AI.
We are the first to set up a collaborative learning mechanism for cloud and edge modeling.
We discuss the potential of, and practical experience with, several ongoing advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL achieves an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems [9.273593723275544]
IoT and machine learning applications create large amounts of data that require real-time processing.
We propose a distributed AI system to exploit both the edge and the cloud for training and inference.
arXiv Detail & Related papers (2021-09-14T05:03:54Z)
- Device-Cloud Collaborative Learning for Recommendation [50.01289274123047]
We propose a novel MetaPatch learning approach on the device side to efficiently achieve "thousands of people with thousands of models" given a centralized cloud model.
With billions of updated personalized device models, we propose a "model-over-models" distillation algorithm, namely MoMoDistill, to update the centralized cloud model.
arXiv Detail & Related papers (2021-04-14T05:06:59Z)
- Efficient Low-Latency Dynamic Licensing for Deep Neural Network Deployment on Edge Devices [0.0]
We propose an architecture that addresses deploying and processing deep neural networks on edge devices.
Adopting this architecture allows low-latency model updates on devices.
arXiv Detail & Related papers (2021-02-24T09:36:39Z)
- Cost-effective Machine Learning Inference Offload for Edge Computing [0.3149883354098941]
This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources.
The proposed mechanism allows edge devices to offload heavy and compute-intensive workloads to edge nodes instead of the remote cloud.
arXiv Detail & Related papers (2020-12-07T21:11:02Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
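
Several entries above, such as the multi-exit ME-FEEL framework and the multi-level device/edge/cloud architecture for URLLC, share a common execution pattern: answer locally when a small model is confident enough, and escalate to a more capable tier otherwise. The sketch below shows one generic way such a cascade can be wired up; the tier names, toy predictors, and confidence thresholds are hypothetical and are not taken from any of the papers listed.

```python
# Hypothetical device -> edge -> cloud inference cascade (illustration only).
# Tiers, thresholds, and toy predictors are placeholders, not from the papers above.
from dataclasses import dataclass
from typing import Callable, List, Tuple

Prediction = Tuple[str, float]               # (label, confidence in [0, 1])

@dataclass
class Tier:
    name: str
    predict: Callable[[bytes], Prediction]   # stand-in for a real model at this tier
    confidence_threshold: float              # escalate to the next tier below this

def cascade_infer(sample: bytes, tiers: List[Tier]) -> Tuple[str, str]:
    """Return (label, tier_name), escalating while the current tier is unsure."""
    label, conf = "unknown", 0.0
    for tier in tiers:
        label, conf = tier.predict(sample)
        if conf >= tier.confidence_threshold:
            return label, tier.name
    return label, tiers[-1].name             # the last tier's answer is final

# Toy predictors standing in for a tiny on-device model, a mid-size edge model,
# and a large cloud model, each returning a made-up confidence.
tiers = [
    Tier("device", lambda s: ("cat", 0.55), confidence_threshold=0.90),
    Tier("edge",   lambda s: ("cat", 0.80), confidence_threshold=0.85),
    Tier("cloud",  lambda s: ("cat", 0.97), confidence_threshold=0.0),
]

print(cascade_infer(b"raw-sensor-bytes", tiers))   # -> ('cat', 'cloud')
```
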