Complexity-aware Adaptive Training and Inference for Edge-Cloud
Distributed AI Systems
- URL: http://arxiv.org/abs/2109.06440v1
- Date: Tue, 14 Sep 2021 05:03:54 GMT
- Title: Complexity-aware Adaptive Training and Inference for Edge-Cloud
Distributed AI Systems
- Authors: Yinghan Long, Indranil Chakraborty, Gopalakrishnan Srinivasan, Kaushik
Roy
- Abstract summary: IoT and machine learning applications create large amounts of data that require real-time processing.
We propose a distributed AI system to exploit both the edge and the cloud for training and inference.
- Score: 9.273593723275544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ubiquitous use of IoT and machine learning applications is creating large
amounts of data that require accurate and real-time processing. Although
edge-based smart data processing can be enabled by deploying pretrained models,
the energy and memory constraints of edge devices necessitate distributed deep
learning between the edge and the cloud for complex data. In this paper, we
propose a distributed AI system to exploit both the edge and the cloud for
training and inference. We propose a new architecture, MEANet, with a main
block, an extension block, and an adaptive block for the edge. The inference
process can terminate at the main block, the extension block, or the cloud.
MEANet is trained to categorize inputs into easy/hard/complex
classes. The main block identifies instances of easy/hard classes and
classifies easy classes with high confidence. Only data with a high probability
of belonging to a hard class is sent to the extension block for prediction. If
the neural network at the edge shows low confidence in its prediction, the
instance is considered complex and is sent to the cloud for further processing.
The training technique enables the majority of inferences to be completed on
edge devices, with only a small set of complex inputs, as determined by the
edge, sent to the cloud. The performance of the proposed system
is evaluated via extensive experiments using modified models of ResNets and
MobileNetV2 on the CIFAR-100 and ImageNet datasets. The results show that the
proposed distributed model improves accuracy and reduces energy consumption,
indicating its capacity to adapt.
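As a rough sketch of the routing logic described in the abstract, the snippet below shows how a single input could exit at the main block, the extension block, or the cloud based on confidence thresholds. The class partition, threshold values, and block interfaces are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical sketch of MEANet-style complexity-aware routing. The class
# partition, thresholds, and block interfaces below are illustrative
# assumptions; the paper's actual training procedure and confidence
# measures may differ.
import numpy as np

EASY_CLASSES = set(range(0, 50))   # assumed partition of CIFAR-100 labels
T_EASY = 0.90                      # assumed exit threshold at the main block
T_HARD = 0.80                      # assumed exit threshold at the extension block

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def route(main_block, extension_block, cloud_model, x):
    """Return (prediction, exit_location) for a single input x."""
    p_main = softmax(main_block(x))
    top = int(p_main.argmax())
    # Exit 1: the main block predicts an "easy" class with high confidence.
    if top in EASY_CLASSES and p_main[top] >= T_EASY:
        return top, "edge/main"
    # Exit 2: the input likely belongs to a "hard" class -> extension block.
    p_ext = softmax(extension_block(x))
    top_ext = int(p_ext.argmax())
    if p_ext[top_ext] >= T_HARD:
        return top_ext, "edge/extension"
    # Exit 3: the edge network is not confident -> defer to the cloud.
    return int(np.argmax(cloud_model(x))), "cloud"

# Toy usage with random score functions standing in for the real networks.
rng = np.random.default_rng(0)
def fake_net(_x):
    return rng.normal(size=100)
print(route(fake_net, fake_net, fake_net, x=None))
```

Raising the thresholds shifts more traffic to the cloud in exchange for accuracy, while lowering them keeps more inferences on the edge; this is the accuracy/energy trade-off the abstract refers to.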
Related papers
- Leveraging Federated Learning and Edge Computing for Recommendation
Systems within Cloud Computing Networks [3.36271475827981]
A key technology for edge intelligence is the privacy-preserving machine learning paradigm known as Federated Learning (FL), which enables data owners to train models without having to transfer raw data to third-party servers.
To reduce the impact of node failures and device exits, a Hierarchical Federated Learning (HFL) framework is proposed, in which a designated cluster leader supports the data owner through intermediate model aggregation.
In order to mitigate the impact of soft clicks on the quality of user experience (QoE), the authors model the user QoE as a comprehensive system cost.
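A minimal sketch of the intermediate-aggregation idea mentioned above, assuming plain weighted averaging (FedAvg-style) at both the cluster-leader and server levels; the weighting by sample counts and the function names are illustrative assumptions, not details from the cited paper.

```python
# Hypothetical two-level (hierarchical) federated averaging sketch.
# Cluster leaders aggregate their members' models first; the server then
# aggregates the leaders. Structure and weights are assumed for illustration.
import numpy as np

def fedavg(models, weights):
    """Weighted average of model parameter vectors."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * m for w, m in zip(weights, models))

def hierarchical_round(clusters):
    """clusters: list of (member_models, member_sample_counts) per cluster."""
    leader_models, leader_sizes = [], []
    for member_models, sizes in clusters:
        leader_models.append(fedavg(member_models, sizes))  # intermediate aggregation
        leader_sizes.append(sum(sizes))
    return fedavg(leader_models, leader_sizes)               # global aggregation

# Toy usage: 2 clusters, each with 2 devices holding 3-parameter "models".
c1 = ([np.array([1., 2., 3.]), np.array([3., 2., 1.])], [10, 30])
c2 = ([np.array([0., 0., 0.]), np.array([4., 4., 4.])], [20, 20])
print(hierarchical_round([c1, c2]))
```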
arXiv Detail & Related papers (2024-03-05T17:58:26Z) - EdgeConvEns: Convolutional Ensemble Learning for Edge Intelligence [0.0]
Deep edge intelligence aims to deploy deep learning models that demand computationally expensive training in the edge network with limited computational power.
This study proposes a convolutional ensemble learning approach, coined EdgeConvEns, that facilitates training heterogeneous weak models on the edge and learning to ensemble them when edge data are heterogeneously distributed.
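As a generic illustration of "learning to ensemble" weak edge models, the sketch below fits linear fusion weights over the weak models' class scores on held-out data; the ridge-style fit, shapes, and names are assumptions for illustration and are not the EdgeConvEns method itself.

```python
# Hypothetical sketch of learning to ensemble heterogeneous weak models.
# Each weak edge model emits class scores; a small fusion layer is fit on
# held-out data to weight them. The ridge-regularized least-squares fit is
# an illustrative assumption, not the cited paper's approach.
import numpy as np

def fit_fusion_weights(weak_scores, labels, n_classes, l2=1e-2):
    """weak_scores: (n_samples, n_models, n_classes); returns per-model weights."""
    n, m, _ = weak_scores.shape
    onehot = np.eye(n_classes)[labels]                     # (n, c)
    X = weak_scores.transpose(0, 2, 1).reshape(n * n_classes, m)
    y = onehot.reshape(n * n_classes)
    return np.linalg.solve(X.T @ X + l2 * np.eye(m), X.T @ y)

def ensemble_predict(weak_scores, w):
    """Weighted sum of the weak models' scores, then argmax over classes."""
    return (weak_scores * w[None, :, None]).sum(axis=1).argmax(axis=1)

# Toy usage: 200 samples, 3 weak models, 10 classes.
rng = np.random.default_rng(0)
scores = rng.random((200, 3, 10))
labels = rng.integers(0, 10, size=200)
w = fit_fusion_weights(scores, labels, n_classes=10)
print(w, ensemble_predict(scores, w)[:5])
```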
arXiv Detail & Related papers (2023-07-25T20:07:32Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete actions (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a framework can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
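For a rough picture of the decision variables involved, the sketch below enumerates a discrete action space of (exit point, compressing bits) pairs and a latency- and accuracy-aware reward, with a simple Boltzmann ("soft") policy over toy Q-values standing in for the actual SAC-d agent; all constants are illustrative assumptions.

```python
# Hypothetical sketch of a device-edge co-inference action space and a
# latency/accuracy-aware reward. A softmax ("soft") policy over toy Q-values
# stands in for the cited paper's SAC-d agent; every constant here is an
# illustrative assumption.
import itertools
import numpy as np

EXIT_POINTS = [1, 2, 3]      # assumed candidate early-exit layers
COMPRESS_BITS = [2, 4, 8]    # assumed feature-compression bit widths
ACTIONS = list(itertools.product(EXIT_POINTS, COMPRESS_BITS))

def reward(accuracy, latency_ms, lam=0.01):
    """Trade accuracy against latency; lam is an assumed weighting."""
    return accuracy - lam * latency_ms

def soft_policy(q_values, temperature=0.5):
    """Boltzmann distribution over discrete actions (stand-in for SAC-d)."""
    z = np.exp((q_values - q_values.max()) / temperature)
    return z / z.sum()

# Toy usage: sample an action from random Q-values.
rng = np.random.default_rng(0)
probs = soft_policy(rng.normal(size=len(ACTIONS)))
exit_point, bits = ACTIONS[rng.choice(len(ACTIONS), p=probs)]
print(exit_point, bits)
```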
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Existing approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper, the authors present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - EffCNet: An Efficient CondenseNet for Image Classification on NXP
BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z) - Computational Intelligence and Deep Learning for Next-Generation
Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z) - Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
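As a rough illustration of what edge-cloud DNN splitting means in practice, the sketch below cuts a small PyTorch model at an assumed layer index, runs the prefix on the edge, and ships the intermediate activation to the cloud-side suffix; the model, split point, and byte-level transfer are illustrative assumptions and do not reproduce Auto-Split's split-point search or quantization.

```python
# Hypothetical sketch of edge-cloud DNN splitting: the first k layers run
# on the edge device, the remaining layers run in the cloud, and the
# intermediate activation is what crosses the network. Model and split
# index are assumed for illustration.
import numpy as np
import torch
import torch.nn as nn

model = nn.Sequential(                      # stand-in backbone, not the paper's model
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 100),
)

SPLIT = 4                                   # assumed split point (layer index)
edge_part, cloud_part = model[:SPLIT], model[SPLIT:]

x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    activation = edge_part(x)                          # computed on the edge device
    payload = activation.numpy().tobytes()             # bytes shipped to the cloud
    restored = torch.from_numpy(
        np.frombuffer(payload, dtype=np.float32).copy().reshape(tuple(activation.shape))
    )
    logits = cloud_part(restored)                      # computed in the cloud
print(logits.shape)                                    # torch.Size([1, 100])
```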
arXiv Detail & Related papers (2021-08-30T08:03:29Z) - Cost-effective Machine Learning Inference Offload for Edge Computing [0.3149883354098941]
This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources.
The proposed mechanism allows edge devices to offload heavy and compute-intensive workloads to edge nodes instead of using the remote cloud.
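To make the cost trade-off behind such offloading concrete, here is a minimal sketch that picks among the local device, an on-premises edge node, and the remote cloud by minimizing an estimated latency-plus-price score; the cost model and every number in it are assumptions made for illustration, not figures from the cited paper.

```python
# Hypothetical cost-based offload decision: run a job locally, on a nearby
# edge node, or in the remote cloud, whichever minimizes an estimated cost.
# All speeds, bandwidths, and prices below are assumed for illustration.
def estimated_cost(exec_time_s, transfer_bytes, bandwidth_bps, price_per_s, w_money=10.0):
    """Combine latency (seconds) and a weighted monetary cost into one score."""
    transfer_time_s = transfer_bytes * 8 / bandwidth_bps
    return (exec_time_s + transfer_time_s) + w_money * price_per_s * exec_time_s

def choose_target(job_bytes, flops):
    # name: (assumed compute speed in FLOP/s, link bandwidth in bit/s, price per second)
    candidates = {
        "device":    (1e9,  None, 0.0),    # runs locally, nothing is transferred
        "edge_node": (1e11, 1e8,  0.0),    # on-premises node, no usage fee
        "cloud":     (1e12, 1e7,  1e-3),   # fastest, but metered and farther away
    }
    costs = {}
    for name, (speed, bw, price) in candidates.items():
        transfer = 0 if bw is None else job_bytes
        costs[name] = estimated_cost(flops / speed, transfer, bw or 1.0, price)
    return min(costs, key=costs.get), costs

# Toy usage: a 5 MB job needing 2e11 FLOPs lands on the edge node.
print(choose_target(job_bytes=5_000_000, flops=2e11))
```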
arXiv Detail & Related papers (2020-12-07T21:11:02Z) - A Privacy-Preserving Distributed Architecture for
Deep-Learning-as-a-Service [68.84245063902908]
This paper introduces a novel distributed architecture for deep-learning-as-a-service.
It is able to preserve user-sensitive data while providing cloud-based machine and deep learning services.
arXiv Detail & Related papers (2020-03-30T15:12:03Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.