Cost-effective Machine Learning Inference Offload for Edge Computing
- URL: http://arxiv.org/abs/2012.04063v1
- Date: Mon, 7 Dec 2020 21:11:02 GMT
- Title: Cost-effective Machine Learning Inference Offload for Edge Computing
- Authors: Christian Makaya, Amalendu Iyer, Jonathan Salfity, Madhu Athreya, M. Anthony Lewis
- Abstract summary: This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources.
The proposed mechanism allows edge devices to offload heavy, compute-intensive workloads to edge nodes instead of the remote cloud.
- Score: 0.3149883354098941
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Computing at the edge is increasingly important since a massive amount of
data is generated. This poses challenges in transporting all that data to
remote data centers and the cloud, where it can be processed and analyzed. On the
other hand, harnessing edge data is essential for offering data-driven and
machine learning-based applications, provided that challenges such as device
capabilities, connectivity, and heterogeneity can be mitigated. Machine
learning applications are very compute-intensive and require processing
large amounts of data. However, edge devices are often resource-constrained in
terms of compute resources, power, storage, and network connectivity. This
limits their ability to run state-of-the-art deep neural network (DNN) models,
which are becoming larger and more complex, efficiently and accurately.
This paper proposes a novel offloading mechanism that leverages installed-base
on-premises (edge) computational resources. The proposed mechanism allows
edge devices to offload heavy, compute-intensive workloads to edge nodes
instead of the remote cloud. Our offloading mechanism has been prototyped and
tested with state-of-the-art person and object detection DNN models for mobile
robots and video surveillance applications. The results show a significant
gain over cloud-based offloading strategies in terms of accuracy and
latency.
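The abstract does not say how a device decides where to send a workload, so the following is only a minimal sketch of one plausible rule: pick the target (local device, on-premises edge node, or remote cloud) with the lowest estimated end-to-end latency. The `Target` fields, the function names, and all numbers are hypothetical illustrations, not values or APIs from the paper.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """A place where inference could run (all fields are assumed, illustrative values)."""
    name: str
    rtt_ms: float        # network round-trip time to the target
    uplink_mbps: float   # uplink bandwidth from the device to the target
    infer_ms: float      # time the target needs to run the DNN on one input


def estimated_latency_ms(target: Target, payload_mb: float) -> float:
    """Estimate latency as round trip + payload transfer + inference time."""
    # payload in megabytes -> megabits, divided by Mbps gives seconds
    transfer_ms = payload_mb * 8.0 / target.uplink_mbps * 1000.0
    return target.rtt_ms + transfer_ms + target.infer_ms


def choose_target(targets: list[Target], payload_mb: float) -> Target:
    """Return the target with the lowest estimated latency."""
    return min(targets, key=lambda t: estimated_latency_ms(t, payload_mb))


# Hypothetical setup: a slow local device, a nearby edge node, a distant cloud.
TARGETS = [
    Target("local-device", rtt_ms=0.0, uplink_mbps=float("inf"), infer_ms=900.0),
    Target("edge-node", rtt_ms=2.0, uplink_mbps=100.0, infer_ms=40.0),
    Target("remote-cloud", rtt_ms=60.0, uplink_mbps=20.0, infer_ms=15.0),
]

if __name__ == "__main__":
    best = choose_target(TARGETS, payload_mb=0.5)
    print(best.name, round(estimated_latency_ms(best, 0.5), 1))
```

With these made-up numbers the edge node is chosen: the cloud runs the model faster, but its slower uplink and longer round trip outweigh that gain. A real system would measure these quantities rather than hard-code them, and would likely weigh accuracy and cost as well as latency.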
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Driving Intelligent IoT Monitoring and Control through Cloud Computing and Machine Learning [3.134387323162717]
This article explores how to drive intelligent IoT monitoring and control through cloud computing and machine learning.
The paper also introduces the development of IoT monitoring and control technology, the application of edge computing in IoT monitoring and control, and the role of machine learning in data analysis and fault detection.
arXiv Detail & Related papers (2024-03-26T20:59:48Z)
- Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems [12.427821850039448]
We propose a novel split computing approach based on slimmable ensemble encoders.
The key advantage of our design is the ability to adapt computational load and transmitted data size in real-time with minimal overhead and time.
Our model outperforms existing solutions in terms of compression efficacy and execution time, especially in the context of weak mobile devices.
arXiv Detail & Related papers (2023-06-22T06:33:12Z)
- The MIT Supercloud Workload Classification Challenge [10.458111248130944]
In this paper, we present a workload classification challenge based on the MIT Supercloud dataset.
The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads.
arXiv Detail & Related papers (2022-04-12T14:28:04Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems [9.273593723275544]
IoT and machine learning applications create large amounts of data that require real-time processing.
We propose a distributed AI system to exploit both the edge and the cloud for training and inference.
arXiv Detail & Related papers (2021-09-14T05:03:54Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z)
- Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics with a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices.
arXiv Detail & Related papers (2021-02-12T09:33:00Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.