Related papers: ESAI: Efficient Split Artificial Intelligence via Early Exiting Using Neural Architecture Search

ESAI: Efficient Split Artificial Intelligence via Early Exiting Using Neural Architecture Search

URL: http://arxiv.org/abs/2106.12549v1
Date: Mon, 21 Jun 2021 04:47:53 GMT
Title: ESAI: Efficient Split Artificial Intelligence via Early Exiting Using Neural Architecture Search
Authors: Behnam Zeinali, Di Zhuang, J. Morris Chang
Abstract summary: Deep neural networks have been outperforming conventional machine learning algorithms in many computer vision-related tasks. The majority of devices are harnessing the cloud computing methodology in which outstanding deep learning models are responsible for analyzing the data on the server. In this paper, a new framework for deploying on IoT devices has been proposed which can take advantage of both the cloud and the on-device models.
Score: 6.316693022958222
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, deep neural networks have been outperforming conventional machine learning algorithms in many computer vision-related tasks. However, it is not computationally acceptable to implement these models on mobile and IoT devices and the majority of devices are harnessing the cloud computing methodology in which outstanding deep learning models are responsible for analyzing the data on the server. This can bring the communication cost for the devices and make the whole system useless in those times where the communication is not available. In this paper, a new framework for deploying on IoT devices has been proposed which can take advantage of both the cloud and the on-device models by extracting the meta-information from each sample's classification result and evaluating the classification's performance for the necessity of sending the sample to the server. Experimental results show that only 40 percent of the test data should be sent to the server using this technique and the overall accuracy of the framework is 92 percent which improves the accuracy of both client and server models.

Related papers

Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Knowledge Transfer For On-Device Speech Emotion Recognition with Neural Structured Learning [19.220263739291685]
Speech emotion recognition (SER) has been a popular research topic in human-computer interaction (HCI) We propose a neural structured learning (NSL) framework through building synthesized graphs. Our experiments demonstrate that training a lightweight SER model on the target dataset with speech samples and graphs can not only produce small SER models, but also enhance the model performance.
arXiv Detail & Related papers (2022-10-26T18:38:42Z)
Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification. Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
Optimising Communication Overhead in Federated Learning Using NSGA-II [6.635754265968436]
This work aims at optimising communication overhead in federated learning by (I) modelling it as a multi-objective problem and (II) applying a multi-objective optimization algorithm (NSGA-II) to solve it. Experiments have shown that our proposal could reduce communication by 99% and maintain an accuracy equal to the one obtained by the FedAvg Algorithm that uses 100% of communications.
arXiv Detail & Related papers (2022-04-01T18:06:20Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
Cost-effective Machine Learning Inference Offload for Edge Computing [0.3149883354098941]
This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources. The proposed mechanism allows the edge devices to offload heavy and compute-intensive workloads to edge nodes instead of using remote cloud.
arXiv Detail & Related papers (2020-12-07T21:11:02Z)
The Case for Retraining of ML Models for IoT Device Identification at the Edge [0.026215338446228163]
We show how to identify IoT devices based on their network behavior using resources available at the edge of the network. It is possible to achieve device identification and categorization with over 80% and 90% accuracy respectively at the edge.
arXiv Detail & Related papers (2020-11-17T13:01:04Z)
MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS) We employ a one-shot architecture search approach in order to obtain a reduced search cost. We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
Fast-Convergent Federated Learning [82.32029953209542]
Federated learning is a promising solution for distributing machine learning tasks through modern networks of mobile devices. We propose a fast-convergent federated learning algorithm, called FOLB, which performs intelligent sampling of devices in each round of model training.
arXiv Detail & Related papers (2020-07-26T14:37:51Z)
Prune2Edge: A Multi-Phase Pruning Pipelines to Deep Ensemble Learning in IIoT [0.0]
We propose a novel edge-based multi-phase pruning pipelines to ensemble learning on IIoT devices. In the first phase, we generate a diverse ensemble of pruned models, then we apply integer quantisation, next we prune the generated ensemble using a clustering-based technique. Our proposed approach was able to outperform the predictability levels of a baseline model.
arXiv Detail & Related papers (2020-04-09T17:44:34Z)
Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC. To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.