CoEdge: Cooperative DNN Inference with Adaptive Workload Partitioning
over Heterogeneous Edge Devices
- URL: http://arxiv.org/abs/2012.03257v1
- Date: Sun, 6 Dec 2020 13:15:52 GMT
- Title: CoEdge: Cooperative DNN Inference with Adaptive Workload Partitioning
over Heterogeneous Edge Devices
- Authors: Liekang Zeng, Xu Chen, Zhi Zhou, Lei Yang, Junshan Zhang
- Abstract summary: CoEdge is a distributed Deep Neural Network (DNN) computing system that orchestrates cooperative inference over heterogeneous edge devices.
CoEdge saves energy with close inference latency, achieving up to 25.5%~66.9% energy reduction for four widely-adopted CNN models.
- Score: 39.09319776243573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in artificial intelligence have driven increasing intelligent
applications at the network edge, such as smart home, smart factory, and smart
city. To deploy computationally intensive Deep Neural Networks (DNNs) on
resource-constrained edge devices, traditional approaches have relied on either
offloading workload to the remote cloud or optimizing computation at the end
device locally. However, the cloud-assisted approaches suffer from the
unreliable and delay-significant wide-area network, and the local computing
approaches are limited by the constrained computing capability. Towards
high-performance edge intelligence, the cooperative execution mechanism offers
a new paradigm, which has attracted growing research interest recently. In this
paper, we propose CoEdge, a distributed DNN computing system that orchestrates
cooperative DNN inference over heterogeneous edge devices. CoEdge utilizes
available computation and communication resources at the edge and dynamically
partitions the DNN inference workload adaptive to devices' computing
capabilities and network conditions. Experimental evaluations based on a
realistic prototype show that CoEdge outperforms status-quo approaches in
saving energy with close inference latency, achieving up to 25.5%~66.9% energy
reduction for four widely-adopted CNN models.
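The abstract says the workload is partitioned according to devices' computing capabilities. As a rough illustration only, the sketch below splits the rows of an input feature map across devices in proportion to a relative speed score. This is a simplified proportional heuristic, not CoEdge's actual partitioning algorithm, which also accounts for network conditions and energy. The function name `partition_rows` and the `speeds` values are hypothetical.

```python
def partition_rows(total_rows, speeds):
    """Split total_rows across devices in proportion to their speeds.

    Hypothetical helper for illustration; `speeds` are assumed relative
    compute-capability scores (larger means faster), not measured values.
    """
    if total_rows < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need total_rows >= 0 and positive speeds")

    total_speed = sum(speeds)
    # Ideal (fractional) share of rows for each device.
    ideal = [total_rows * s / total_speed for s in speeds]
    # Round down first, then hand out the rows lost to rounding.
    rows = [int(x) for x in ideal]
    leftover = total_rows - sum(rows)
    # Devices with the largest fractional remainder get the extra rows.
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - rows[i],
                   reverse=True)
    for i in order[:leftover]:
        rows[i] += 1
    return rows


if __name__ == "__main__":
    # Three devices with assumed relative speeds 1 : 2 : 4.
    print(partition_rows(224, [1.0, 2.0, 4.0]))
```

A real system would presumably re-run such a split whenever device load or link bandwidth changes, which is the "adaptive" part the abstract refers to.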
Related papers
- SpikeBottleNet: Spike-Driven Feature Compression Architecture for Edge-Cloud Co-Inference [0.86325068644655]
We propose SpikeBottleNet, a novel architecture for edge-cloud co-inference systems.
SpikeBottleNet integrates a spiking neuron model to significantly reduce energy consumption on edge devices.
arXiv Detail & Related papers (2024-10-11T09:59:21Z)
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Decentralized Low-Latency Collaborative Inference via Ensembles on the Edge [28.61344039233783]
We propose to facilitate the application of deep neural networks (DNNs) on the edge by allowing multiple users to collaborate during inference to improve their accuracy.
Our mechanism, coined edge ensembles, is based on having diverse predictors at each device, which form an ensemble of models during inference.
We analyze the latency induced by edge ensembles, showing that its performance improvement comes at the cost of a minor additional delay under common assumptions on the communication network.
arXiv Detail & Related papers (2022-06-07T10:24:20Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point, exit point, and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain up to 32.7% in the industrial IoT networks with the severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Latency-Memory Optimized Splitting of Convolution Neural Networks for Resource Constrained Edge Devices [1.6873748786804317]
We argue that running CNNs between an edge device and the cloud is synonymous with solving a resource-constrained optimization problem.
Experiments done on real-world edge devices show that LMOS ensures feasible execution of different CNN models at the edge.
arXiv Detail & Related papers (2021-07-19T19:39:56Z)
- Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics with a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices.
arXiv Detail & Related papers (2021-02-12T09:33:00Z)
- Joint Multi-User DNN Partitioning and Computational Resource Allocation for Collaborative Edge Intelligence [21.55340197267767]
Mobile Edge Computing (MEC) has emerged as a promising supporting architecture providing a variety of resources to the network edge.
With the assistance of edge servers, user equipments (UEs) are able to run deep neural network (DNN) based AI applications.
We propose an algorithm called Iterative Alternating Optimization (IAO) that can achieve the optimal solution in time.
arXiv Detail & Related papers (2020-07-15T09:40:13Z)