Accuracy-Guaranteed Collaborative DNN Inference in Industrial IoT via
Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2301.00130v1
- Date: Sat, 31 Dec 2022 05:53:17 GMT
- Title: Accuracy-Guaranteed Collaborative DNN Inference in Industrial IoT via
Deep Reinforcement Learning
- Authors: Wen Wu, Peng Yang, Weiting Zhang, Conghao Zhou, Xuemin (Sherman) Shen
- Abstract summary: Collaboration among industrial Internet of Things (IoT) devices and edge networks is essential to support computation-intensive deep neural network (DNN) inference services.
In this paper, we investigate the collaborative inference problem in industrial IoT networks.
- Score: 10.223526707269537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaboration among industrial Internet of Things (IoT) devices and edge
networks is essential to support computation-intensive deep neural network
(DNN) inference services, which require low delay and high accuracy. Sampling
rate adaption, which dynamically configures the sampling rates of industrial IoT
devices according to network conditions, is key to minimizing the service
delay. In this paper, we investigate the collaborative DNN inference problem in
industrial IoT networks. To capture the channel variation and task arrival
randomness, we formulate the problem as a constrained Markov decision process
(CMDP). Specifically, sampling rate adaption, inference task offloading and
edge computing resource allocation are jointly considered to minimize the
average service delay while guaranteeing the long-term accuracy requirements of
different inference services. Since CMDP cannot be directly solved by general
reinforcement learning (RL) algorithms due to the intractable long-term
constraints, we first transform the CMDP into an MDP by leveraging the Lyapunov
optimization technique. Then, a deep RL-based algorithm is proposed to solve
the MDP. To expedite the training process, an optimization subroutine is
embedded in the proposed algorithm to directly obtain the optimal edge
computing resource allocation. Extensive simulation results are provided to
demonstrate that the proposed RL-based algorithm can significantly reduce the
average service delay while preserving long-term inference accuracy with a high
probability.
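The CMDP-to-MDP step described in the abstract can be illustrated with a generic Lyapunov virtual-queue sketch. This is not the authors' formulation, which the abstract does not give. It assumes a standard drift-plus-penalty form in which a virtual queue tracks the accumulated accuracy deficit and the per-slot reward trades delay against that backlog. The names `AccuracyQueue`, `per_slot_reward`, `target`, and `v` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccuracyQueue:
    """Virtual queue tracking the accumulated accuracy deficit (assumed form)."""
    target: float          # long-term accuracy requirement, e.g. 0.9
    backlog: float = 0.0   # current queue length, always >= 0

    def update(self, accuracy: float) -> float:
        # The backlog grows when the achieved accuracy falls below the target
        # and drains when it exceeds the target; it never goes negative.
        self.backlog = max(self.backlog + self.target - accuracy, 0.0)
        return self.backlog


def per_slot_reward(delay: float, accuracy: float,
                    queue: AccuracyQueue, v: float = 1.0) -> float:
    """Drift-plus-penalty reward for one time slot (assumed form).

    v weights delay against constraint violation: a larger v favours
    low delay, a smaller v favours meeting the accuracy target.
    """
    cost = v * delay + queue.backlog * (queue.target - accuracy)
    return -cost


# Example: one slot with 0.2 s delay and 0.8 accuracy against a 0.9 target.
queue = AccuracyQueue(target=0.9)
reward = per_slot_reward(delay=0.2, accuracy=0.8, queue=queue, v=10.0)
queue.update(0.8)  # backlog is now roughly 0.1
```

With a reward of this shape, an unconstrained RL agent is pushed to restore accuracy whenever the backlog grows, which is the general idea behind folding a long-term constraint into a per-step objective.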
Related papers
- Computation Pre-Offloading for MEC-Enabled Vehicular Networks via Trajectory Prediction [38.493882483362135]
We present a Trajectory Prediction-based Pre-offloading Decision (TPPD) algorithm for analyzing the historical trajectories of vehicles.
We devise a dynamic resource allocation algorithm using a Double Deep Q-Network (DDQN) that enables the edge server to minimize task processing delay.
arXiv Detail & Related papers (2024-09-26T09:46:43Z)
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Training Latency Minimization for Model-Splitting Allowed Federated Edge Learning [16.8717239856441]
We propose a model-splitting allowed FL (SFL) framework to alleviate the shortage of computing power faced by clients in training deep neural networks (DNNs) using federated learning (FL).
Under the synchronized global update setting, the latency to complete a round of global training is determined by the maximum latency for the clients to complete a local training session.
To solve this mixed-integer nonlinear programming problem, we first propose a regression method to fit the quantitative relationship between the cut layer and other parameters of an AI model, and thus transform the TLMP into a continuous problem.
arXiv Detail & Related papers (2023-07-21T12:26:42Z)
- Efficient Parallel Split Learning over Resource-constrained Wireless Edge Networks [44.37047471448793]
In this paper, we advocate the integration of the edge computing paradigm and parallel split learning (PSL).
We propose an innovative PSL framework, namely, efficient parallel split learning (EPSL) to accelerate model training.
We show that the proposed EPSL framework significantly decreases the training latency needed to achieve a target accuracy.
arXiv Detail & Related papers (2023-03-26T16:09:48Z)
- Scheduling Inference Workloads on Distributed Edge Clusters with Reinforcement Learning [11.007816552466952]
This paper focuses on the problem of scheduling inference queries on Deep Neural Networks in edge networks at short timescales.
By means of simulations, we analyze several policies in the realistic network settings and workloads of a large ISP.
We design ASET, a Reinforcement Learning based scheduling algorithm able to adapt its decisions according to the system conditions.
arXiv Detail & Related papers (2023-01-31T13:23:34Z)
- Low Complexity Approaches for End-to-End Latency Prediction [0.0]
We focus on end-to-end latency prediction, for which we illustrate our approaches and results on a public dataset from the recent international challenge on GNN.
We propose several low complexity, locally implementable approaches, achieving significantly lower wall time both for training and inference, with marginally worse prediction accuracy compared to state-of-the-art global GNN solutions.
arXiv Detail & Related papers (2023-01-31T10:31:41Z)
- Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing [85.74517957717363]
HALP accelerates inference by designing a seamless collaboration among edge devices (EDs) in Edge Computing.
Experiments show that the distributed inference HALP achieves 1.7x inference acceleration for VGG-16.
It is shown that the model selection with distributed inference HALP can significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z)
- Predictive GAN-powered Multi-Objective Optimization for Hybrid Federated Split Learning [56.125720497163684]
We propose a hybrid federated split learning framework in wireless networks.
We design a parallel computing scheme for model splitting without label sharing, and theoretically analyze the influence of the delayed gradient caused by the scheme on the convergence speed.
arXiv Detail & Related papers (2022-09-02T10:29:56Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point, exit point, and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can well adapt to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.