Online Learning for Orchestration of Inference in Multi-User
End-Edge-Cloud Networks
- URL: http://arxiv.org/abs/2202.10541v1
- Date: Mon, 21 Feb 2022 21:41:29 GMT
- Title: Online Learning for Orchestration of Inference in Multi-User
End-Edge-Cloud Networks
- Authors: Sina Shahhosseini, Dongjoo Seo, Anil Kanduri, Tianyi Hu, Sung-soo Lim,
Bryan Donyanavard, Amir M. Rahmani, Nikil Dutt
- Abstract summary: Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency tradeoffs.
We propose a reinforcement-learning-based computation offloading solution that learns an optimal offloading policy.
Our solution provides a 35% speedup in average response time compared to the state-of-the-art, with less than 0.9% accuracy reduction.
- Score: 3.6076391721440633
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep-learning-based intelligent services have become prevalent in
cyber-physical applications including smart cities and health-care. Deploying
deep-learning-based intelligence near the end-user enhances privacy protection,
responsiveness, and reliability. Resource-constrained end-devices must be
carefully managed in order to meet the latency and energy requirements of
computationally-intensive deep learning services. Collaborative end-edge-cloud
computing for deep learning provides a range of performance and efficiency
tradeoffs that can address application requirements through computation
offloading. The decision to offload computation is a communication-computation
co-optimization problem that varies with both system parameters (e.g., network
condition) and workload characteristics (e.g., inputs). In addition, deep
learning model optimization provides another source of tradeoff between latency
and model accuracy. An end-to-end decision-making solution that considers this
computation-communication co-optimization problem is required to
synergistically find the optimal offloading policy and model for deep learning
services. To this end, we propose a reinforcement-learning-based computation
offloading solution that learns an optimal offloading policy, considering deep
learning model selection techniques, to minimize response time while providing
sufficient accuracy. We demonstrate the effectiveness of our solution for edge
devices in an end-edge-cloud system and evaluate it with a real-world
implementation using multiple AWS and ARM core configurations. Our solution
provides a 35% speedup in average response time compared to the
state-of-the-art, with less than 0.9% accuracy reduction, demonstrating the
promise of our online learning framework
for orchestrating DL inference in end-edge-cloud systems.
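As a rough illustration of the kind of decision such an orchestrator makes, the sketch below uses tabular epsilon-greedy Q-learning over a hypothetical action space of (execution target, model variant). The state buckets, reward shaping, accuracy floor, and all names are illustrative assumptions for this sketch, not the authors' actual formulation.

```python
import random
from collections import defaultdict

# Hypothetical action space: where to run inference and which model variant to use.
TARGETS = ["end", "edge", "cloud"]   # execution locations (assumed)
MODELS = ["full", "compressed"]      # model-optimization variants (assumed)
ACTIONS = [(t, m) for t in TARGETS for m in MODELS]


class OffloadingAgent:
    """Tabular epsilon-greedy Q-learning over coarse (state, action) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # Q-values keyed by (state, action)
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability

    def act(self, state):
        # state: a hashable bucket of observed parameters,
        # e.g. (network_condition_level, input_size_level)
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def reward(response_time_s, accuracy, accuracy_floor=0.90, penalty_weight=10.0):
    """Favor low response time; penalize accuracy that falls below a floor.

    The floor and penalty weight are illustrative, not values from the paper.
    """
    r = -response_time_s
    if accuracy < accuracy_floor:
        r -= penalty_weight * (accuracy_floor - accuracy)
    return r
```

A deployment loop would observe the current network condition and input, call act(), measure the end-to-end response time and accuracy of the chosen (target, model) pair, and feed the resulting reward back through update().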
Related papers
- Slicing for AI: An Online Learning Framework for Network Slicing Supporting AI Services [5.80147190706865]
6G networks will embrace a new realm of AI-driven services that requires innovative network slicing strategies.
This paper proposes an online learning framework to optimize the allocation of computational and communication resources to AI services.
arXiv Detail & Related papers (2024-10-20T14:38:54Z) - DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Time-sensitive Learning for Heterogeneous Federated Edge Intelligence [52.83633954857744]
We investigate real-time machine learning in a federated edge intelligence (FEI) system.
FEI systems exhibit heterogeneous communication and computational resource distribution.
We propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model.
arXiv Detail & Related papers (2023-01-26T08:13:22Z) - Hybrid Learning for Orchestrating Deep Learning Inference in Multi-user
Edge-cloud Networks [3.7630209350186807]
Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency tradeoffs.
The deep learning inference orchestration strategy employs reinforcement learning to find the optimal orchestration policy.
We demonstrate the efficacy of our hybrid learning (HL) strategy through experimental comparison with state-of-the-art RL-based inference orchestration.
arXiv Detail & Related papers (2022-02-21T21:50:50Z) - Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices.
We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time.
Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z) - Federated Double Deep Q-learning for Joint Delay and Energy Minimization
in IoT networks [12.599009485247283]
We propose a federated deep reinforcement learning framework to solve a multi-objective optimization problem.
To enhance the learning speed of IoT devices (agents), we incorporate federated learning (FDL) at the end of each episode.
Our numerical results demonstrate the efficacy of our proposed federated DDQN framework in terms of learning speed.
arXiv Detail & Related papers (2021-04-02T18:41:59Z) - Multi-agent Reinforcement Learning for Resource Allocation in IoT
networks with Edge Computing [16.129649374251088]
Offloading computation is challenging for end users due to their massive demands on spectrum and resources.
In this paper, we investigate offloading mechanism with resource allocation in IoT edge computing networks by formulating it as a game.
arXiv Detail & Related papers (2020-04-05T20:59:20Z) - Differentially Private Federated Learning for Resource-Constrained
Internet of Things [24.58409432248375]
Federated learning can analyze large amounts of data from a distributed set of smart devices without requiring them to upload their data to a central location.
This paper proposes a novel federated learning framework called DP-PASGD for training a machine learning model efficiently from the data stored across resource-constrained smart devices in IoT.
arXiv Detail & Related papers (2020-03-28T04:32:54Z) - Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of
Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z) - Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G
Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)