Scheduling Inference Workloads on Distributed Edge Clusters with
Reinforcement Learning
- URL: http://arxiv.org/abs/2301.13618v1
- Date: Tue, 31 Jan 2023 13:23:34 GMT
- Title: Scheduling Inference Workloads on Distributed Edge Clusters with
Reinforcement Learning
- Authors: Gabriele Castellano, Juan-José Nieto, Jordi Luque, Ferrán Diego,
Carlos Segura, Diego Perino, Flavio Esposito, Fulvio Risso, Aravindh Raman
- Abstract summary: This paper focuses on the problem of scheduling inference queries on Deep Neural Networks in edge networks at short timescales.
By means of simulations, we analyze several policies in the realistic network settings and workloads of a large ISP.
We design ASET, a Reinforcement Learning based scheduling algorithm able to adapt its decisions according to the system conditions.
- Score: 11.007816552466952
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Many real-time applications (e.g., Augmented/Virtual Reality, cognitive
assistance) rely on Deep Neural Networks (DNNs) to process inference tasks.
Edge computing is considered a key infrastructure to deploy such applications,
as moving computation close to the data sources enables us to meet stringent
latency and throughput requirements. However, the constrained nature of edge
networks poses several additional challenges to the management of inference
workloads: edge clusters cannot provide unlimited processing power to DNN
models, and often a trade-off between network and processing time should be
considered when it comes to end-to-end delay requirements. In this paper, we
focus on the problem of scheduling inference queries on DNN models in edge
networks at short timescales (i.e., a few milliseconds). By means of simulations,
we analyze several policies in the realistic network settings and workloads of
a large ISP, highlighting the need for a dynamic scheduling policy that can
adapt to network conditions and workloads. We therefore design ASET, a
Reinforcement Learning based scheduling algorithm able to adapt its decisions
according to the system conditions. Our results show that ASET effectively
provides the best performance compared to static policies when scheduling over
a distributed pool of edge resources.
Related papers
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints.
Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z)
- Accuracy-Guaranteed Collaborative DNN Inference in Industrial IoT via Deep Reinforcement Learning [10.223526707269537]
Collaboration among industrial Internet of Things (IoT) devices and edge networks is essential to support computation-intensive deep neural network (DNN) inference services.
In this paper, we investigate the collaborative inference problem in industrial IoT networks.
arXiv Detail & Related papers (2022-12-31T05:53:17Z)
- MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems [12.215537834860699]
Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize application Quality of Service (QoS).
We propose MCDS: Monte Carlo Learning using Deep Surrogate Models to efficiently schedule workflow applications in mobile edge-cloud computing systems.
arXiv Detail & Related papers (2021-12-14T10:00:01Z)
- Architecture Aware Latency Constrained Sparse Neural Networks [35.50683537052815]
In this paper, we design an architecture aware latency constrained sparse framework to prune and accelerate CNN models.
We also propose a novel sparse convolution algorithm for efficient computation.
Our system-algorithm co-design framework can achieve a much better frontier between network accuracy and latency on resource-constrained mobile devices.
arXiv Detail & Related papers (2021-09-01T03:41:31Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Deep Reinforcement Learning for Resource Constrained Multiclass Scheduling in Wireless Networks [0.0]
In our setup, the available limited bandwidth resources are allocated in order to serve randomly arriving service demands.
We propose a distributional Deep Deterministic Policy Gradient (DDPG) algorithm combined with Deep Sets to tackle the problem.
Our proposed algorithm is tested on both synthetic and real data, showing consistent gains against state-of-the-art conventional methods.
arXiv Detail & Related papers (2020-11-27T09:49:38Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)