GATES: Cost-aware Dynamic Workflow Scheduling via Graph Attention Networks and Evolution Strategy
- URL: http://arxiv.org/abs/2505.12355v2
- Date: Tue, 20 May 2025 01:15:11 GMT
- Title: GATES: Cost-aware Dynamic Workflow Scheduling via Graph Attention Networks and Evolution Strategy
- Authors: Ya Shen, Gang Chen, Hui Ma, Mengjie Zhang
- Abstract summary: Cost-aware Dynamic Workflow Scheduling (CADWS) is a key challenge in cloud computing. Deep reinforcement learning (DRL) has been widely employed for automated scheduling policy design. This study proposes a novel DRL method combining a Graph Attention Networks-based policy network and Evolution Strategy, referred to as GATES.
- Score: 7.653021685451039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cost-aware Dynamic Workflow Scheduling (CADWS) is a key challenge in cloud computing, focusing on devising an effective scheduling policy to efficiently schedule dynamically arriving workflow tasks, represented as Directed Acyclic Graphs (DAGs), to suitable virtual machines (VMs). Deep reinforcement learning (DRL) has been widely employed for automated scheduling policy design. However, the performance of DRL is heavily influenced by the design of the problem-tailored policy network and is highly sensitive to hyperparameters and the design of reward feedback. Considering the above-mentioned issues, this study proposes a novel DRL method combining a Graph Attention Networks-based policy network and Evolution Strategy, referred to as GATES. The contributions of GATES are summarized as follows: (1) GATES can capture the impact of current task scheduling on subsequent tasks by learning the topological relationships between tasks in a DAG. (2) GATES can assess the importance of each VM to the ready task, enabling it to adapt to dynamically changing VM resources. (3) Utilizing Evolution Strategy's robustness, exploratory nature, and tolerance for delayed rewards, GATES achieves stable policy learning in CADWS. Extensive experimental results demonstrate the superiority of the proposed GATES in CADWS, outperforming several state-of-the-art algorithms. The source code is available at: https://github.com/YaShen998/GATES.
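The training recipe described above (a graph-attention policy whose parameters are optimized by an Evolution Strategy that consumes only each episode's final return) can be illustrated with a minimal sketch. This is a hedged illustration, not the authors' code: the GAT policy is collapsed to a flat parameter vector `theta`, and `episode_return` is a toy stand-in for a CADWS simulator (see the linked repository for the real implementation).

```python
# Minimal sketch of the ES half of GATES, under stated assumptions:
# `theta` stands in for the flattened GAT policy parameters and
# `episode_return` for one simulated CADWS episode (both are toys here).
import numpy as np

def episode_return(theta: np.ndarray) -> float:
    """Toy stand-in: schedule one episode with policy `theta` and return
    its total (negative) cost; here just a smooth test function."""
    return -float(np.sum((theta - 1.0) ** 2))

def es_step(theta, pop_size=32, sigma=0.1, lr=0.02, rng=None):
    """One OpenAI-style ES update: sample perturbations, evaluate whole
    episodes, and move along the return-weighted average perturbation.
    Only final returns are needed, which is why ES tolerates delayed rewards."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal((pop_size, theta.size))
    returns = np.array([episode_return(theta + sigma * e) for e in eps])
    norm = (returns - returns.mean()) / (returns.std() + 1e-8)  # z-score returns
    return theta + (lr / (pop_size * sigma)) * eps.T @ norm

theta = np.zeros(8)
rng = np.random.default_rng(42)
for _ in range(300):
    theta = es_step(theta, rng=rng)
print("final episode return:", episode_return(theta))  # approaches 0
```

In the full method, `theta` would parameterize the graph attention layers that score each ready task against every candidate VM; here that machinery is deliberately abstracted away.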
Related papers
- Improving Large Language Model Planning with Action Sequence Similarity [50.52049888490524]
In this work, we explore how to improve the model planning capability through in-context learning (ICL).
We propose GRASE-DC: a two-stage pipeline that first re-samples exemplars with high action-sequence (AS) similarity and then curates the selected exemplars.
Our experimental result confirms that GRASE-DC achieves significant performance improvement on various planning tasks.
arXiv Detail & Related papers (2025-05-02T05:16:17Z)
- Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models [71.34520793462069]
Unsupervised reinforcement learning (RL) aims at pre-training agents that can solve a wide range of downstream tasks in complex environments.
We introduce a novel algorithm regularizing unsupervised RL towards imitating trajectories from unlabeled behavior datasets.
We demonstrate the effectiveness of this new approach in a challenging humanoid control problem.
arXiv Detail & Related papers (2025-04-15T10:41:11Z)
- Plan-over-Graph: Towards Parallelable LLM Agent Schedule [53.834646147919436]
Large Language Models (LLMs) have demonstrated exceptional abilities in reasoning for task planning.
This paper introduces a novel paradigm, plan-over-graph, in which the model first decomposes a real-life textual task into executable subtasks and constructs an abstract task graph.
The model then takes this task graph as input and generates a plan for parallel execution (a minimal sketch of this parallel execution follows this entry).
arXiv Detail & Related papers (2025-02-20T13:47:51Z)
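As referenced above, the execution side of plan-over-graph reduces to running every subtask whose prerequisites are finished. Below is a minimal sketch of that idea using Python's standard `graphlib`; the task graph is invented for illustration and is not from the paper.

```python
# Hedged sketch of parallel execution over a task graph: any task whose
# prerequisites are done can run concurrently. The example graph is made up.
from graphlib import TopologicalSorter

deps = {                      # task -> set of prerequisite tasks (illustrative)
    "brew_coffee": {"boil_water", "grind_beans"},
    "boil_water": set(),
    "grind_beans": set(),
    "serve": {"brew_coffee"},
}

ts = TopologicalSorter(deps)
ts.prepare()
while ts.is_active():
    batch = list(ts.get_ready())   # all tasks runnable in parallel right now
    print("parallel batch:", batch)
    for task in batch:             # a real agent would dispatch these concurrently
        ts.done(task)
```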
- TS-EoH: An Edge Server Task Scheduling Algorithm Based on Evolution of Heuristic [0.6827423171182154]
This paper introduces a novel task-scheduling approach based on evolutionary computation (EC) theory and evolutionary algorithms.
Experimental results show that our task-scheduling algorithm outperforms existing and traditional reinforcement learning methods.
arXiv Detail & Related papers (2024-09-04T10:00:32Z)
- Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs).
In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z)
- Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture [17.516934379812994]
We present "Orchestrated AIs," an approach that integrates various tasks with logic-driven decisions into dynamic, sophisticated AIs.
We find that the intrinsic Dual Dynamicity of Orchestrated AIs can be effectively represented using the Orchestrated spatial Graph.
Our evaluations demonstrate that significantly outperforms traditional architectures in handling the dynamic demands of Orchestrated AIs.
arXiv Detail & Related papers (2024-05-21T14:09:31Z)
- Intelligent Hybrid Resource Allocation in MEC-assisted RAN Slicing Network [72.2456220035229]
We aim to maximize the SSR for heterogeneous service demands in the cooperative MEC-assisted RAN slicing system.
We propose a recurrent graph reinforcement learning (RGRL) algorithm to intelligently learn the optimal hybrid resource allocation (RA) policy.
arXiv Detail & Related papers (2024-05-02T01:36:13Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme (a rough sketch of the idea follows this entry).
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
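The sketch promised above: one plausible way to discretize a continuous action space is to cluster the actions observed in the offline dataset and restrict the policy to the resulting centroids. The k-means routine and random dataset are illustrative assumptions, not the paper's adaptive scheme.

```python
# Rough sketch of action quantization for offline RL: cluster the dataset's
# continuous actions, then snap any action to its nearest discrete centroid.
# Illustrative only; the paper proposes an adaptive scheme, not plain k-means.
import numpy as np

def kmeans(actions: np.ndarray, k: int, iters: int = 50, seed: int = 0):
    rng = np.random.default_rng(seed)
    centroids = actions[rng.choice(len(actions), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(actions[:, None] - centroids[None], axis=-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = actions[labels == j].mean(axis=0)
    return centroids

dataset_actions = np.random.default_rng(1).uniform(-1, 1, size=(1000, 2))
codebook = kmeans(dataset_actions, k=16)           # discrete action set
a = np.array([0.3, -0.7])                          # a continuous action
idx = np.argmin(np.linalg.norm(codebook - a, axis=1))
print("quantized action:", codebook[idx])          # nearest discrete action
```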
- Edge Generation Scheduling for DAG Tasks Using Deep Reinforcement Learning [2.365237699556817]
Directed acyclic graph (DAG) tasks are currently adopted in the real-time domain to model complex applications.
We propose a new DAG scheduling framework that attempts to minimize the DAG width by iteratively generating edges.
We evaluate the effectiveness of the proposed algorithm by comparing it with state-of-the-art DAG scheduling methods and an optimal mixed-integer linear programming baseline (a simplified sketch of the edge-generation idea follows this entry).
arXiv Detail & Related papers (2023-08-28T15:19:18Z)
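The sketch promised above: a DAG's "width" limits how many tasks can be ready simultaneously, and generating a precedence edge between independent tasks reduces it. The level-based width below is a simple proxy metric on a made-up graph, not the paper's exact formulation.

```python
# Simplified sketch of edge generation for DAG scheduling: adding a precedence
# edge between independent tasks shrinks the number of simultaneously ready
# tasks. Level-based width is a proxy; the example DAG is invented.
from graphlib import TopologicalSorter

def level_width(deps):
    """Max number of tasks that become ready in the same topological level."""
    ts = TopologicalSorter(deps)
    ts.prepare()
    width = 0
    while ts.is_active():
        ready = list(ts.get_ready())
        width = max(width, len(ready))
        ts.done(*ready)
    return width

deps = {"a": set(), "b": set(), "c": set(), "d": {"a", "b", "c"}}
print(level_width(deps))                 # 3: a, b, c are all ready at once
deps["b"] = {"a"}                        # generate edge a -> b
deps["c"] = {"b"}                        # generate edge b -> c
print(level_width(deps))                 # 1: tasks now run sequentially
```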
- GA-DRL: Graph Neural Network-Augmented Deep Reinforcement Learning for DAG Task Scheduling over Dynamic Vehicular Clouds [35.418964557667096]
We propose a graph neural network-augmented deep reinforcement learning scheme (GA-DRL) for scheduling DAG tasks over dynamic VCs.
GA-DRL outperforms existing benchmarks in terms of DAG task completion time.
arXiv Detail & Related papers (2023-07-03T06:41:15Z)
- A Scalable Deep Reinforcement Learning Model for Online Scheduling Coflows of Multi-Stage Jobs for High Performance Computing [9.866286878494979]
In multi-stage jobs, each job consists of multiple coflows and is represented by a Directed Acyclic Graph (DAG).
In this paper, we propose a novel Pipelined-DAGNN to process the input, along with a novel coflow scheduling algorithm.
arXiv Detail & Related papers (2021-12-21T09:36:55Z)
- Efficient Dynamic Graph Representation Learning at Scale [66.62859857734104]
We propose Efficient Dynamic Graph lEarning (EDGE), which selectively expresses certain temporal dependencies via training loss to improve the parallelism in computations.
We show that EDGE can scale to dynamic graphs with millions of nodes and hundreds of millions of temporal events and achieve new state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2021-12-14T22:24:53Z)