Multi-level Explanation of Deep Reinforcement Learning-based Scheduling
- URL: http://arxiv.org/abs/2209.09645v1
- Date: Sun, 18 Sep 2022 13:22:53 GMT
- Title: Multi-level Explanation of Deep Reinforcement Learning-based Scheduling
- Authors: Shaojun Zhang and Chen Wang and Albert Zomaya
- Abstract summary: Dependency-aware job scheduling in the cluster is NP-hard.
Recent work shows that Deep Reinforcement Learning (DRL) is capable of solving it.
In this paper, we present a multi-level explanation framework to interpret the policy of DRL-based scheduling.
- Score: 3.043569093713764
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Dependency-aware job scheduling in the cluster is NP-hard. Recent work shows
that Deep Reinforcement Learning (DRL) is capable of solving it. It is
difficult for the administrator to understand the DRL-based policy even though
it achieves remarkable performance gains. As a result, a complex model-based
scheduler struggles to gain trust in systems where simplicity is favored.
In this paper, we present a multi-level explanation framework to interpret the
policy of DRL-based scheduling. We dissect its decision-making process into the
job level and the task level and approximate each level with interpretable models
and rules, which align with operational practices. We show that the framework gives
the system administrator insights into the state-of-the-art scheduler and
reveals a robustness issue with regard to its behavior pattern.
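The idea of approximating a learned policy with interpretable rules can be illustrated with a minimal sketch. This is not the paper's framework: the policy below is a hypothetical hand-written stand-in for a trained DRL scheduler, the feature names remaining_work and num_ready_tasks are assumptions, and the surrogate is a generic depth-limited decision tree fitted to the policy's decisions.

```python
from collections import Counter
from itertools import product

def black_box_policy(remaining_work, num_ready_tasks):
    # Hypothetical stand-in for a trained DRL scheduler (NOT the paper's model):
    # returns 1 to schedule the job now, 0 to defer it.
    return 1 if remaining_work < 40 or num_ready_tasks > 6 else 0

def gini(labels):
    # Gini impurity of a list of class labels.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Exhaustively search (feature, threshold) pairs for the lowest
    # weighted Gini impurity; returns None if no valid split exists.
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def fit_tree(rows, labels, depth):
    # A leaf is a plain label; an inner node is (feature, threshold, low, high).
    majority = Counter(labels).most_common(1)[0][0]
    if depth == 0 or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    _, f, t = split
    low = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    high = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            fit_tree([r for r, _ in low], [y for _, y in low], depth - 1),
            fit_tree([r for r, _ in high], [y for _, y in high], depth - 1))

def predict(tree, row):
    # Walk from the root to a leaf and return its label.
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

# Query the black box on a grid of states, then fit the surrogate tree.
states = list(product(range(0, 101, 5), range(0, 11)))
actions = [black_box_policy(w, k) for w, k in states]
tree = fit_tree(states, actions, depth=2)

# Fidelity: fraction of states where the surrogate agrees with the policy.
fidelity = sum(predict(tree, s) == a for s, a in zip(states, actions)) / len(states)
print(tree, fidelity)
```

On this toy grid the depth-2 tree reproduces the stand-in policy exactly, and it reads as a rule: schedule when remaining_work <= 35, otherwise schedule only when num_ready_tasks > 6. A real DRL policy would generally not be matched this cleanly, so the fidelity score is what indicates how far such a surrogate can be trusted.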
Related papers
- Sample-Efficient Reinforcement Learning with Temporal Logic Objectives: Leveraging the Task Specification to Guide Exploration [13.053013407015628]
This paper addresses the problem of learning optimal control policies for systems with uncertain dynamics.
We propose an accelerated RL algorithm that can learn control policies significantly faster than competitive approaches.
arXiv Detail & Related papers (2024-10-16T00:53:41Z)
- Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
- Learning-enabled Flexible Job-shop Scheduling for Scalable Smart Manufacturing [11.509669981978874]
In smart manufacturing systems, flexible job-shop scheduling with transportation constraints (FJSPT) is essential for maximizing productivity.
Recent deep reinforcement learning (DRL)-based methods for FJSPT have encountered a scale generalization challenge.
We introduce a novel graph-based DRL method, named the Heterogeneous Graph Scheduler (HGS).
arXiv Detail & Related papers (2024-02-14T06:49:23Z)
- Hierarchical Continual Reinforcement Learning via Large Language Model [15.837883929274758]
Hi-Core is designed to facilitate the transfer of high-level knowledge.
It orchestrates a two-layer structure: high-level policy formulation by a large language model (LLM)
Hi-Core has demonstrated its effectiveness in handling diverse CRL tasks, outperforming popular baselines.
arXiv Detail & Related papers (2024-01-25T03:06:51Z)
- Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA).
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
- Compositional Reinforcement Learning from Logical Specifications [21.193231846438895]
Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy.
We develop a compositional learning approach, called DiRL, that interleaves high-level planning and reinforcement learning.
Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph.
arXiv Detail & Related papers (2021-06-25T22:54:28Z)
- Deep RL With Information Constrained Policies: Generalization in Continuous Control [21.46148507577606]
We explore the potential learning advantages that a natural constraint on information flow might confer onto artificial agents in continuous control tasks.
We implement a novel Capacity-Limited Actor-Critic (CLAC) algorithm.
Our experiments show that compared to alternative approaches, CLAC offers improvements in generalization between training and modified test environments.
arXiv Detail & Related papers (2020-10-09T15:42:21Z)
- Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z)
- Hierarchical Reinforcement Learning as a Model of Human Task Interleaving [60.95424607008241]
We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
arXiv Detail & Related papers (2020-01-04T17:53:28Z)
- Hierarchical Variational Imitation Learning of Control Programs [131.7671843857375]
We propose a variational inference method for imitation learning of a control policy represented by parametrized hierarchical procedures (PHP).
Our method discovers the hierarchical structure in a dataset of observation-action traces of teacher demonstrations, by learning an approximate posterior distribution over the latent sequence of procedure calls and terminations.
We demonstrate a novel benefit of variational inference in the context of hierarchical imitation learning: in decomposing the policy into simpler procedures, inference can leverage acausal information that is unused by other methods.
arXiv Detail & Related papers (2019-12-29T08:57:02Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.