Curriculum Learning in Job Shop Scheduling using Reinforcement Learning
- URL: http://arxiv.org/abs/2305.10192v1
- Date: Wed, 17 May 2023 13:15:27 GMT
- Title: Curriculum Learning in Job Shop Scheduling using Reinforcement Learning
- Authors: Constantin Waubert de Puiseau, Hasan Tercan, Tobias Meisen
- Abstract summary: Deep Reinforcement Learning (DRL) dynamically adjusts an agent's planning strategy in response to difficult instances.
We further improve DRL as an underlying method by actively incorporating the variability of difficulty within the same problem size into the design of the learning process.
- Score: 0.3867363075280544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Solving job shop scheduling problems (JSSPs) with a fixed strategy, such as a
priority dispatching rule, may yield satisfactory results for several problem
instances but insufficient results for others. From this single-strategy
perspective, finding a near-optimal solution to a specific JSSP varies in
difficulty even if the machine setup remains the same. An intensively
researched and promising recent method for dealing with this difficulty
variability is Deep Reinforcement Learning (DRL), which dynamically adjusts an agent's
planning strategy in response to difficult instances not only during training,
but also when applied to new situations. In this paper, we further improve DRL
as an underlying method by actively incorporating the variability of difficulty
within the same problem size into the design of the learning process. We base
our approach on a state-of-the-art methodology that solves JSSP by means of DRL
and graph neural network embeddings. Our work supplements the training routine
of the agent by a curriculum learning strategy that ranks the problem instances
shown during training by a new metric of problem instance difficulty. Our
results show that certain curricula lead to significantly better performances
of the DRL solutions. Agents trained on these curricula beat the top
performance of those trained on randomly distributed training data, achieving
3.2% shorter average makespans.
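To make the curriculum idea concrete, the following is a minimal sketch of difficulty-ranked training in Python. The difficulty proxy (relative gap between a dispatching-rule makespan and a simple lower bound) and all names (spt_makespan, makespan_lower_bound, train_on) are illustrative assumptions, not the metric or interface used in the paper.

```python
import random

def difficulty(instance, heuristic_makespan, lower_bound):
    # Hypothetical difficulty proxy (not the paper's metric): relative gap
    # between a fast dispatching-rule makespan and a simple lower bound.
    lb = lower_bound(instance)
    return (heuristic_makespan(instance) - lb) / lb

def curriculum_stages(instances, difficulty_fn, n_stages=5):
    # Rank instances from easy to hard and release them in stages. Each
    # stage re-mixes everything seen so far, so the agent keeps rehearsing
    # easier instances instead of forgetting them.
    ranked = sorted(instances, key=difficulty_fn)
    seen = []
    for stage in range(n_stages):
        lo = stage * len(ranked) // n_stages
        hi = (stage + 1) * len(ranked) // n_stages
        seen.extend(ranked[lo:hi])
        random.shuffle(seen)
        yield list(seen)

# Illustrative training loop; spt_makespan, makespan_lower_bound, and
# train_on stand in for a dispatching-rule solver, a makespan bound,
# and one DRL training pass over a batch of instances.
# for batch in curriculum_stages(train_set,
#                                lambda i: difficulty(i, spt_makespan,
#                                                     makespan_lower_bound)):
#     train_on(agent, batch)
```

Note that the abstract reports that only certain curricula improve over random ordering, so the staging schedule and the difficulty metric are themselves design choices to be tuned.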
Related papers
- On Task Performance and Model Calibration with Supervised and Self-Ensembled In-Context Learning [71.44986275228747]
In-context learning (ICL) has become an efficient approach propelled by the recent advancements in large language models (LLMs).
However, both paradigms are prone to suffer from the critical problem of overconfidence (i.e., miscalibration).
arXiv Detail & Related papers (2023-12-21T11:55:10Z)
- A Reinforcement Learning-assisted Genetic Programming Algorithm for Team Formation Problem Considering Person-Job Matching [70.28786574064694]
A reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions.
The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams.
arXiv Detail & Related papers (2023-04-08T14:32:12Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs equally as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Sample-Efficient, Exploration-Based Policy Optimisation for Routing Problems [2.6782615615913348]
This paper presents a new reinforcement learning approach based on entropy.
In addition, we design an off-policy-based reinforcement learning technique that maximises the expected return.
We show that our model can generalise to various routing problems.
arXiv Detail & Related papers (2022-05-31T09:51:48Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta-algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks (a minimal sketch of this two-policy scheme follows after this list).
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents [1.3812010983144802]
We propose a deep reinforcement learning methodology for the job shop scheduling problem (JSSP).
The aim is to build up a greedy-like policy that learns on some distribution of JSSP instances, differing in the number of jobs and machines.
As expected, the model can generalize, to some extent, to larger problems and to instances originating from a distribution different from the one used in training.
arXiv Detail & Related papers (2021-10-18T07:55:39Z)
- A Novel Automated Curriculum Strategy to Solve Hard Sokoban Planning Instances [30.32386551923329]
We present a curriculum-driven learning approach that is designed to solve a single hard instance.
We show how the smoothness of the task hardness impacts the final learning results.
Our approach can uncover plans that are far out of reach for any previous state-of-the-art Sokoban solver.
arXiv Detail & Related papers (2021-10-03T00:44:50Z)
- Simplifying Deep Reinforcement Learning via Self-Supervision [51.2400839966489]
Self-Supervised Reinforcement Learning (SSRL) is a simple algorithm that optimizes policies with purely supervised losses.
We show that SSRL is surprisingly competitive with contemporary algorithms, with more stable performance and less running time.
arXiv Detail & Related papers (2021-06-10T06:29:59Z)
- Curriculum Learning with Diversity for Supervised Computer Vision Tasks [1.5229257192293197]
We introduce a novel curriculum sampling strategy that takes into consideration the diversity of the training data together with the difficulty of the inputs.
We prove that our strategy is very efficient for unbalanced data sets, leading to faster convergence and more accurate results.
arXiv Detail & Related papers (2020-09-22T15:32:49Z)
- Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization [100.72335252255989]
We study the problem of learning exploration-exploitation strategies that effectively adapt to dynamic environments.
We propose a novel algorithm that regularizes the training of an RNN-based policy using informed policies trained to maximize the reward in each task.
arXiv Detail & Related papers (2020-05-06T16:14:48Z)
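Because the JSRL entry above only states that two policies are used, here is a minimal sketch of the roll-in idea under stated assumptions: the environment follows the common Gym-style reset/step interface, and shrinking the guide horizon on a threshold schedule is an illustrative choice rather than the paper's exact procedure.

```python
def jsrl_rollout(env, guide_policy, explore_policy, guide_horizon):
    # Roll in with the guide policy for the first `guide_horizon` steps,
    # then hand control to the learning (exploration) policy.
    obs, _ = env.reset()
    trajectory, done, t = [], False, 0
    while not done:
        policy = guide_policy if t < guide_horizon else explore_policy
        action = policy(obs)
        next_obs, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        trajectory.append((obs, action, reward, next_obs, done))
        obs, t = next_obs, t + 1
    return trajectory

# As the exploration policy improves, the guide horizon is reduced toward
# zero (e.g., whenever evaluation returns exceed a threshold), until the
# learner acts from the very first step.
```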
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.