An actor-critic algorithm with policy gradients to solve the job shop
scheduling problem using deep double recurrent agents
- URL: http://arxiv.org/abs/2110.09076v2
- Date: Tue, 21 Nov 2023 12:23:27 GMT
- Title: An actor-critic algorithm with policy gradients to solve the job shop
scheduling problem using deep double recurrent agents
- Authors: Marta Monaci, Valerio Agasucci and Giorgio Grani
- Abstract summary: We propose a deep reinforcement learning methodology for the job shop scheduling problem (JSSP)
The aim is to build up a greedy-like able to learn on some distribution of JSSP instances, different in the number of jobs and machines.
As expected, the model can generalize, to some extent, to larger problems or instances originated by a different distribution from the one used in training.
- Score: 1.3812010983144802
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is a growing interest in integrating machine learning techniques and
optimization to solve challenging optimization problems. In this work, we
propose a deep reinforcement learning methodology for the job shop scheduling
problem (JSSP). The aim is to build up a greedy-like heuristic able to learn on
some distribution of JSSP instances, different in the number of jobs and
machines. The need for fast scheduling methods is well known, and it arises in
many areas, from transportation to healthcare. We model the JSSP as a Markov
Decision Process and then we exploit the efficacy of reinforcement learning to
solve the problem. We adopt an actor-critic scheme, where the action taken by
the agent is influenced by policy considerations on the state-value function.
The procedures are adapted to take into account the challenging nature of JSSP,
where the state and the action space change not only for every instance but
also after each decision. To tackle the variability in the number of jobs and
operations in the input, we modeled the agent using two incident LSTM models, a
special type of deep neural network. Experiments show the algorithm reaches
good solutions in a short time, proving that is possible to generate new greedy
heuristics just from learning-based methodologies. Benchmarks have been
generated in comparison with the commercial solver CPLEX. As expected, the
model can generalize, to some extent, to larger problems or instances
originated by a different distribution from the one used in training.
Related papers
- Two-Timescale Model Caching and Resource Allocation for Edge-Enabled AI-Generated Content Services [55.0337199834612]
Generative AI (GenAI) has emerged as a transformative technology, enabling customized and personalized AI-generated content (AIGC) services.
These services require executing GenAI models with billions of parameters, posing significant obstacles to resource-limited wireless edge.
We introduce the formulation of joint model caching and resource allocation for AIGC services to balance a trade-off between AIGC quality and latency metrics.
arXiv Detail & Related papers (2024-11-03T07:01:13Z) - Decision Transformer for Enhancing Neural Local Search on the Job Shop Scheduling Problem [10.316443594063173]
Job shop scheduling problem (JSSP) and its solution algorithms have been of enduring interest in both academia and industry for decades.
In recent years, machine learning (ML) is playing an increasingly important role in advancing existing and building new solutions for the JSSP, aiming to find better solutions in shorter times.
We build on top of a state-of-the-art deep reinforcement learning (DRL) agent, called Neural Local Search (NLS), which can efficiently and effectively control a large local neighborhood search on the JSSP.
arXiv Detail & Related papers (2024-09-04T13:33:38Z) - An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling
Problems Based on Constraint Programming [5.070542698701157]
This paper proposes a novel end-to-end approach to solving scheduling problems by means of CP and Reinforcement Learning (RL)
Our approach leverages existing CP solvers to train an agent learning a Priority Dispatching Rule (PDR) that generalizes well to large instances, even from separate datasets.
arXiv Detail & Related papers (2023-06-09T08:24:56Z) - Curriculum Learning in Job Shop Scheduling using Reinforcement Learning [0.3867363075280544]
Deep Reinforcement Learning (DRL) dynamically adjusts an agent's planning strategy in response to difficult instances.
We further improve DLR as an underlying method by actively incorporating the variability of difficulty within the same problem size into the design of the learning process.
arXiv Detail & Related papers (2023-05-17T13:15:27Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC)
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z) - Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a fixed-depth'' for all inputs.
Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks.
arXiv Detail & Related papers (2020-06-09T07:22:01Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z) - Physarum Powered Differentiable Linear Programming Layers and
Applications [48.77235931652611]
We propose an efficient and differentiable solver for general linear programming problems.
We show the use of our solver in a video segmentation task and meta-learning for few-shot learning.
arXiv Detail & Related papers (2020-04-30T01:50:37Z) - Model-based Multi-Agent Reinforcement Learning with Cooperative
Prioritized Sweeping [4.5497948012757865]
We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping.
The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function.
Our method outperforms the state-of-the-art algorithm sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and randomized environments.
arXiv Detail & Related papers (2020-01-15T19:13:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.