Solving the flexible job-shop scheduling problem through an enhanced deep reinforcement learning approach
- URL: http://arxiv.org/abs/2310.15706v2
- Date: Tue, 30 Jan 2024 08:05:17 GMT
- Title: Solving the flexible job-shop scheduling problem through an enhanced deep reinforcement learning approach
- Authors: Imanol Echeverria, Maialen Murua, Roberto Santana
- Abstract summary: This paper introduces a new DRL method for solving the flexible job-shop scheduling problem, particularly for large instances.
The approach is based on the use of heterogeneous graph neural networks to model a more informative graph representation of the problem.
- Score: 1.565361244756411
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In scheduling problems common in the industry and various real-world
scenarios, responding in real-time to disruptive events is essential. Recent
methods propose the use of deep reinforcement learning (DRL) to learn policies
capable of generating solutions under this constraint. The objective of this
paper is to introduce a new DRL method for solving the flexible job-shop
scheduling problem, particularly for large instances. The approach is based on
the use of heterogeneous graph neural networks to model a more informative
graph representation of the problem. This novel modeling of the problem
enhances the policy's ability to capture state information and improves its
decision-making capacity. Additionally, we introduce two novel strategies to
enhance the performance of the DRL method: the first involves generating a
diverse set of scheduling policies, while the second combines DRL with
dispatching rules (DRs) that constrain the action space. Experimental results
on two public benchmarks
show that our approach outperforms DRs and achieves superior results compared
to three state-of-the-art DRL methods, particularly for large instances.
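As a rough illustration of the second strategy, combining DRL with dispatching rules that constrain the action space, one can mask the policy's logits so that only actions endorsed by at least one rule remain selectable. The sketch below is a minimal Python rendering under assumed data structures; the two rules shown (SPT and MWKR) and the masking scheme are generic stand-ins, not the paper's actual design.

```python
import numpy as np

def dr_action_mask(candidates, proc_time, work_remaining):
    """Keep candidate actions endorsed by at least one dispatching rule.

    candidates: list of eligible (job, machine) pairs at this decision point.
    proc_time[j][m]: processing time of job j's next operation on machine m.
    work_remaining[j]: total remaining processing time of job j.
    """
    # Shortest Processing Time (SPT): cheapest eligible assignment.
    spt = min(candidates, key=lambda jm: proc_time[jm[0]][jm[1]])
    # Most Work Remaining (MWKR): favor the job with the most work left.
    mwkr = max(candidates, key=lambda jm: work_remaining[jm[0]])
    allowed = {spt, mwkr}
    return np.array([jm in allowed for jm in candidates])

def constrained_sample(logits, mask, rng=None):
    """Sample from the policy distribution restricted to endorsed actions.

    logits: np.ndarray aligned with `candidates`, produced by the policy.
    """
    rng = np.random.default_rng() if rng is None else rng
    masked = np.where(mask, logits, -np.inf)       # forbid the rest
    probs = np.exp(masked - masked[mask].max())    # stable softmax
    return rng.choice(len(logits), p=probs / probs.sum())
```

Restricting sampling to rule-endorsed actions shrinks the effective search space, which is one way such a hybrid can help on large instances.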
Related papers
- Offline reinforcement learning for job-shop scheduling problems [1.3927943269211593]
This paper introduces a novel offline RL method designed for optimization problems with complex constraints.
Our approach encodes actions in edge attributes and balances expected rewards with the imitation of expert solutions.
We demonstrate the effectiveness of this method on job-shop scheduling and flexible job-shop scheduling benchmarks.
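One generic way to balance expected rewards against imitation of expert solutions, in the spirit of TD3+BC rather than this paper's exact edge-attribute formulation, is an actor loss that mixes a critic term with a behavior-cloning term:

```python
import torch
import torch.nn.functional as F

def actor_loss(policy, critic, states, expert_actions, alpha=2.5):
    """Offline actor loss mixing expected return with imitation.

    TD3+BC-style sketch, not the paper's exact objective: the critic term
    pushes toward high-value actions, the behavior-cloning term keeps the
    policy near expert (e.g., solver-generated) decisions.
    """
    actions = policy(states)                       # deterministic actor
    q = critic(states, actions)                    # estimated returns
    lam = alpha / q.abs().mean().detach()          # normalize the value term
    bc = F.mse_loss(actions, expert_actions)       # imitation term
    return -lam * q.mean() + bc                    # loss to minimize
```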
arXiv Detail & Related papers (2024-10-21T07:33:42Z)
- Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem [1.3927943269211593]
This paper aims to integrate constraint programming (CP) within a deep learning (DL) based methodology, leveraging the benefits of both.
We introduce a method that involves training a DL model using optimal solutions generated by CP, ensuring the model learns from high-quality data.
Our hybrid approach has been extensively tested on three public FJSSP benchmarks, demonstrating superior performance over five state-of-the-art DRL approaches.
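Training a DL model on CP-generated optimal solutions amounts to behavior cloning on high-quality labels. A minimal sketch, assuming per-decision integer action labels (`cp_actions`) extracted from the CP schedules; the model and batch format are hypothetical:

```python
import torch.nn.functional as F

def imitation_step(model, optimizer, batch, cp_actions):
    """One behavior-cloning step on CP-optimal labels (hypothetical shapes).

    batch: encoded instance states at each decision point.
    cp_actions: integer index of the action the CP solution took there.
    """
    logits = model(batch)                          # scores over feasible actions
    loss = F.cross_entropy(logits, cp_actions)     # match the CP decision
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```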
arXiv Detail & Related papers (2024-03-14T10:16:57Z)
- Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage [32.578787778183546]
Offline reinforcement learning (RL) algorithms learn optimal policies using historical (offline) data.
One of the main challenges in offline RL is the distribution shift.
We propose two offline RL algorithms using the distributionally robust learning (DRL) framework.
arXiv Detail & Related papers (2023-10-27T19:19:30Z)
- Flexible Job Shop Scheduling via Dual Attention Network Based Reinforcement Learning [73.19312285906891]
In the flexible job shop scheduling problem (FJSP), operations can be processed on multiple machines, leading to intricate relationships between operations and machines.
Recent works have employed deep reinforcement learning (DRL) to learn priority dispatching rules (PDRs) for solving FJSP.
This paper presents a novel end-to-end learning framework that weds the merits of self-attention models for deep feature extraction and DRL for scalable decision-making.
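A toy picture of how attention can score operation-machine pairs (the dimensions and wiring here are illustrative and much simpler than the paper's dual-attention architecture):

```python
import torch.nn as nn

class OpMachineAttention(nn.Module):
    """Toy cross-attention producing operation-machine compatibility scores."""
    def __init__(self, d=64):
        super().__init__()
        self.q = nn.Linear(d, d)    # queries from operation embeddings
        self.k = nn.Linear(d, d)    # keys from machine embeddings

    def forward(self, ops, machines):
        # ops: (n_ops, d), machines: (n_machines, d)
        scores = self.q(ops) @ self.k(machines).T / (ops.shape[-1] ** 0.5)
        return scores               # (n_ops, n_machines) pair logits
```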
arXiv Detail & Related papers (2023-05-09T01:35:48Z)
- Offline Policy Optimization in RL with Variance Regularization [142.87345258222942]
We propose variance regularization for offline RL algorithms, using stationary distribution corrections.
We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer.
The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms.
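The shape of the objective can be conveyed with a naive sample estimate; the paper's contribution is precisely avoiding the bias of this naive form via stationary distribution corrections and Fenchel duality:

```python
def variance_regularized_objective(returns, lam=0.1):
    """Naive variance-penalized objective: mean return minus lam * variance.

    returns: tensor of sampled returns. Estimating the gradient of
    Var = E[R^2] - (E[R])^2 from one batch raises the double-sampling
    issue that OVAR sidesteps via Fenchel duality; this naive form only
    conveys the shape of the objective.
    """
    return returns.mean() - lam * returns.var()
```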
arXiv Detail & Related papers (2022-12-29T18:25:01Z)
- Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop Scheduling [30.45126420996238]
This paper proposes a novel DRL-guided improvement heuristic for solving JSSP, in which a graph representation is employed to encode complete solutions.
We design a Graph Neural Network-based representation scheme with two modules that capture the dynamic topology and the different node types of the graphs encountered during the improvement process.
We prove that our method scales linearly with problem size. Experiments on classic benchmarks show that the improvement policy learned by our method outperforms state-of-the-art DRL-based methods by a large margin.
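An improvement policy of this kind can be pictured as a loop that repeatedly encodes the current complete solution as a graph, asks the policy for a local move, and keeps the incumbent best. All callables below are hypothetical placeholders, and greedy acceptance is only one possible rule:

```python
def improve(solution, policy, encode_as_graph, apply_move, makespan, steps=200):
    """Policy-guided improvement loop over complete JSSP solutions.

    encode_as_graph builds the graph the GNN policy reads, apply_move
    rewires the schedule (e.g., swaps two operations on a machine), and
    makespan evaluates it; all are hypothetical helpers.
    """
    best = solution
    for _ in range(steps):
        move = policy(encode_as_graph(solution))   # pick a local move
        solution = apply_move(solution, move)
        if makespan(solution) < makespan(best):
            best = solution                        # keep the incumbent
    return best
```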
arXiv Detail & Related papers (2022-11-20T10:20:13Z)
- Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [70.20191211010847]
Offline reinforcement learning (RL) aims to learn an optimal policy using a previously collected static dataset.
We introduce Diffusion Q-learning (Diffusion-QL) that utilizes a conditional diffusion model to represent the policy.
We show that our method can achieve state-of-the-art performance on the majority of the D4RL benchmark tasks.
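A conditional diffusion policy produces an action by iteratively denoising Gaussian noise, conditioned on the state. The sketch below is a generic DDPM-style reverse process with an illustrative noise schedule, not Diffusion-QL's actual sampler:

```python
import torch

@torch.no_grad()
def sample_action(eps_net, state, act_dim, T=5):
    """Minimal reverse diffusion for a state-conditioned policy.

    eps_net(a_t, t, state) is a hypothetical noise-prediction network; the
    loop denoises pure Gaussian noise into an action in [-1, 1].
    """
    betas = torch.linspace(1e-4, 0.1, T)           # illustrative schedule
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    a = torch.randn(state.shape[0], act_dim)       # start from noise
    for t in reversed(range(T)):
        t_batch = torch.full((state.shape[0],), t)
        eps = eps_net(a, t_batch, state)           # predicted noise
        a = (a - betas[t] / torch.sqrt(1 - alpha_bar[t]) * eps) \
            / torch.sqrt(alphas[t])
        if t > 0:                                  # no noise on the last step
            a = a + torch.sqrt(betas[t]) * torch.randn_like(a)
    return a.clamp(-1.0, 1.0)
```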
arXiv Detail & Related papers (2022-08-12T09:54:11Z)
- DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide-and-conquer framework (DCF).
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention-based policy network in our lower-level DRL model to construct the route for each UAV, with the objective of maximizing the number of executed tasks.
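The divide-and-conquer split can be sketched as two cooperating policies, one allocating tasks and one routing each UAV; both policy objects below are hypothetical stand-ins for the paper's encoder-decoder and attention networks:

```python
def schedule(tasks, uavs, upper_policy, lower_policy):
    """Divide-and-conquer sketch of a double-level scheduler."""
    assignment = {u: [] for u in range(len(uavs))}
    for task in tasks:
        u = upper_policy(task, uavs)               # allocate task to a UAV
        assignment[u].append(task)
    # Route each UAV's tasks to maximize the number it can execute.
    return {u: lower_policy(assignment[u], uavs[u]) for u in assignment}
```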
arXiv Detail & Related papers (2022-08-04T04:35:53Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
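The core JSRL mechanism is easy to sketch: roll in with the guide policy for the first h steps, then hand control to the learning policy, and anneal h toward zero as the learner improves (classic Gym-style environment API assumed):

```python
def jsrl_rollout(env, guide_policy, explore_policy, h):
    """One JSRL episode: the guide acts for the first h steps, then the
    exploration (learning) policy takes over."""
    state, done, t, traj = env.reset(), False, 0, []
    while not done:
        policy = guide_policy if t < h else explore_policy
        action = policy(state)
        next_state, reward, done, _ = env.step(action)
        traj.append((state, action, reward, next_state, done))
        state, t = next_state, t + 1
    return traj
```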
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Improving Generalization of Deep Reinforcement Learning-based TSP Solvers [19.29028564568974]
We propose a novel approach named MAGIC that includes a deep learning architecture and a DRL training method.
Our architecture, which integrates a multilayer perceptron, a graph neural network, and an attention model, defines a policy that sequentially generates a traveling salesman solution.
Our training method includes several innovations: (1) we interleave DRL policy gradient updates with local search (using a new local search technique), (2) we use a novel simple baseline, and (3) we apply curriculum learning.
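One plausible reading of innovation (1), interleaving policy updates with local search, is to use the locally improved tour's cost as a baseline in a REINFORCE update; the helpers passed in are hypothetical:

```python
def train_epoch(policy, optimizer, instances, local_search, cost):
    """Interleave REINFORCE updates with local search (hypothetical helpers).

    The locally improved tour's cost serves as a baseline, so the policy is
    rewarded for tours that local search cannot improve much further.
    """
    for inst in instances:
        tour, log_probs = policy.rollout(inst)     # sample a tour + log-probs
        improved = local_search(inst, tour)        # e.g., 2-opt refinement
        advantage = cost(inst, tour) - cost(inst, improved)
        loss = advantage * log_probs.sum()         # REINFORCE with baseline
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```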
arXiv Detail & Related papers (2021-10-06T15:16:19Z)
- An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)