An efficient deep reinforcement learning environment for flexible job-shop scheduling
- URL: http://arxiv.org/abs/2509.07019v1
- Date: Sun, 07 Sep 2025 02:50:09 GMT
- Title: An efficient deep reinforcement learning environment for flexible job-shop scheduling
- Authors: Xinquan Wu, Xuefeng Yan, Mingqiang Wei, Donghai Guan
- Abstract summary: This paper presents a simple chronological DRL environment for FJSP based on discrete event simulation. An end-to-end DRL scheduling model is proposed based on proximal policy optimization (PPO).
- Score: 34.437280280525734
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The Flexible Job-shop Scheduling Problem (FJSP) is a classical combinatorial optimization problem with a wide range of real-world applications. To generate fast and accurate scheduling solutions for FJSP, various deep reinforcement learning (DRL) scheduling methods have been developed. However, these methods mainly focus on the design of the DRL scheduling agent, overlooking the modeling of the DRL environment. This paper presents a simple chronological DRL environment for FJSP based on discrete event simulation, and an end-to-end DRL scheduling model is proposed based on proximal policy optimization (PPO). Furthermore, a short novel state representation of FJSP is proposed based on two state variables in the scheduling environment, and a novel, comprehensible reward function is designed based on the scheduling area of machines. Experimental results on public benchmark instances show that the performance of simple priority dispatching rules (PDRs) improves in our scheduling environment, and that our DRL scheduling model obtains competitive performance compared with OR-Tools, meta-heuristic, DRL, and PDR scheduling methods.
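To make the setting concrete, below is a minimal, self-contained sketch of a step-based FJSP scheduling environment of the kind the abstract describes. It is an illustration, not the paper's design: the paper's two state variables and its scheduling-area reward are not reproduced. Instead, this sketch uses per-job and per-machine ready times as the state, the negative increase in makespan as the reward, and rolls out a shortest-processing-time (SPT) dispatching rule on a tiny hypothetical instance.

```python
class FJSPEnv:
    """Minimal FJSP environment sketch (not the paper's exact design).
    Jobs are sequences of operations; each operation can run on a subset
    of machines with given processing times. At each step the agent picks
    a feasible (job, machine) pair for that job's next operation."""

    def __init__(self, jobs, n_machines):
        # jobs: list of jobs; each job is a list of operations, and each
        # operation is a dict {machine_index: processing_time}.
        self.jobs = jobs
        self.n_machines = n_machines
        self.reset()

    def reset(self):
        self.next_op = [0] * len(self.jobs)      # next operation per job
        self.job_ready = [0.0] * len(self.jobs)  # time each job is free
        self.mach_ready = [0.0] * self.n_machines
        return self._state()

    def _state(self):
        # Hypothetical compact state: ready times only (the paper's two
        # state variables are assumed, not reproduced here).
        return (tuple(self.job_ready), tuple(self.mach_ready))

    def legal_actions(self):
        acts = []
        for j, k in enumerate(self.next_op):
            if k < len(self.jobs[j]):
                for m in self.jobs[j][k]:
                    acts.append((j, m))
        return acts

    def step(self, action):
        j, m = action
        p = self.jobs[j][self.next_op[j]][m]
        finish = max(self.job_ready[j], self.mach_ready[m]) + p
        # Reward sketch: negative increase in makespan (a common proxy;
        # the paper's scheduling-area reward is not reproduced).
        old_makespan = max(self.mach_ready)
        self.job_ready[j] = finish
        self.mach_ready[m] = finish
        self.next_op[j] += 1
        reward = -(max(self.mach_ready) - old_makespan)
        done = all(self.next_op[i] >= len(self.jobs[i])
                   for i in range(len(self.jobs)))
        return self._state(), reward, done

def makespan_with_rule(env, rule):
    """Roll a dispatching rule out to completion; return the makespan."""
    env.reset()
    done = False
    while not done:
        _, _, done = env.step(rule(env))
    return max(env.mach_ready)

def spt_rule(env):
    # Shortest processing time: pick the cheapest feasible assignment.
    return min(env.legal_actions(),
               key=lambda a: env.jobs[a[0]][env.next_op[a[0]]][a[1]])

# Tiny hypothetical 2-job, 2-machine instance.
jobs = [
    [{0: 3, 1: 5}, {1: 2}],  # job 0: op0 on M0(3) or M1(5); op1 on M1(2)
    [{0: 4}, {0: 2, 1: 3}],  # job 1: op0 on M0(4); op1 on M0(2) or M1(3)
]
env = FJSPEnv(jobs, n_machines=2)
print(makespan_with_rule(env, spt_rule))  # → 9.0
```

A learned policy (e.g. trained with PPO) would replace `spt_rule`, mapping the state to a feasible (job, machine) action; the abstract's finding is that even simple PDRs like SPT improve when the environment is modeled this way.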
Related papers
- Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems [45.62643537023675]
This study introduces a novel RL-based scheduler utilizing the Decentralized Distributed Proximal Policy Optimization (DD-PPO) algorithm.
The DD-PPO algorithm supports large-scale distributed training across multiple workers without requiring parameter synchronization at every step.
The validation dataset leveraged over 11.5 million real HPC job traces to compare DD-PPO performance against traditional and advanced scheduling approaches.
arXiv Detail & Related papers (2025-05-06T19:50:37Z)
- Learning-enabled Flexible Job-shop Scheduling for Scalable Smart Manufacturing [11.509669981978874]
In smart manufacturing systems, flexible job-shop scheduling with transportation constraints (FJSPT) is essential for maximizing productivity.
Recent deep reinforcement learning (DRL)-based methods for FJSPT have encountered a scale-generalization challenge.
We introduce a novel graph-based DRL method named the Heterogeneous Graph Scheduler (HGS).
arXiv Detail & Related papers (2024-02-14T06:49:23Z)
- Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions [2.4541568670428915]
Machine scheduling aims to optimize job assignments to machines while adhering to manufacturing rules and job specifications.
Deep Reinforcement Learning (DRL), a key component of artificial general intelligence, has shown promise in various domains like gaming and robotics.
This paper offers a comprehensive review and comparison of DRL-based approaches, highlighting their methodology, applications, advantages, and limitations.
arXiv Detail & Related papers (2023-10-04T22:45:09Z)
- Accelerating Exact Combinatorial Optimization via RL-based Initialization -- A Case Study in Scheduling [1.3053649021965603]
This research aims to develop an innovative approach that employs machine learning (ML) for addressing optimization problems.
We introduce a novel two-phase RL-to-ILP scheduling framework with three steps: 1) solver as coarse-grain scheduler, 2) solution relaxation, and 3) exact solving via ILP.
Our framework matches the scheduling performance of exact scheduling methods while achieving up to 128× speedups.
arXiv Detail & Related papers (2023-08-19T15:52:43Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- Flexible Job Shop Scheduling via Dual Attention Network Based Reinforcement Learning [73.19312285906891]
In the flexible job-shop scheduling problem (FJSP), operations can be processed on multiple machines, leading to intricate relationships between operations and machines.
Recent works have employed deep reinforcement learning (DRL) to learn priority dispatching rules (PDRs) for solving FJSP.
This paper presents a novel end-to-end learning framework that weds the merits of self-attention models for deep feature extraction and DRL for scalable decision-making.
arXiv Detail & Related papers (2023-05-09T01:35:48Z)
- Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL).
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
arXiv Detail & Related papers (2022-02-14T01:31:46Z)
- Towards Standardizing Reinforcement Learning Approaches for Stochastic Production Scheduling [77.34726150561087]
Reinforcement learning can be used to solve scheduling problems.
Existing studies rely on (sometimes) complex simulations for which the code is unavailable.
There is a vast array of RL designs to choose from.
Standardization of model descriptions - both production setup and RL design - and of validation schemes is a prerequisite.
arXiv Detail & Related papers (2021-04-16T16:07:10Z)
- A Reinforcement Learning Environment For Job-Shop Scheduling [2.036811219647753]
This paper presents an efficient Deep Reinforcement Learning environment for Job-Shop Scheduling.
We design a meaningful and compact state representation as well as a novel, simple dense reward function.
We demonstrate that our approach significantly outperforms existing DRL methods on classic benchmark instances.
arXiv Detail & Related papers (2021-04-08T13:26:30Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.