Adaptive Bias Generalized Rollout Policy Adaptation on the Flexible Job-Shop Scheduling Problem
- URL: http://arxiv.org/abs/2505.08451v2
- Date: Tue, 20 May 2025 09:34:45 GMT
- Title: Adaptive Bias Generalized Rollout Policy Adaptation on the Flexible Job-Shop Scheduling Problem
- Authors: Lotfi Kobrosly, Marc-Emmanuel Coupvent des Graviers, Christophe Guettier, Tristan Cazenave
- Abstract summary: The Flexible Job-Shop Scheduling Problem (FJSSP) is an NP-hard optimization problem. We propose a novel algorithm derived from the Generalized Nested Rollout Policy Adaptation, developed to solve the FJSSP.
- Score: 3.6266514127975906
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The Flexible Job-Shop Scheduling Problem (FJSSP) is an NP-hard combinatorial optimization problem, with several application domains, especially for manufacturing purposes. The objective is to efficiently schedule multiple operations on dissimilar machines. These operations are gathered into jobs, and operations pertaining to the same job need to be scheduled sequentially. Different methods have been previously tested to solve this problem, such as Constraint Solving, Tabu Search, Genetic Algorithms, or Monte Carlo Tree Search (MCTS). We propose a novel algorithm derived from the Generalized Nested Rollout Policy Adaptation, developed to solve the FJSSP. We report encouraging experimental results, as our algorithm performs better than other MCTS-based approaches, even if makespans obtained on large instances are still far from known upper bounds.
Related papers
- PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving [89.60370366013142]
We propose PlanGEN, a model-agnostic and easily scalable agent framework with three key components: constraint, verification, and selection agents. Specifically, our approach proposes constraint-guided iterative verification to enhance performance of inference-time algorithms.
arXiv Detail & Related papers (2025-02-22T06:21:56Z)
- A Benchmarking Environment for Worker Flexibility in Flexible Job Shop Scheduling Problems [0.0]
In Production Scheduling, the Flexible Job Shop Scheduling Problem (FJSSP) aims to optimize a sequence of operations and assign each to an eligible machine with varying processing times. The resulting problem is called Flexible Job Shop Scheduling Problem with Worker Flexibility (FJSSP-W). This paper presents a collection of 402 commonly accepted FJSSP instances and proposes an approach to extend these with worker flexibility.
arXiv Detail & Related papers (2025-01-27T15:56:12Z)
- JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for reinforcement learning (RL).
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
- Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets [86.43523688236077]
Combinatorial optimization (CO) problems are often NP-hard and out of reach for exact algorithms.
GFlowNets have emerged as a powerful machinery to efficiently sample from composite unnormalized densities sequentially.
In this paper, we design Markov decision processes (MDPs) for different problems and propose to train conditional GFlowNets to sample from the solution space.
arXiv Detail & Related papers (2023-05-26T15:13:09Z)
- A Reinforcement Learning Approach for Scheduling Problems With Improved Generalization Through Order Swapping [0.0]
JSSP falls into the category of NP-hard COP, in which solving the problem through exhaustive search becomes unfeasible.
In recent years, the research towards using DRL to solve COP has gained interest and has shown promising results in terms of solution quality and computational efficiency.
In particular, we employ the PPO algorithm that adopts the policy-gradient paradigm that is found to perform well in the constrained dispatching of jobs.
arXiv Detail & Related papers (2023-02-27T16:45:04Z)
- Multi-Task Off-Policy Learning from Bandit Feedback [54.96011624223482]
We propose a hierarchical off-policy optimization algorithm (HierOPO), which estimates the parameters of the hierarchical model and then acts pessimistically with respect to them.
We prove per-task bounds on the suboptimality of the learned policies, which show a clear improvement over not using the hierarchical model.
Our theoretical and empirical results show a clear advantage of using the hierarchy over solving each task independently.
arXiv Detail & Related papers (2022-12-09T08:26:27Z)
- Decomposition Strategies and Multi-shot ASP Solving for Job-shop Scheduling [7.977161233209228]
The Job-shop Scheduling Problem (JSP) is a well-known and challenging optimization problem in which tasks sharing a machine are to be arranged in a sequence such that encompassing jobs can be completed as early as possible. In this paper, we investigate problem decomposition into time windows whose operations can be successively scheduled and optimized by means of multi-shot Answer Set Programming (ASP) solving.
arXiv Detail & Related papers (2022-05-16T09:33:00Z)
- An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents [1.3812010983144802]
We propose a deep reinforcement learning methodology for the job shop scheduling problem (JSSP).
The aim is to build up a greedy-like heuristic able to learn on some distribution of JSSP instances that differ in the number of jobs and machines.
As expected, the model can generalize, to some extent, to larger problems or instances originated by a different distribution from the one used in training.
arXiv Detail & Related papers (2021-10-18T07:55:39Z)
- Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- A global-local neighborhood search algorithm and tabu search for flexible job shop scheduling problem [3.946442574906068]
This work presents a new meta-heuristic algorithm called GLNSA (Global-local neighborhood search algorithm).
The proposed algorithm is complemented with a tabu search that implements a simplified version of the Nopt1 neighborhood.
Experiments carried out show a satisfactory performance of the proposed algorithm, compared with other results published in recent algorithms.
arXiv Detail & Related papers (2020-10-23T23:08:51Z)
- Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.