Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling
- URL: http://arxiv.org/abs/2603.02846v1
- Date: Tue, 03 Mar 2026 10:43:01 GMT
- Title: Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling
- Authors: Jiaqi Wang, Zhiguang Cao, Peng Zhao, Rui Cao, Yubin Xiao, Yuan Jiang, You Zhou,
- Abstract summary: The flexible job-shop scheduling problem (FJSP) has attracted significant attention due to its complex and strong alignment with real-world production scenarios.<n>Current deep reinforcement learning (DRL)-based approaches to FJSP predominantly employ constructive methods.<n>This paper proposes a Memory-enhanced Improvement Search framework with heterogeneous graph representation--MIStar.
- Score: 39.98859285173431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rise of smart manufacturing under Industry 4.0 introduces mass customization and dynamic production, demanding more advanced and flexible scheduling techniques. The flexible job-shop scheduling problem (FJSP) has attracted significant attention due to its complex constraints and strong alignment with real-world production scenarios. Current deep reinforcement learning (DRL)-based approaches to FJSP predominantly employ constructive methods. While effective, they often fall short of reaching (near-)optimal solutions. In contrast, improvement-based methods iteratively explore the neighborhood of initial solutions and are more effective in approaching optimality. However, the flexible machine allocation in FJSP poses significant challenges to the application of this framework, including accurate state representation, effective policy learning, and efficient search strategies. To address these challenges, this paper proposes a Memory-enhanced Improvement Search framework with heterogeneous graph representation--MIStar. It employs a novel heterogeneous disjunctive graph that explicitly models the operation sequences on machines to accurately represent scheduling solutions. Moreover, a memoryenhanced heterogeneous graph neural network (MHGNN) is designed for feature extraction, leveraging historical trajectories to enhance the decision-making capability of the policy network. Finally, a parallel greedy search strategy is adopted to explore the solution space, enabling superior solutions with fewer iterations. Extensive experiments on synthetic data and public benchmarks demonstrate that MIStar significantly outperforms both traditional handcrafted improvement heuristics and state-of-the-art DRL-based constructive methods.
Related papers
- Hierarchical Optimization via LLM-Guided Objective Evolution for Mobility-on-Demand Systems [9.979671028876464]
We propose a novel framework that integrates large language model (LLM) with mathematical optimization in a dynamic hierarchical system.<n>Within this framework, LLM serves as a meta-optimizer, producing semantics that guide a low-level responsible for constraint enforcement and real-time decision execution.<n>Experiments based on both the New York and Chicago taxi datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2025-10-12T14:56:19Z) - SAGE: Strategy-Adaptive Generation Engine for Query Rewriting [8.941793732446856]
We introduce the Strategy-Adaptive Generation Engine (SAGE), which operationalizes expert-crafted strategies in an reinforcement learning framework.<n>SAGE achieves new state-of-the-art NDCG@10 results, but also uncovers a compelling emergent behavior.<n>Our findings demonstrate that strategy-guided RL, enhanced with nuanced reward shaping, offers a scalable, efficient, and more interpretable paradigm for developing the next generation of robust information retrieval systems.
arXiv Detail & Related papers (2025-06-24T16:50:51Z) - Preference Optimization for Combinatorial Optimization Problems [54.87466279363487]
Reinforcement Learning (RL) has emerged as a powerful tool for neural optimization, enabling models learns that solve complex problems without requiring expert knowledge.<n>Despite significant progress, existing RL approaches face challenges such as diminishing reward signals and inefficient exploration in vast action spaces.<n>We propose Preference Optimization, a novel method that transforms quantitative reward signals into qualitative preference signals via statistical comparison modeling.
arXiv Detail & Related papers (2025-05-13T16:47:00Z) - Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval [49.669503570350166]
Generative information retrieval (GenIR) is a promising neural retrieval paradigm that formulates document retrieval as a document identifier (docid) generation task.<n>Existing GenIR models suffer from token-level misalignment, where models trained to predict the next token often fail to capture document-level relevance effectively.<n>We propose direct document relevance optimization (DDRO), which aligns token-level docid generation with document-level relevance estimation through direct optimization via pairwise ranking.
arXiv Detail & Related papers (2025-04-07T15:27:37Z) - A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.<n> deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.<n>This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z) - Offline reinforcement learning for job-shop scheduling problems [1.3927943269211593]
This paper introduces a novel offline RL method designed for optimization problems with complex constraints.<n>Our approach encodes actions in edge attributes and balances expected rewards with the imitation of expert solutions.<n>We demonstrate the effectiveness of this method on job-shop scheduling and flexible job-shop scheduling benchmarks.
arXiv Detail & Related papers (2024-10-21T07:33:42Z) - Memory-Enhanced Neural Solvers for Routing Problems [8.255381359612885]
We present MEMENTO, an approach that leverages memory to improve the search of neural solvers at inference.<n>We validate its effectiveness on the Traveling Salesman and Capacitated Vehicle Routing problems, demonstrating its superiority over tree-search and policy-gradient fine-tuning.<n>We successfully train all RL auto-regressive solvers on large instances, and verify MEMENTO's scalability and data-efficiency.
arXiv Detail & Related papers (2024-06-24T08:18:19Z) - Learning-enabled Flexible Job-shop Scheduling for Scalable Smart
Manufacturing [11.509669981978874]
In smart manufacturing systems, flexible job-shop scheduling with transportation constraints is essential to optimize solutions for maximizing productivity.
Recent developments in deep reinforcement learning (DRL)-based methods for FJSPT have encountered a scale generalization challenge.
We introduce a novel graph-based DRL method, named the Heterogeneous Graph Scheduler (HGS)
arXiv Detail & Related papers (2024-02-14T06:49:23Z) - Machine Learning Insides OptVerse AI Solver: Design Principles and
Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z) - Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop
Scheduling [30.45126420996238]
This paper proposes a novel DRL-guided improvement for solving JSSP, where graph representation is employed to encode complete solutions.
We design a Graph Neural-Network-based representation scheme, consisting of two modules to effectively capture the information of dynamic topology and different types of nodes in graphs encountered during the improvement process.
We prove that our method scales linearly with problem size. Experiments on classic benchmarks show that the improvement policy learned by our method outperforms state-of-the-art DRL-based methods by a large margin.
arXiv Detail & Related papers (2022-11-20T10:20:13Z) - Revisiting GANs by Best-Response Constraint: Perspective, Methodology,
and Application [49.66088514485446]
Best-Response Constraint (BRC) is a general learning framework to explicitly formulate the potential dependency of the generator on the discriminator.
We show that even with different motivations and formulations, a variety of existing GANs ALL can be uniformly improved by our flexible BRC methodology.
arXiv Detail & Related papers (2022-05-20T12:42:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.