Related papers: DScheLLM: Enabling Dynamic Scheduling through a Fine-Tuned Dual-System Large language Model

DScheLLM: Enabling Dynamic Scheduling through a Fine-Tuned Dual-System Large language Model

URL: http://arxiv.org/abs/2601.09100v2
Date: Thu, 15 Jan 2026 02:37:44 GMT
Title: DScheLLM: Enabling Dynamic Scheduling through a Fine-Tuned Dual-System Large language Model
Authors: Lixiang Zhang, Chenggong Zhao, Qing Gao, Xiaoke Zhao, Gengyi Bai, Jinhu Lv,
Abstract summary: This paper proposes DScheLLM, a dynamic scheduling approach that leverages fine-tuned large language models within a dual-system (fast-slow) reasoning architecture.<n>A unified large language model-based framework is constructed to handle dynamic events, where training datasets for both fast and slow reasoning modes are generated.<n> Experimental evaluations on standard job shop scheduling benchmarks demonstrate that the fast-thinking mode can efficiently generate high-quality schedules.
Score: 2.9367859148626945
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Production scheduling is highly susceptible to dynamic disruptions, such as variations in processing times, machine availability, and unexpected task insertions. Conventional approaches typically rely on event-specific models and explicit analytical formulations, which limits their adaptability and generalization across previously unseen disturbances. To overcome these limitations, this paper proposes DScheLLM, a dynamic scheduling approach that leverages fine-tuned large language models within a dual-system (fast-slow) reasoning architecture to address disturbances of different scales. A unified large language model-based framework is constructed to handle dynamic events, where training datasets for both fast and slow reasoning modes are generated using exact schedules obtained from an operations research solver. The Huawei OpenPangu Embedded-7B model is subsequently fine-tuned under the hybrid reasoning paradigms using LoRA. Experimental evaluations on standard job shop scheduling benchmarks demonstrate that the fast-thinking mode can efficiently generate high-quality schedules and the slow-thinking mode can produce solver-compatible and well-formatted decision inputs. To the best of our knowledge, this work represents one of the earliest studies applying large language models to job shop scheduling in dynamic environments, highlighting their considerable potential for intelligent and adaptive scheduling optimization.

Related papers

Multimodal Large Language Models with Adaptive Preference Optimization for Sequential Recommendation [60.33386541343322]
We propose a Multimodal Large Language Models framework that integrates Hardness-aware and Noise-regularized preference optimization for Recommendation (HaNoRec)<n>Specifically, HaNoRec dynamically adjusts optimization weights based on both the estimated hardness of each training sample and the policy model's real-time responsiveness.
arXiv Detail & Related papers (2025-11-24T04:10:46Z)
TableMind: An Autonomous Programmatic Agent for Tool-Augmented Table Reasoning [10.267950603662776]
TableMind is a tool-integrated table reasoning agent that autonomously performs multi-turn tool invocation, writes and executes code in a secure sandbox environment for data analysis and precise numerical reasoning.<n>To realize these capabilities, we adopt a two-stage fine-tuning paradigm built on top of a powerful pre-trained language model.
arXiv Detail & Related papers (2025-09-08T02:00:31Z)
Large Language Model-Based Automatic Formulation for Stochastic Optimization Models [0.0]
This paper presents the first integrated systematic study on the performance of large language models (LLMs)<n>We design several prompts that guide ChatGPT through structured tasks using chain-of- thought and modular reasoning.<n>Across a diverse set of problems, GPT-4-Turbo outperforms other models in partial score, variable matching, objective accuracy, with cot_s_instructions emerging as the most effective prompting strategies.
arXiv Detail & Related papers (2025-08-24T03:31:25Z)
KAT-V1: Kwai-AutoThink Technical Report [50.84483585850113]
We present Kwaipilot-AutoThink (KAT), an open-source 40B large language model developed to address the overthinking problem in reasoning-intensive tasks.<n>KAT dynamically switches between reasoning and non-reasoning modes based on task complexity.<n>We also propose Step-SRPO, a reinforcement learning algorithm that incorporates intermediate supervision into the GRPO framework.
arXiv Detail & Related papers (2025-07-11T04:07:10Z)
Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling [6.375075345747834]
Large Language Model (LLM)-based scheduler using ReAct-style framework (Reason + Act)<n>System incorporates a scratchpad memory to track scheduling history and refine decisions via natural language feedback.<n>We evaluate our approach using OpenAI's O4-Mini and Anthropic's Claude 3.7 across seven real-world HPC workload scenarios.
arXiv Detail & Related papers (2025-05-29T14:25:29Z)
Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition [95.54406667705999]
Pangu Embedded is an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs)<n>It addresses the significant computational costs and inference latency challenges prevalent in existing reasoning-optimized LLMs.<n>It delivers rapid responses and state-of-the-art reasoning quality within a single, unified model architecture.
arXiv Detail & Related papers (2025-05-28T14:03:02Z)
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models [48.15777554876988]
Traditional alignment methods often require retraining large pretrained models.<n>We propose a novel textitResidual Alignment Model (textitRAM) that formalizes the alignment process as a type of importance sampling.<n>We develop a resampling algorithm with iterative token-level decoding to address the common first-token latency issue in comparable methods.
arXiv Detail & Related papers (2025-05-26T08:53:02Z)
Dipper: Diversity in Prompts for Producing Large Language Model Ensembles in Reasoning tasks [77.40114523163892]
DIPPER is a training-free framework that transforms a single Large Language Models (LLMs) into an effective inference-time ensemble.<n>By feeding the model an optimized and diverse set of prompts in parallel, DIPPER elicits varied reasoning paths, leading to performance gains.<n>We empirically demonstrate significant improvements on reasoning benchmarks, such as MATH, where a DIPPER ensemble of three Qwen2-MATH-1.5B instances outperforms a larger 7B model.
arXiv Detail & Related papers (2024-12-12T17:49:05Z)
Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows textcode-form plans -- pseudocode that outlines high-level, structured reasoning processes. CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks. It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z)
Automated Conversion of Static to Dynamic Scheduler via Natural Language [3.4748713192043876]
We propose a Retrieval-Augmented Generation (RAG) based LLM model to automate the process of implementing constraints for Dynamic Scheduling (RAGDyS) Our framework aims to minimize technical complexities related to mathematical modelling and computational workload for end-users.
arXiv Detail & Related papers (2024-05-08T04:07:38Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Using Differentiable Programming for Flexible Statistical Modeling [0.0]
Differentiable programming has recently received much interest as a paradigm that facilitates taking gradients of computer programs. We show how differentiable programming can enable simple gradient-based optimization of a model by automatic differentiation. This allowed us to quickly prototype a model under time pressure that outperforms simpler benchmark models.
arXiv Detail & Related papers (2020-12-07T12:33:49Z)
Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces. It uses latent variables to model generalizable learning patterns. At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.