Improving After-sales Service: Deep Reinforcement Learning for Dynamic Time Slot Assignment with Commitments and Customer Preferences
- URL: http://arxiv.org/abs/2509.17870v1
- Date: Mon, 22 Sep 2025 15:09:39 GMT
- Title: Improving After-sales Service: Deep Reinforcement Learning for Dynamic Time Slot Assignment with Commitments and Customer Preferences
- Authors: Xiao Mao, Albert H. Schrotenboer, Guohua Wu, Willem van Jaarsveld
- Abstract summary: High-tech maintenance is a strategic component in after-sales services, involving close coordination between customers and service engineers. We study this hierarchical and sequential decision-making problem, the Dynamic Time Slot Assignment Problem with Commitments and Customer Preferences (DTSAP-CCP). Two distinct approaches are proposed: 1) an attention-based deep reinforcement learning approach with rollout execution (ADRL-RE) and 2) a scenario-based planning approach (SBP).
- Score: 5.093728036666825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Problem definition: For original equipment manufacturers (OEMs), high-tech maintenance is a strategic component in after-sales services, involving close coordination between customers and service engineers. Each customer suggests several time slots for their maintenance task, from which the OEM must select one. This decision needs to be made promptly to support customers' planning. At the end of each day, routes for service engineers are planned to fulfill the tasks scheduled for the following day. In this paper, we study this hierarchical and sequential decision-making problem, the Dynamic Time Slot Assignment Problem with Commitments and Customer Preferences (DTSAP-CCP). Methodology/results: Two distinct approaches are proposed: 1) an attention-based deep reinforcement learning approach with rollout execution (ADRL-RE) and 2) a scenario-based planning approach (SBP). The ADRL-RE combines a well-trained attention-based neural network with a rollout framework for online trajectory simulation. To support the training, we develop a neural heuristic solver that provides rapid route planning solutions, enabling efficient learning in complex combinatorial settings. The SBP approach samples several scenarios to guide the time slot assignment. Numerical experiments demonstrate the superiority of ADRL-RE and the stability of SBP compared to both rule-based and rollout-based approaches. Furthermore, the strong practicality of ADRL-RE is verified in a case study of after-sales service for large medical equipment. Implications: This study provides OEMs with practical decision-support tools for dynamic maintenance scheduling, balancing customer preferences and operational efficiency. In particular, our ADRL-RE shows strong real-world potential, supporting timely and customer-aligned maintenance scheduling.
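The abstract says that the SBP approach samples several scenarios to guide the time slot assignment, but gives no further detail. The sketch below is one possible reading of that idea, not the authors' method: for each slot a customer proposes, it samples hypothetical future tasks, estimates the extra routing distance of adding the new task with a nearest-neighbor tour, and commits to the slot with the lowest average. The depot location, the uniform demand model, the nearest-neighbor routing heuristic, and all function and parameter names are assumptions made for illustration.

```python
import math
import random

DEPOT = (0.0, 0.0)  # hypothetical depot location (assumption)


def route_length(points):
    """Length of a nearest-neighbor tour: depot -> all points -> depot."""
    remaining = list(points)
    current, total = DEPOT, 0.0
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        total += math.dist(current, nxt)
        remaining.remove(nxt)
        current = nxt
    return total + math.dist(current, DEPOT)


def sample_future_tasks(rng, max_tasks):
    """Assumed demand model: 0..max_tasks uniform locations in a 10x10 square."""
    count = rng.randint(0, max_tasks)
    return [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(count)]


def assign_slot(committed, location, candidate_slots,
                n_scenarios=20, max_future_tasks=3, seed=0):
    """Commit to the candidate slot with the lowest average marginal route length."""
    best_slot, best_cost = None, math.inf
    for slot in candidate_slots:
        # Re-seeding gives every candidate the same sampled scenarios.
        rng = random.Random(seed)
        total = 0.0
        for _ in range(n_scenarios):
            base = committed.get(slot, []) + sample_future_tasks(rng, max_future_tasks)
            total += route_length(base + [location]) - route_length(base)
        average = total / n_scenarios
        if average < best_cost:
            best_slot, best_cost = slot, average
    # The choice is a commitment: it is recorded and not revisited later.
    committed.setdefault(best_slot, []).append(location)
    return best_slot


if __name__ == "__main__":
    committed = {"mon": [(9.0, 9.0)], "tue": [(1.0, 1.0)]}
    print(assign_slot(committed, (9.0, 8.0), ["mon", "tue"]))
```

Re-seeding the generator for each candidate slot is a deliberate choice here: every slot is scored against the same sampled futures, so differences in the averages come from the slot itself rather than from sampling noise.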
Related papers
- AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering [52.67783579040657]
AceGRPO is a machine learning system that prioritizes tasks at the agent's learning frontier to maximize learning efficiency. Our trained Ace-30B model achieves a 100% valid submission rate on MLE-Bench-Lite, approaches the performance of proprietary frontier models, and outperforms larger open-source baselines.
arXiv Detail & Related papers (2026-02-08T10:55:03Z) - Optimizing Fairness in Production Planning: A Human-Centric Approach to Machine and Workforce Allocation [55.71151342699622]
The proposed system is validated through 16 test sessions with domain experts from the automotive industry. Results indicate that the CP-based scheduling approach produces compact, feasible production plans with low tardiness.
arXiv Detail & Related papers (2025-10-01T16:41:18Z) - Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents [35.79575378215309]
Training large language models (LLMs) to reason via reinforcement learning (RL) significantly improves their problem-solving capabilities. We introduce a conceptual framework formalizing dynamic planning for LLM agents, enabling them to flexibly decide when to allocate test-time compute for planning. Experiments on the Crafter environment show that dynamic planning agents trained with this approach are more sample-efficient and consistently achieve more complex objectives.
arXiv Detail & Related papers (2025-09-03T18:00:13Z) - A Production Scheduling Framework for Reinforcement Learning Under Real-World Constraints [0.0]
Real-world production environments introduce additional complexities that cause traditional scheduling approaches to be less effective. Reinforcement learning (RL) holds potential in addressing these challenges, as it allows agents to learn adaptive scheduling strategies. We propose a modular framework that extends classical JSSP formulations by incorporating key real-world constraints. JobShopLab is an open-source tool for both research and industrial applications.
arXiv Detail & Related papers (2025-06-16T14:50:26Z) - Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning [12.504232513881828]
We propose a novel dependency-aware task scheduling strategy for dynamic unmanned aerial vehicle-assisted connected autonomous vehicles (CAVs). We formulate a joint scheduling priority and subtask assignment optimization problem with the objective of minimizing the average task completion time. We propose a diffusion-based reinforcement learning algorithm, named Synthetic DDQN based Subtasks Scheduling, which can make adaptive task scheduling decisions in real time.
arXiv Detail & Related papers (2024-11-27T11:07:31Z) - Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems. In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy. Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
arXiv Detail & Related papers (2024-10-03T04:07:51Z) - Optimizing Job Shop Scheduling in the Furniture Industry: A Reinforcement Learning Approach Considering Machine Setup, Batch Variability, and Intralogistics [0.0]
This paper explores the potential application of Deep Reinforcement Learning in the furniture industry.
A concept for a model is proposed that provides a higher level of information detail to enhance scheduling accuracy and efficiency.
The model extends traditional approaches to JSSPs by including job volumes, buffer management, transportation times, and machine setup times.
arXiv Detail & Related papers (2024-09-18T09:12:40Z) - Accelerate Presolve in Large-Scale Linear Programming via Reinforcement
Learning [92.31528918811007]
We propose a simple and efficient reinforcement learning framework -- namely, reinforcement learning for presolve (RL4Presolve) -- to tackle (P1)-(P3) simultaneously.
Experiments on two solvers and eight benchmarks (real-world and synthetic) demonstrate that RL4Presolve significantly and consistently improves the efficiency of solving large-scale LPs.
arXiv Detail & Related papers (2023-10-18T09:51:59Z) - Flexible Job Shop Scheduling via Dual Attention Network Based
Reinforcement Learning [73.19312285906891]
In the flexible job shop scheduling problem (FJSP), operations can be processed on multiple machines, leading to intricate relationships between operations and machines.
Recent works have employed deep reinforcement learning (DRL) to learn priority dispatching rules (PDRs) for solving FJSP.
This paper presents a novel end-to-end learning framework that weds the merits of self-attention models for deep feature extraction and DRL for scalable decision-making.
arXiv Detail & Related papers (2023-05-09T01:35:48Z) - A Memetic Algorithm with Reinforcement Learning for Sociotechnical
Production Scheduling [0.0]
This article presents a memetic algorithm that applies deep reinforcement learning (DRL) to flexible job shop scheduling problems (DRC-FJSSP).
From research projects in industry, we recognize the need to consider flexible machines, flexible human workers, worker capabilities, setup and processing operations, material arrival times, complex job paths with parallel tasks for bill of material manufacturing, sequence-dependent setup times and (partially) automated tasks in human-machine-collaboration.
arXiv Detail & Related papers (2022-12-21T11:24:32Z) - DL-DRL: A double-level deep reinforcement learning approach for
large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide and conquer framework (DCF).
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention-based policy network in our lower-level DRL model to construct the route for each UAV, with the objective of maximizing the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z) - Discovery of Options via Meta-Learned Subgoals [59.2160583043938]
Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster.
We introduce a novel meta-gradient approach for discovering useful options in multi-task RL environments.
arXiv Detail & Related papers (2021-02-12T19:50:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.