Minimizing the Weighted Number of Tardy Jobs: Data-Driven Heuristic for Single-Machine Scheduling
- URL: http://arxiv.org/abs/2508.13703v2
- Date: Tue, 07 Oct 2025 11:41:19 GMT
- Title: Minimizing the Weighted Number of Tardy Jobs: Data-Driven Heuristic for Single-Machine Scheduling
- Authors: Nikolai Antonov, Prěmysl Šůcha, Mikoláš Janota, Jan Hůla,
- Abstract summary: We focus on a single-machine scheduling problem where each job is defined by its weight, duration, due date, and deadline.<n>We introduce a novel data-driven scheduling that combines machine learning with problem-specific characteristics.<n> Experimental results demonstrate that our approach significantly outperforms the state-of-the-art in terms of optimality gap, number of optimal solutions, and adaptability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing research on single-machine scheduling is largely focused on exact algorithms, which perform well on typical instances but can significantly deteriorate on certain regions of the problem space. In contrast, data-driven approaches provide strong and scalable performance when tailored to the structure of specific datasets. Leveraging this idea, we focus on a single-machine scheduling problem where each job is defined by its weight, duration, due date, and deadline, aiming to minimize the total weight of tardy jobs. We introduce a novel data-driven scheduling heuristic that combines machine learning with problem-specific characteristics, ensuring feasible solutions, which is a common challenge for ML-based algorithms. Experimental results demonstrate that our approach significantly outperforms the state-of-the-art in terms of optimality gap, number of optimal solutions, and adaptability across varied data scenarios, highlighting its flexibility for practical applications. In addition, we conduct a systematic exploration of ML models, addressing a common gap in similar studies by offering a detailed model selection process and providing insights into why the chosen model is the best fit.
Related papers
- Learning-Augmented Moment Estimation on Time-Decay Models [55.06256430461023]
We use an oracle for the heavy-hitters of datasets to give learning-augmented algorithms for a number of fundamental problems.<n>We complement our theoretical results with a number of empirical evaluations that demonstrate the practical efficiency of our algorithms on real and synthetic datasets.
arXiv Detail & Related papers (2026-03-03T00:42:34Z) - Quantum-classical hybrid algorithm using quantum annealing for multi-objective job shop scheduling [0.5872014229110214]
Efficient production planning is essential in modern manufacturing to improve performance indicators such as lead time and to reduce reliance on human intuition.<n>We develop a quantum-classical hybrid algorithm that decomposes the problem into two subproblems: resource allocation and task scheduling.<n> Experimental results demonstrate that our hybrid approach achieves superior solution quality and computational efficiency compared to traditional monolithic methods.
arXiv Detail & Related papers (2025-11-05T07:39:09Z) - Forecasting Outside the Box: Application-Driven Optimal Pointwise Forecasts for Stochastic Optimization [0.0]
We present an integrated learning and optimization procedure that yields the best approximation of an unknown situation.
Numerical results conducted with inventory problems from the literature as well as a bike-sharing problem with real data demonstrate that the proposed approach performs well.
arXiv Detail & Related papers (2024-11-05T21:54:50Z) - A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, imposing a negative impact on training efficiency and model performance.<n>Data selection has shown promise in identifying the most representative samples from the entire dataset.<n>We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z) - Align Your Steps: Optimizing Sampling Schedules in Diffusion Models [63.927438959502226]
Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond.
A crucial drawback of DMs is their slow sampling speed, relying on many sequential function evaluations through large neural networks.
We propose a general and principled approach to optimizing the sampling schedules of DMs for high-quality outputs.
arXiv Detail & Related papers (2024-04-22T18:18:41Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - Towards Accelerated Model Training via Bayesian Data Selection [45.62338106716745]
We propose a more reasonable data selection principle by examining the data's impact on the model's generalization loss.
Recent work has proposed a more reasonable data selection principle by examining the data's impact on the model's generalization loss.
This work solves these problems by leveraging a lightweight Bayesian treatment and incorporating off-the-shelf zero-shot predictors built on large-scale pre-trained models.
arXiv Detail & Related papers (2023-08-21T07:58:15Z) - Minimizing Entropy to Discover Good Solutions to Recurrent Mixed Integer
Programs [0.0]
Current solvers for mixed-integer programming (MIP) problems are designed to perform well on a wide range of problems.
Recent works have shown that machine learning (ML) can be integrated with an MIP solver to inject domain knowledge and efficiently close the optimality gap.
This paper proposes an online solver that uses the notion of entropy to efficiently build a model with minimal training data and tuning.
arXiv Detail & Related papers (2022-02-07T18:52:56Z) - Learning to Schedule Heuristics in Branch-and-Bound [25.79025327341732]
Real-world applications typically require finding good solutions early in the search to enable fast decision-making.
We propose the first data-driven framework for schedulings in an exact MIP solver.
Compared to the default settings of a state-of-the-art academic MIP solver, we are able to reduce the average primal integral by up to 49% on a class of challenging instances.
arXiv Detail & Related papers (2021-03-18T14:49:52Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Data-driven Algorithm for Scheduling with Total Tardiness [0.6606016007748989]
We investigate the use of deep learning for solving a classical NP-Hard single machine scheduling problem.
We have designed a regressor containing a deep neural network that learns and predicts the criterion of a given set of jobs.
Our data-driven approach can efficiently generalize information from the training phase to significantly larger instances.
arXiv Detail & Related papers (2020-05-12T07:16:43Z) - Improving a State-of-the-Art Heuristic for the Minimum Latency Problem
with Data Mining [69.00394670035747]
Hybrid metaheuristics have become a trend in operations research.
A successful example combines the Greedy Randomized Adaptive Search Procedures (GRASP) and data mining techniques.
arXiv Detail & Related papers (2019-08-28T13:12:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.