Related papers: iScheduler: Reinforcement Learning-Driven Continual Optimization for Large-Scale Resource Investment Problems

URL: http://arxiv.org/abs/2602.06064v1
Date: Fri, 30 Jan 2026 11:20:58 GMT
Title: iScheduler: Reinforcement Learning-Driven Continual Optimization for Large-Scale Resource Investment Problems
Authors: Yi-Xiang Hu, Yuke Wang, Feng Wu, Zirui Huang, Shuli Zeng, Xiang-Yang Li
Abstract summary: Scheduling precedence-constrained tasks under shared renewable resources is central to modern computing platforms. We present iScheduler, a reinforcement-learning-driven iterative scheduling framework. Experiments show that iScheduler attains competitive resource costs while reducing time to feasibility by up to 43$\times$ against strong commercial baselines.
Score: 30.109981943437006
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Scheduling precedence-constrained tasks under shared renewable resources is central to modern computing platforms. The Resource Investment Problem (RIP) models this setting by minimizing the cost of provisioned renewable resources under precedence and timing constraints. Exact mixed-integer programming and constraint programming become impractically slow on large instances, and dynamic updates require schedule revisions under tight latency budgets. We present iScheduler, a reinforcement-learning-driven iterative scheduling framework that formulates RIP solving as a Markov decision process over decomposed subproblems and constructs schedules through sequential process selection. The framework accelerates optimization and supports reconfiguration by reusing unchanged process schedules and rescheduling only affected processes. We also release L-RIPLIB, an industrial-scale benchmark derived from cloud-platform workloads with 1,000 instances of 2,500-10,000 tasks. Experiments show that iScheduler attains competitive resource costs while reducing time to feasibility by up to 43$\times$ against strong commercial baselines.
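Illustration: the sketch below shows what the RIP objective described in the abstract measures, using the standard textbook form (each renewable resource is provisioned at its peak usage and charged a per-unit cost). It is not iScheduler's method. The task names, durations, demands, deadline, unit costs, and the helper names `rip_cost` and `is_feasible` are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

def rip_cost(start: Dict[str, int], duration: Dict[str, int],
             demand: Dict[str, Dict[str, int]], unit_cost: Dict[str, int]) -> int:
    """Cost of provisioning each renewable resource at its peak usage."""
    peak = {r: 0 for r in unit_cost}
    # Usage can only rise at a task start time, so checking those instants suffices.
    for t in set(start.values()):
        for r in unit_cost:
            used = sum(demand[j].get(r, 0) for j in start
                       if start[j] <= t < start[j] + duration[j])
            peak[r] = max(peak[r], used)
    return sum(unit_cost[r] * peak[r] for r in unit_cost)

def is_feasible(start: Dict[str, int], duration: Dict[str, int],
                precedence: List[Tuple[str, str]], deadline: int) -> bool:
    """Check precedence (a finishes before b starts) and the common deadline."""
    ok_prec = all(start[a] + duration[a] <= start[b] for a, b in precedence)
    ok_deadline = all(start[j] + duration[j] <= deadline for j in start)
    return ok_prec and ok_deadline

# Hypothetical toy instance: three tasks, one resource ("cpu"), A must precede C.
duration = {"A": 2, "B": 3, "C": 1}
demand = {"A": {"cpu": 2}, "B": {"cpu": 1}, "C": {"cpu": 2}}
precedence = [("A", "C")]
unit_cost = {"cpu": 10}
deadline = 6

parallel = {"A": 0, "B": 0, "C": 2}  # B overlaps A and C: peak cpu usage is 3
serial = {"A": 0, "C": 2, "B": 3}    # no overlap: peak cpu usage is 2

print(rip_cost(parallel, duration, demand, unit_cost))  # 30
print(rip_cost(serial, duration, demand, unit_cost))    # 20
```

Both schedules satisfy the precedence and deadline constraints, but the serial one needs fewer provisioned units, which is the trade-off a RIP solver searches over.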

Related papers

Bi-Level Online Provisioning and Scheduling with Switching Costs and Cross-Level Constraints [1.639795325203038]
We study a bi-level online provisioning and scheduling problem motivated by network resource allocation. We model this two-time-scale interaction using an upper-level online convex optimization problem and a lower-level constrained Markov decision process.
arXiv Detail & Related papers (2026-01-26T20:16:13Z)
LeJOT: An Intelligent Job Cost Orchestration Solution for Databricks Platform [28.16213013287002]
We introduce LeJOT, an intelligent job cost orchestration framework for Databricks jobs. LeJOT proactively predicts workload demands, dynamically allocates computing resources, and minimizes costs. We show that LeJOT achieves an average 20% reduction in cloud computing costs within a minute-level scheduling timeframe.
arXiv Detail & Related papers (2025-12-20T08:09:58Z)
FairBatching: Fairness-Aware Batch Formation for LLM Inference [2.0917668141703207]
This work identifies the root cause of this unfairness: the non-monotonic nature of Time-Between-Tokens (TBT). We propose FairBatching, a novel system that enforces fair resource allocation between prefill and decode tasks.
arXiv Detail & Related papers (2025-10-16T07:43:56Z)
CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency. We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z)
Scalable Chain of Thoughts via Elastic Reasoning [61.75753924952059]
Elastic Reasoning is a novel framework for scalable chain of thoughts. It separates reasoning into two phases, thinking and solution, with independently allocated budgets. Our approach produces more concise and efficient reasoning even in unconstrained settings.
arXiv Detail & Related papers (2025-05-08T15:01:06Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute (TTC) scaling framework that leverages increased inference-time computation instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
OmniRouter: Budget and Performance Controllable Multi-LLM Routing [31.60019342381251]
Large language models (LLMs) deliver superior performance but require substantial computational resources and operate with relatively low efficiency. We introduce OmniRouter, a controllable routing framework for multi-LLM serving. Experiments show that OmniRouter achieves up to 6.30% improvement in response accuracy while simultaneously reducing computational costs by at least 10.15%.
arXiv Detail & Related papers (2025-02-27T22:35:31Z)
Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints. Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z)
Generating Dispatching Rules for the Interrupting Swap-Allowed Blocking Job Shop Problem Using Graph Neural Network and Reinforcement Learning [21.021840570685264]
The interrupting swap-allowed blocking job shop problem (ISBJSSP) is able to model many manufacturing planning and logistics applications realistically. We introduce a dynamic disjunctive graph formulation characterized by nodes and edges subjected to continuous deletions and additions. A simulator is developed to simulate interruption, swapping, and blocking in the ISBJSSP setting.
arXiv Detail & Related papers (2023-02-05T23:35:21Z)
Actively Learning Costly Reward Functions for Reinforcement Learning [56.34005280792013]
We show that it is possible to train agents in complex real-world environments orders of magnitude faster. By enabling the application of reinforcement learning methods to new domains, we show that we can find interesting and non-trivial solutions.
arXiv Detail & Related papers (2022-11-23T19:17:20Z)
MetaNet: Automated Dynamic Selection of Scheduling Policies in Cloud Environments [13.864161788250856]
This work aims to solve the non-trivial meta problem of online dynamic selection of a scheduling policy using a surrogate model called MetaNet. Compared to state-of-the-art DNN schedulers, this allows for improvement in execution costs, energy consumption, response time and service level agreement violations by up to 11, 43, 8 and 13 percent, respectively.
arXiv Detail & Related papers (2022-05-21T16:51:51Z)
Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems. Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs. This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.