Related papers: R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling

URL: http://arxiv.org/abs/2508.15204v1
Date: Thu, 21 Aug 2025 03:35:58 GMT
Title: R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling
Authors: Raj Jain, Marc Wetter
Abstract summary: We present R-ConstraintBench, a framework that evaluates models on Resource-Constrained Project Scheduling Problems (RCPSP). We instantiate the benchmark in a data center migration setting and evaluate multiple LLMs using feasibility and error analysis. Empirically, strong models are near-ceiling on precedence-only DAGs, but feasibility performance collapses when downtime, temporal windows, and disjunctive constraints interact.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Effective scheduling under tight resource, timing, and operational constraints underpins large-scale planning across sectors such as capital projects, manufacturing, logistics, and IT fleet transitions. However, the reliability of large language models (LLMs) when reasoning under high-constraint regimes is insufficiently characterized. To address this gap, we present R-ConstraintBench, a scalable framework that evaluates models on Resource-Constrained Project Scheduling Problems (RCPSP), an NP-Complete feasibility class, while difficulty increases via linear growth in constraints. R-ConstraintBench incrementally increases non-redundant precedence constraints in Directed Acyclic Graphs (DAGs) and then introduces downtime, temporal windows, and disjunctive constraints. As an illustrative example, we instantiate the benchmark in a data center migration setting and evaluate multiple LLMs using feasibility and error analysis, identifying degradation thresholds and constraint types most associated with failure. Empirically, strong models are near-ceiling on precedence-only DAGs, but feasibility performance collapses when downtime, temporal windows, and disjunctive constraints interact, implicating constraint interaction, not graph depth, as the principal bottleneck. Performance on clean synthetic ramps also does not guarantee transfer to domain-grounded scenarios, underscoring limited generalization.
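To make the constraint families named in the abstract concrete, here is a minimal sketch of a schedule feasibility check covering precedence, temporal windows, resource downtime, disjunctive pairs, and resource capacity. It is an illustration only, not the benchmark's own verifier: the `Task` class, the `check_schedule` function, the integer time steps, and the data-center-style task names are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    """Hypothetical task record; field names are assumptions for this sketch."""
    duration: int
    resource: str
    window: Optional[Tuple[int, int]] = None  # (earliest start, latest finish)


def check_schedule(
    tasks: Dict[str, Task],
    start: Dict[str, int],
    precedence: List[Tuple[str, str]],
    capacity: Dict[str, int],
    downtime: Dict[str, List[Tuple[int, int]]],
    disjunctive: List[Tuple[str, str]],
) -> List[str]:
    """Return the violated constraints; an empty list means the schedule is feasible."""
    errors = []
    end = {name: start[name] + task.duration for name, task in tasks.items()}

    # Precedence: b may not start before a has finished.
    for a, b in precedence:
        if start[b] < end[a]:
            errors.append(f"precedence {a}->{b}")

    for name, task in tasks.items():
        # Temporal window: the task must run entirely inside [lo, hi].
        if task.window is not None:
            lo, hi = task.window
            if start[name] < lo or end[name] > hi:
                errors.append(f"window {name}")
        # Downtime: the task's resource is unavailable during [d0, d1).
        for d0, d1 in downtime.get(task.resource, []):
            if start[name] < d1 and d0 < end[name]:
                errors.append(f"downtime {name}")

    # Disjunctive: the two tasks of a pair may not overlap in time.
    for a, b in disjunctive:
        if start[a] < end[b] and start[b] < end[a]:
            errors.append(f"disjunctive {a}|{b}")

    # Capacity: count concurrent tasks per resource at every time step.
    for t in range(max(end.values(), default=0)):
        usage: Dict[str, int] = {}
        for name, task in tasks.items():
            if start[name] <= t < end[name]:
                usage[task.resource] = usage.get(task.resource, 0) + 1
        for resource, used in usage.items():
            if used > capacity.get(resource, 0):
                errors.append(f"capacity {resource}@{t}")

    return errors


# Hypothetical three-task migration instance.
tasks = {
    "backup": Task(duration=2, resource="crew"),
    "move": Task(duration=3, resource="crew", window=(0, 6)),
    "verify": Task(duration=1, resource="net"),
}
precedence = [("backup", "move"), ("move", "verify")]
capacity = {"crew": 1, "net": 1}
downtime = {"net": [(5, 6)]}  # network unavailable during step 5
disjunctive = [("backup", "verify")]

feasible = {"backup": 0, "move": 2, "verify": 6}
infeasible = {"backup": 0, "move": 2, "verify": 5}  # verify lands in the downtime

feasible_errors = check_schedule(tasks, feasible, precedence, capacity, downtime, disjunctive)
infeasible_errors = check_schedule(tasks, infeasible, precedence, capacity, downtime, disjunctive)
```

In this toy instance the second schedule satisfies precedence on its own, yet fails once the downtime interval is taken into account, which is the kind of constraint interaction the abstract identifies as the main source of failures.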

Related papers

Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning [52.03884701766989]
Offline reinforcement learning (RL) algorithms typically impose constraints on action selection. We propose a new neighborhood constraint that restricts action selection in the Bellman target to the union of neighborhoods of dataset actions. We develop a simple yet effective algorithm, Adaptive Neighborhood-constrained Q learning (ANQ), to perform Q learning with target actions satisfying this constraint.
arXiv Detail & Related papers (2025-11-04T13:42:05Z)
Efficient Edge Test-Time Adaptation via Latent Feature Coordinate Correction [43.48832321879385]
We propose a novel test-time adaptation (TTA) method tailored for edge devices (TED). TED employs forward-only coordinate optimization in the principal subspace of latent using the covariance matrix adaptation evolution strategy (CMA-ES). TED achieves state-of-the-art performance while reducing computational complexity by up to 63 times, offering a practical and scalable solution for real-world edge applications.
arXiv Detail & Related papers (2025-10-13T07:08:52Z)
CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency. We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z)
Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models [7.658134651527103]
Worst-case symbolic constraints analysis requires inferring the symbolic constraints that characterise worst-case program executions. We show that even state-of-the-art large language models (LLMs) struggle when applied directly to this task. We propose WARP, an innovative neurosymbolic approach that computes worst-case constraints on smaller concrete input sizes.
arXiv Detail & Related papers (2025-06-09T19:33:30Z)
TCP: a Benchmark for Temporal Constraint-Based Planning [8.977867314314386]
Temporal reasoning and planning are essential capabilities for large language models. We introduce the Temporal Constraint-based Planning benchmark, which jointly assesses both capabilities. We evaluate state-of-the-art LLMs and find that even the strongest models struggle with TCP.
arXiv Detail & Related papers (2025-05-26T12:53:01Z)
Scalable Chain of Thoughts via Elastic Reasoning [61.75753924952059]
Elastic Reasoning is a novel framework for scalable chain of thoughts. It separates reasoning into two phases, thinking and solution, with independently allocated budgets. Our approach produces more concise and efficient reasoning even in unconstrained settings.
arXiv Detail & Related papers (2025-05-08T15:01:06Z)
Haste Makes Waste: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions [56.88110850242265]
We present Recipe2Plan, a novel benchmark framework based on real-world cooking scenarios. Unlike conventional benchmarks, Recipe2Plan challenges agents to optimize cooking time through parallel task execution.
arXiv Detail & Related papers (2025-03-04T03:27:02Z)
Semantic Integrity Constraints: Declarative Guardrails for AI-Augmented Data Processing Systems [39.23499993745249]
We introduce semantic integrity constraints (SICs) for specifying and enforcing correctness conditions over LLM outputs in semantic queries. SICs generalize traditional database integrity constraints to semantic settings, supporting common types of constraints, such as grounding, soundness, and exclusion. We present a system design for integrating SICs into query planning and runtime and discuss its realization in AI-augmented DPSs.
arXiv Detail & Related papers (2025-03-01T19:59:25Z)
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control. We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z)
Deep Neural Network for Constraint Acquisition through Tailored Loss Function [0.0]
The significance of learning constraints from data is underscored by its potential applications in real-world problem-solving. This work introduces a novel approach grounded in Deep Neural Network (DNN) based on Symbolic Regression.
arXiv Detail & Related papers (2024-03-04T13:47:33Z)
A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm. Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources. We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z)
Generating Dispatching Rules for the Interrupting Swap-Allowed Blocking Job Shop Problem Using Graph Neural Network and Reinforcement Learning [21.021840570685264]
The interrupting swap-allowed blocking job shop problem (ISBJSSP) is able to model many manufacturing planning and logistics applications realistically. We introduce a dynamic disjunctive graph formulation characterized by nodes and edges subjected to continuous deletions and additions. A simulator is developed to simulate interruption, swapping, and blocking in the ISBJSSP setting.
arXiv Detail & Related papers (2023-02-05T23:35:21Z)
Augmented Lagrangian Methods for Time-varying Constrained Online Convex Optimization [1.662966122370634]
We consider online convex optimization (OCO) with time-varying loss and constraint functions. We first develop a class of model-based augmented Lagrangian methods (MALM) for time-varying functional constrained OCO. Numerical results for several examples of constrained OCO are presented to demonstrate the efficiency of the proposed algorithms.
arXiv Detail & Related papers (2022-05-19T14:03:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.