Re-evaluating LLM-based Heuristic Search: A Case Study on the 3D Packing Problem
- URL: http://arxiv.org/abs/2509.02297v1
- Date: Tue, 02 Sep 2025 13:18:47 GMT
- Title: Re-evaluating LLM-based Heuristic Search: A Case Study on the 3D Packing Problem
- Authors: Guorui Quan, Mingfei Sun, Manuel López-Ibáñez
- Abstract summary: Large Language Models can generate code for search heuristics, but their application has largely been confined to adjusting simple functions within human-crafted frameworks. We tasked an LLM with building a complete solver for the constrained 3D Packing Problem. Our findings highlight two major barriers to automated heuristic design with current LLMs.
- Score: 3.473102563471572
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The art of heuristic design has traditionally been a human pursuit. While Large Language Models (LLMs) can generate code for search heuristics, their application has largely been confined to adjusting simple functions within human-crafted frameworks, leaving their capacity for broader innovation an open question. To investigate this, we tasked an LLM with building a complete solver for the constrained 3D Packing Problem. Direct code generation quickly proved fragile, prompting us to introduce two supports: constraint scaffolding--prewritten constraint-checking code--and iterative self-correction--additional refinement cycles to repair bugs and produce a viable initial population. Notably, even within a vast search space in a greedy process, the LLM concentrated its efforts almost exclusively on refining the scoring function. This suggests that the emphasis on scoring functions in prior work may reflect not a principled strategy, but rather a natural limitation of LLM capabilities. The resulting heuristic was comparable to a human-designed greedy algorithm, and when its scoring function was integrated into a human-crafted metaheuristic, its performance rivaled established solvers, though its effectiveness waned as constraints tightened. Our findings highlight two major barriers to automated heuristic design with current LLMs: the engineering required to mitigate their fragility in complex reasoning tasks, and the influence of pretrained biases, which can prematurely narrow the search for novel solutions.
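The abstract's central observation is that the LLM concentrated on refining the scoring function inside a greedy placement loop. A minimal sketch of that setup, assuming a single container, axis-aligned boxes, candidate corner points, and an illustrative back-left-bottom scoring rule (none of these details are from the paper itself):

```python
# Illustrative greedy 3D packing skeleton: a scoring function ranks
# feasible (position, orientation) placements; swapping out score() is
# the kind of refinement the paper reports the LLM focusing on.
# All names and the scoring formula here are assumptions for illustration.
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    w: int
    d: int
    h: int


def fits(pos, box, container, placed):
    """Check container bounds and overlap against already-placed boxes."""
    x, y, z = pos
    if x + box.w > container.w or y + box.d > container.d or z + box.h > container.h:
        return False
    for (px, py, pz), pb in placed:
        # axis-aligned boxes overlap iff they overlap on all three axes
        if (x < px + pb.w and px < x + box.w and
                y < py + pb.d and py < y + box.d and
                z < pz + pb.h and pz < z + box.h):
            return False
    return True


def score(pos, box):
    # Hypothetical scoring function: prefer low, back-left placements
    # (heaviest penalty on height z, then depth y, then width x).
    x, y, z = pos
    return -(z * 10000 + y * 100 + x)


def greedy_pack(items, container):
    """Place each item at the best-scoring feasible corner point."""
    placed = []
    candidates = {(0, 0, 0)}  # corner points where a box may be anchored
    for box in items:
        best = None
        for pos in sorted(candidates):
            for w, d, h in sorted(set(permutations((box.w, box.d, box.h)))):
                oriented = Box(w, d, h)
                if fits(pos, oriented, container, placed):
                    s = score(pos, oriented)
                    if best is None or s > best[0]:
                        best = (s, pos, oriented)
        if best is None:
            continue  # item cannot be placed; skip it
        _, (x, y, z), oriented = best
        placed.append(((x, y, z), oriented))
        candidates.discard((x, y, z))
        # new candidate corners opened up by this placement
        candidates.update({(x + oriented.w, y, z),
                           (x, y + oriented.d, z),
                           (x, y, z + oriented.h)})
    return placed
```

In this framing, the constraint-checking code (`fits`) corresponds to the paper's constraint scaffolding, while `score` is the small component the LLM kept rewriting rather than restructuring the surrounding greedy loop.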
Related papers
- G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design [7.681872137995253]
We propose a generative evolutionary framework that extends Large Neighborhood Search (LNS) operators. G-LNS leverages Large Language Models (LLMs) to co-evolve tightly coupled pairs of destroy and repair operators. Experiments show that G-LNS significantly outperforms LLM-based AHD methods as well as strong classical solvers.
arXiv Detail & Related papers (2026-02-09T04:13:35Z) - An Empirical Study of Reasoning Steps in Thinking Code LLMs [8.653365851909745]
Thinking Large Language Models generate explicit intermediate reasoning traces before final answers. This study examines the reasoning process and quality of thinking LLMs for code generation.
arXiv Detail & Related papers (2025-11-08T06:18:48Z) - LLM Agents Beyond Utility: An Open-Ended Perspective [50.809163251551894]
We augment a pretrained LLM agent with the ability to generate its own tasks, accumulate knowledge, and interact extensively with its environment. It can reliably follow complex multi-step instructions, store and reuse information across runs, and propose and solve its own tasks. It remains sensitive to prompt design, prone to repetitive task generation, and unable to form self-representations.
arXiv Detail & Related papers (2025-10-16T10:46:54Z) - LLM4CMO: Large Language Model-aided Algorithm Design for Constrained Multiobjective Optimization [54.83882149157548]
Large language models (LLMs) offer new opportunities for assisting with algorithm design. We propose LLM4CMO, a novel CMOEA based on a dual-population, two-stage framework. LLMs can serve as efficient co-designers in the development of complex evolutionary optimization algorithms.
arXiv Detail & Related papers (2025-08-16T02:00:57Z) - ReflecSched: Solving Dynamic Flexible Job-Shop Scheduling via LLM-Powered Hierarchical Reflection [4.101501114944147]
ReflecSched is a framework that empowers the LLM to act as more than a direct scheduler. It distills simulations across multiple planning horizons into a concise, natural-language summary. This summary is then integrated into the prompt of a final decision-making module, guiding it to produce non-myopic actions.
arXiv Detail & Related papers (2025-08-03T11:26:35Z) - ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research [56.961539386979354]
We introduce ORMind, a cognitive-inspired framework that enhances optimization through counterfactual reasoning. Our approach emulates human cognition, implementing an end-to-end workflow that transforms requirements into mathematical models and executable code. It is currently being tested internally in Lenovo's AI Assistant, with plans to enhance optimization capabilities for both business and consumer customers.
arXiv Detail & Related papers (2025-06-02T05:11:21Z) - RedAHD: Reduction-Based End-to-End Automatic Heuristic Design with Large Language Models [14.544461392180668]
We propose a novel end-to-end framework, named RedAHD, that enables these LLM-based design methods to operate without the need for human involvement. More specifically, RedAHD employs LLMs to automate the process of reduction, i.e., transforming the COP at hand into similar COPs that are better understood. Our experimental results, evaluated on six COPs, show that RedAHD achieves competitive or improved results over state-of-the-art methods with minimal human involvement.
arXiv Detail & Related papers (2025-05-26T17:21:16Z) - ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution [33.252158560173655]
ConceptAgent is a natural language-driven robotic platform designed for task execution in unstructured environments.
We present innovations designed to limit shortcomings, including 1) Predicate Grounding to prevent and recover from infeasible actions, and 2) an embodied version of LLM-guided Monte Carlo Tree Search with self-reflection.
arXiv Detail & Related papers (2024-10-08T15:05:40Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - Trust the PRoC3S: Solving Long-Horizon Robotics Problems with LLMs and Constraint Satisfaction [38.683780057806516]
Recent developments in pretrained large language models (LLMs) applied to robotics have demonstrated their capacity for sequencing a set of discrete skills to achieve open-ended goals in simple robotic tasks.
In this paper, we examine the topic of LLM planning for a set of continuously parameterized skills whose execution must avoid violations of a set of kinematic, geometric, and physical constraints.
Experiments across three different simulated 3D domains demonstrate that our proposed strategy, PRoC3S, is capable of solving a wide range of complex manipulation tasks with realistic constraints on continuous parameters much more efficiently and effectively than existing baselines.
arXiv Detail & Related papers (2024-06-08T20:56:14Z) - Accelerate Presolve in Large-Scale Linear Programming via Reinforcement Learning [92.31528918811007]
We propose a simple and efficient reinforcement learning framework -- namely, reinforcement learning for presolve (RL4Presolve) -- to tackle (P1)-(P3) simultaneously.
Experiments on two solvers and eight benchmarks (real-world and synthetic) demonstrate that RL4Presolve significantly and consistently improves the efficiency of solving large-scale LPs.
arXiv Detail & Related papers (2023-10-18T09:51:59Z)