Evaluating LLMs for Combinatorial Optimization: One-Phase and Two-Phase Heuristics for 2D Bin-Packing
- URL: http://arxiv.org/abs/2509.22255v3
- Date: Thu, 02 Oct 2025 11:37:39 GMT
- Title: Evaluating LLMs for Combinatorial Optimization: One-Phase and Two-Phase Heuristics for 2D Bin-Packing
- Authors: Syed Mahbubul Huq, Daniel Brito, Daniel Sikar, Chris Child, Tillman Weyde, Rajesh Mojumder
- Abstract summary: This paper presents an evaluation framework for assessing Large Language Models' (LLMs) capabilities in optimization. We introduce a systematic methodology that combines LLMs with evolutionary algorithms to generate and refine solutions.
- Score: 0.6670498055582527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents an evaluation framework for assessing Large Language Models' (LLMs) capabilities in combinatorial optimization, specifically addressing the 2D bin-packing problem. We introduce a systematic methodology that combines LLMs with evolutionary algorithms to iteratively generate and refine heuristic solutions. Through comprehensive experiments comparing LLM-generated heuristics against traditional approaches (Finite First-Fit and Hybrid First-Fit), we demonstrate that LLMs can produce more efficient solutions while requiring fewer computational resources. Our evaluation reveals that GPT-4o achieves optimal solutions within two iterations, reducing average bin usage from 16 to 15 bins while improving space utilization from 0.76-0.78 to 0.83. This work contributes to understanding LLM evaluation in specialized domains and establishes benchmarks for assessing LLM performance in combinatorial optimization tasks.
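For reference, the Finite First-Fit and Hybrid First-Fit baselines named above are classical level-oriented packing heuristics. Below is a minimal sketch of a level (shelf) based first-fit packer, assuming rectangles are given as (width, height) pairs and all bins share a fixed size; the bin dimensions, tallest-first sorting rule, and utilization metric are illustrative assumptions rather than the paper's exact implementations.

```python
# Hypothetical sketch: a shelf-based first-fit packer for 2D bin packing.
# Bin size, item format, and the utilization metric are assumptions made
# for illustration; the paper's FFF/HFF implementations are not shown here.

def pack_first_fit(items, bin_w=100, bin_h=100):
    """Place (width, height) rectangles into fixed-size bins using shelves.

    Each bin is a list of shelves; a shelf records its y-offset, its height,
    and the x-position where the next item would be placed.
    """
    bins = []  # each bin: list of shelves, shelf = [y_offset, shelf_height, used_x]
    for w, h in sorted(items, key=lambda it: it[1], reverse=True):  # tallest first
        placed = False
        for shelves in bins:
            for shelf in shelves:                # try existing shelves (first fit)
                if shelf[2] + w <= bin_w and h <= shelf[1]:
                    shelf[2] += w
                    placed = True
                    break
            if placed:
                break
            top = shelves[-1][0] + shelves[-1][1]
            if top + h <= bin_h and w <= bin_w:  # open a new shelf in this bin
                shelves.append([top, h, w])
                placed = True
                break
        if not placed:
            bins.append([[0, h, w]])             # open a new bin
    return bins


def utilization(items, bins, bin_w=100, bin_h=100):
    """Total item area divided by total bin area (one common metric)."""
    return sum(w * h for w, h in items) / (len(bins) * bin_w * bin_h)


if __name__ == "__main__":
    rects = [(30, 40), (50, 25), (60, 70), (20, 20), (45, 45), (80, 10)]
    packed = pack_first_fit(rects)
    print(len(packed), "bins, utilization", round(utilization(rects, packed), 2))
```

In the paper's setup, the LLM proposes and the evolutionary loop refines heuristics of roughly this shape, scored by bin count and space utilization.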
Related papers
- SOCRATES: Simulation Optimization with Correlated Replicas and Adaptive Trajectory Evaluations [25.18297372152296]
SOCRATES is a novel two-stage procedure that automates the design of tailored SO algorithms. An ensemble of digital replicas of the real system is used as a testbed to evaluate a set of baseline SO algorithms. An LLM acts as a meta-optimizer, analyzing the performance trajectories of these algorithms to iteratively revise and compose a final, hybrid optimization schedule.
arXiv Detail & Related papers (2025-11-01T19:57:38Z) - LLM4CMO: Large Language Model-aided Algorithm Design for Constrained Multiobjective Optimization [54.83882149157548]
Large language models (LLMs) offer new opportunities for assisting with algorithm design. We propose LLM4CMO, a novel CMOEA based on a dual-population, two-stage framework. LLMs can serve as efficient co-designers in the development of complex evolutionary optimization algorithms.
arXiv Detail & Related papers (2025-08-16T02:00:57Z) - OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems [19.586884180343038]
OPT-BENCH is a benchmark designed to evaluate Large Language Models (LLMs) on large-scale search space optimization problems. OPT-Agent emulates human reasoning when tackling complex problems by generating, validating, and iteratively improving solutions through historical feedback (a minimal sketch of this loop appears after this list).
arXiv Detail & Related papers (2025-06-12T14:46:41Z) - GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization [0.4037357056611557]
Large Language Models (LLMs) can encode complex relationships in their latent spaces. We introduce LLM-based deep kernels, jointly optimized with GPs to preserve the benefits of both. Our method nearly doubles the discovery rate of high-performing reactions compared to static LLM embeddings.
arXiv Detail & Related papers (2025-04-08T17:59:57Z) - Make Optimization Once and for All with Fine-grained Guidance [78.14885351827232]
Learning to Optimize (L2O) enhances optimization efficiency with integrated neural networks. L2O paradigms achieve great outcomes, e.g., by refitting or by generating unseen solutions iteratively or directly. Our analyses explore a general framework for learning optimization, called Diff-L2O, focusing on augmenting solutions from a wider view.
arXiv Detail & Related papers (2025-03-14T14:48:12Z) - Starjob: Dataset for LLM-Driven Job Shop Scheduling [3.435169201271934]
We introduce Starjob, the first supervised dataset for the Job Shop Scheduling Problem (JSSP). We fine-tune the LLaMA 8B 4-bit quantized model with the LoRA method to develop an end-to-end scheduling approach (an illustrative fine-tuning configuration sketch appears after this list). Our evaluation on standard benchmarks demonstrates that the proposed method not only surpasses traditional Priority Dispatching Rules (PDRs) but also achieves notable improvements over state-of-the-art neural approaches like L2D.
arXiv Detail & Related papers (2025-02-26T15:20:01Z) - Can Large Language Models Be Trusted as Evolutionary Optimizers for Network-Structured Combinatorial Problems? [8.082897040940447]
Large Language Models (LLMs) have shown strong capabilities in language understanding and reasoning across diverse domains. In this work, we propose a systematic framework to evaluate the capability of LLMs to engage with problem structures. We adopt the commonly used evolutionary (EVO) framework and propose a comprehensive evaluation framework that rigorously assesses the output fidelity of LLM-based operators.
arXiv Detail & Related papers (2025-01-25T05:19:19Z) - LLM2: Let Large Language Models Harness System 2 Reasoning [65.89293674479907]
Large language models (LLMs) have exhibited impressive capabilities across a myriad of tasks, yet they occasionally yield undesirable outputs. We introduce LLM2, a novel framework that combines an LLM with a process-based verifier. The LLM is responsible for generating plausible candidates, while the verifier provides timely process-based feedback to distinguish desirable from undesirable outputs.
arXiv Detail & Related papers (2024-12-29T06:32:36Z) - LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z) - Search-Based LLMs for Code Optimization [16.843870288512363]
Code written by developers usually suffers from efficiency problems and contains various performance bugs.
Recent work regards the task as a sequence generation problem and resorts to deep learning (DL) techniques such as large language models (LLMs).
We propose a search-based LLM framework named SBLLM that enables iterative refinement and discovery of improved optimization methods.
arXiv Detail & Related papers (2024-08-22T06:59:46Z) - LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based optimizers and high-level LLM-based optimizers are complementary to each other and can effectively collaborate in a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z) - On Learning to Summarize with Large Language Models as References [101.79795027550959]
Summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved.
arXiv Detail & Related papers (2023-05-23T16:56:04Z)
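The iterative pattern shared by the 2D bin-packing paper above (LLM plus evolutionary refinement) and by OPT-Agent in OPT-BENCH is generate, validate, and refine using the scored history. The sketch below shows that loop only in outline: `ask_llm` is a stand-in for a real model call, and the candidate format, objective, and stopping rule are assumptions for illustration.

```python
# Illustrative generate-validate-refine loop with historical feedback.
# `ask_llm` is a placeholder for an actual LLM API call; the candidate
# format and the cost function are assumptions, not any paper's actual code.

def ask_llm(prompt):
    # Stand-in for a real LLM call; here it simply echoes the best candidate.
    return {"candidate": prompt["best_so_far"]}

def evaluate(candidate):
    # Placeholder objective (lower is better), e.g. bins used or makespan.
    return candidate.get("cost", float("inf"))

def optimize_with_feedback(initial, max_iters=5):
    history = []                                   # (candidate, score) trajectory
    best, best_score = initial, evaluate(initial)
    for _ in range(max_iters):
        prompt = {"best_so_far": best, "history": history}
        candidate = ask_llm(prompt)["candidate"]   # generate a new candidate
        score = evaluate(candidate)                # validate / score it
        history.append((candidate, score))         # feed the trajectory back
        if score < best_score:                     # keep strict improvements
            best, best_score = candidate, score
    return best, best_score

if __name__ == "__main__":
    print(optimize_with_feedback({"cost": 16}))    # e.g. start from a 16-bin plan
```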
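For the Starjob entry, 4-bit quantization of a LLaMA 8B model plus LoRA fine-tuning is commonly set up with the Hugging Face transformers and peft libraries; an illustrative configuration sketch follows. The model identifier, LoRA hyperparameters, and the omitted dataset and training code are assumptions, not the paper's actual setup, and running it requires a GPU with bitsandbytes installed.

```python
# Illustrative 4-bit + LoRA setup (assumed configuration, not Starjob's own).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Meta-Llama-3-8B"   # assumed; the abstract only says "LLaMA 8B"

quant_cfg = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quant_cfg)
tokenizer = AutoTokenizer.from_pretrained(model_id)

lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,        # assumed hyperparameters
    target_modules=["q_proj", "v_proj"],           # attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)            # only the LoRA adapters train
model.print_trainable_parameters()
# Supervised fine-tuning on tokenized JSSP instance/schedule pairs would
# follow, e.g. with the standard transformers Trainer.
```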