Related papers: LLMs for Cold-Start Cutting Plane Separator Configuration

LLMs for Cold-Start Cutting Plane Separator Configuration

URL: http://arxiv.org/abs/2412.12038v1
Date: Mon, 16 Dec 2024 18:03:57 GMT
Title: LLMs for Cold-Start Cutting Plane Separator Configuration
Authors: Connor Lawless, Yingxi Li, Anders Wikum, Madeleine Udell, Ellen Vitercik,
Abstract summary: Mixed integer linear programming solvers ship with a staggering number of parameters that are challenging to select a priori for all but expert optimization users.<n>Existing machine learning approaches to configure solvers require training ML models by solving thousands of related MILP instances, generalize poorly to new problem sizes, and often require implementing complex ML pipelines and custom solver interfaces.<n>We present a new LLM-based framework to configure which cutting plane separators to use for a given MILP problem with little to no training data based on characteristics of the instance.
Score: 19.931643536607737
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mixed integer linear programming (MILP) solvers ship with a staggering number of parameters that are challenging to select a priori for all but expert optimization users, but can have an outsized impact on the performance of the MILP solver. Existing machine learning (ML) approaches to configure solvers require training ML models by solving thousands of related MILP instances, generalize poorly to new problem sizes, and often require implementing complex ML pipelines and custom solver interfaces that can be difficult to integrate into existing optimization workflows. In this paper, we introduce a new LLM-based framework to configure which cutting plane separators to use for a given MILP problem with little to no training data based on characteristics of the instance, such as a natural language description of the problem and the associated LaTeX formulation. We augment these LLMs with descriptions of cutting plane separators available in a given solver, grounded by summarizing the existing research literature on separators. While individual solver configurations have a large variance in performance, we present a novel ensembling strategy that clusters and aggregates configurations to create a small portfolio of high-performing configurations. Our LLM-based methodology requires no custom solver interface, can find a high-performing configuration by solving only a small number of MILPs, and can generate the configuration with simple API calls that run in under a second. Numerical results show our approach is competitive with existing configuration approaches on a suite of classic combinatorial optimization problems and real-world datasets with only a fraction of the training data and computation time.

Related papers

FamilyTool: A Multi-hop Personalized Tool Use Benchmark [94.1158032740113]
We introduce FamilyTool, a novel benchmark grounded in a family-based knowledge graph (KG) FamilyTool challenges Large Language Models with queries spanning 1 to 3 relational hops. Experiments reveal significant performance gaps in state-of-the-art LLMs.
arXiv Detail & Related papers (2025-04-09T10:42:36Z)
Cost-Optimal Grouped-Query Attention for Long-Context LLMs [64.90662568387683]
Building effective Transformer-based large language models (LLMs) has recently become a research focus. We compare models with different parameter sizes, context lengths, and attention head configurations in terms of model performance, computational cost, and memory cost. Our studies show that, when processing sufficiently long sequences, a larger model with fewer attention heads can achieve a lower loss while incurring lower computational and memory costs.
arXiv Detail & Related papers (2025-03-12T17:50:42Z)
IMPROVE: Iterative Model Pipeline Refinement and Optimization Leveraging LLM Agents [17.301758094000125]
Large language model (LLM) agents have emerged as a promising solution to automate the development of computer vision models. We introduce Iterative Refinement, a novel strategy for LLM-driven ML pipeline design. Iterative Refinement improves stability, interpretability, and overall model performance.
arXiv Detail & Related papers (2025-02-25T01:52:37Z)
Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
Towards An Unsupervised Learning Scheme for Efficiently Solving Parameterized Mixed-Integer Programs [6.1860817947800655]
We train an autoencoder for binary variables in an unsupervised learning fashion. We present a strategy to construct a class of cutting plane constraints from the decoder parameters of an offline-trained AE. Their integration into the primal MIP problem leads to a tightened MIP with the reduced feasible region.
arXiv Detail & Related papers (2024-12-23T14:48:32Z)
PickLLM: Context-Aware RL-Assisted Large Language Model Routing [0.5325390073522079]
PickLLM is a lightweight framework that relies on Reinforcement Learning (RL) to route on-the-fly queries to available models. We demonstrate the speed of convergence for different learning rates and improvement in hard metrics such as cost per querying session and overall response latency.
arXiv Detail & Related papers (2024-12-12T06:27:12Z)
Magneto: Combining Small and Large Language Models for Schema Matching [8.387623375871055]
Small language models (SLMs) require training data and large language models (LLMs) often incur high computational costs.<n>We present Magneto, a cost-effective and accurate solution for schema matching.
arXiv Detail & Related papers (2024-12-11T08:35:56Z)
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models [16.16372459671255]
Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget. We propose a novel framework that integrates smaller auxiliary modules within each Feed-Forward Network layer of the LLM. We show that trained routers operate differently from oracles and often yield suboptimal solutions.
arXiv Detail & Related papers (2024-10-01T16:10:21Z)
Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA) We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity. We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z)
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs. We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer. This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
PySCIPOpt-ML: Embedding Trained Machine Learning Models into Mixed-Integer Programs [0.7661676407098753]
We introduce PySCIPOpt-ML, an open-source tool for embedding machine learning predictors into optimisation problems. By interfacing with a broad range of commonly used ML frameworks and an open-source MIP solver, PySCIPOpt-ML provides a way to easily integrate ML constraints into optimisation problems. We present computational results over SurrogateLIB, providing intuition on the scale of ML predictors that can be practically embedded.
arXiv Detail & Related papers (2023-12-13T11:36:55Z)
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses these challenges of federated fine-tuning LLMs, and introduces our package FS-LLM as a main contribution. We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios. We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z)
Automatic MILP Solver Configuration By Learning Problem Similarities [1.1585113506994469]
Mixed Linear Programs (MILP) solvers expose numerous configuration parameters to control their internal algorithms. We aim to predict configuration parameters for unseen problem instances that yield lower-cost solutions without the time overhead of searching-and-evaluating configurations. We show that instances that have similar costs using one solver configuration also have similar costs using another solver configuration in the same runtime environment.
arXiv Detail & Related papers (2023-07-02T21:31:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.