Combinatorial Optimization via LLM-driven Iterated Fine-tuning
- URL: http://arxiv.org/abs/2503.06917v2
- Date: Fri, 14 Mar 2025 00:16:29 GMT
- Title: Combinatorial Optimization via LLM-driven Iterated Fine-tuning
- Authors: Pranjal Awasthi, Sreenivas Gollapudi, Ravi Kumar, Kamesh Munagala,
- Abstract summary: We present a novel way to integrate flexible, context-dependent constraints into optimization by leveraging Large Language Models (LLMs)<n>Our framework balances locally constraints with rigorous global optimization more effectively than baseline sampling methods.
- Score: 47.66752049943335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel way to integrate flexible, context-dependent constraints into combinatorial optimization by leveraging Large Language Models (LLMs) alongside traditional algorithms. Although LLMs excel at interpreting nuanced, locally specified requirements, they struggle with enforcing global combinatorial feasibility. To bridge this gap, we propose an iterated fine-tuning framework where algorithmic feedback progressively refines the LLM's output distribution. Interpreting this as simulated annealing, we introduce a formal model based on a "coarse learnability" assumption, providing sample complexity bounds for convergence. Empirical evaluations on scheduling, graph connectivity, and clustering tasks demonstrate that our framework balances the flexibility of locally expressed constraints with rigorous global optimization more effectively compared to baseline sampling methods. Our results highlight a promising direction for hybrid AI-driven combinatorial reasoning.
Related papers
- RL-finetuning LLMs from on- and off-policy data with a single algorithm [53.70731390624718]
We introduce a novel reinforcement learning algorithm (AGRO) for fine-tuning large-language models.
AGRO leverages the concept of generation consistency, which states that the optimal policy satisfies the notion of consistency across any possible generation of the model.
We derive algorithms that find optimal solutions via the sample-based policy gradient and provide theoretical guarantees on their convergence.
arXiv Detail & Related papers (2025-03-25T12:52:38Z) - Hint Marginalization for Improved Reasoning in Large Language Models [24.67507932821155]
We present Marginalization, a novel and principled algorithmic framework to enhance the reasoning capabilities of Large Language Models (LLMs)<n>Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers.<n> Empirical evaluation on several benchmark datasets for arithmetic reasoning demonstrates the superiority of the proposed approach.
arXiv Detail & Related papers (2024-12-17T19:45:53Z) - Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z) - LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z) - Evaluation of Prosumer Networks for Peak Load Management in Iran: A Distributed Contextual Stochastic Optimization Approach [0.0]
This paper introduces a novel prosumers network framework aimed at mitigating peak loads in Iran.
A cost-oriented integrated prediction and optimization approach is proposed, empowering prosumers to make informed decisions.
Numerical results highlights that integrating prediction with optimization and implementing a contextual information-sharing network among prosumers significantly reduces peak loads as well as total costs.
arXiv Detail & Related papers (2024-08-31T16:09:38Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found in a wide range of machine learning tasks, such as in-variant learning tasks, AUPRC, andAML.
This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Constrained Combinatorial Optimization with Reinforcement Learning [0.30938904602244344]
This paper presents a framework to tackle constrained optimization problems using deep Reinforcement Learning (RL)
We extend the Neural Combinatorial Optimization (NCO) theory in order to deal with constraints in its formulation.
In that context, the solution is iteratively constructed based on interactions with the environment.
arXiv Detail & Related papers (2020-06-22T03:13:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.