UNCO: Towards Unifying Neural Combinatorial Optimization through Large Language Model
- URL: http://arxiv.org/abs/2408.12214v1
- Date: Thu, 22 Aug 2024 08:42:44 GMT
- Title: UNCO: Towards Unifying Neural Combinatorial Optimization through Large Language Model
- Authors: Xia Jiang, Yaoxin Wu, Yuan Wang, Yingqian Zhang
- Abstract summary: We propose a unified neural combinatorial optimization (UNCO) framework to solve different types of combinatorial optimization problems (COPs) with a single model.
We use natural language to formulate text-attributed instances for different COPs and encode them in the same embedding space with a large language model (LLM).
Experiments show that the UNCO model can solve multiple COPs after a single training session and achieves satisfactory performance comparable to several traditional or learning-based baselines.
- Score: 21.232626415696267
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, applying neural networks to address combinatorial optimization problems (COPs) has attracted considerable research attention. The prevailing methods always train deep models independently on specific problems, lacking a unified framework for concurrently tackling various COPs. To this end, we propose a unified neural combinatorial optimization (UNCO) framework to solve different types of COPs by a single model. Specifically, we use natural language to formulate text-attributed instances for different COPs and encode them in the same embedding space by the large language model (LLM). The obtained embeddings are further advanced by an encoder-decoder model without any problem-specific modules, thereby facilitating a unified process of solution construction. We further adopt the conflict gradients erasing reinforcement learning (CGERL) algorithm to train the UNCO model, delivering better performance across different COPs than vanilla multi-objective learning. Experiments show that the UNCO model can solve multiple COPs after a single-session training, and achieves satisfactory performance that is comparable to several traditional or learning-based baselines. Instead of pursuing the best performance for each COP, we explore the synergy between tasks and few-shot generalization based on LLM to inspire future work.
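The abstract names conflict gradients erasing (CGERL) but does not spell out the update rule. As a rough illustration of the general idea of removing conflicting gradient components in multi-task training, the sketch below uses a PCGrad-style projection, which is a swapped-in technique and an assumption here, not the authors' confirmed algorithm. The function name `erase_conflicts` and the fixed task ordering are hypothetical choices made for this example.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def erase_conflicts(task_grads: Sequence[Sequence[float]]) -> Vector:
    """Sum per-task gradients after projecting out conflicting components.

    Assumption: two task gradients "conflict" when their inner product is
    negative. In that case the component of one gradient that points against
    the other is removed (PCGrad-style projection). This is an illustrative
    stand-in, not the CGERL rule from the paper.
    """
    adjusted = []
    for i, g_i in enumerate(task_grads):
        g = list(g_i)
        for j, g_j in enumerate(task_grads):
            if i == j:
                continue
            overlap = dot(g, g_j)
            norm_sq = dot(g_j, g_j)
            # Project only when the gradients point in opposing directions.
            if overlap < 0 and norm_sq > 0:
                scale = overlap / norm_sq
                g = [x - scale * y for x, y in zip(g, g_j)]
        adjusted.append(g)
    dim = len(adjusted[0])
    # The shared model would be updated with the sum of adjusted gradients.
    return [sum(g[k] for g in adjusted) for k in range(dim)]


# Two hypothetical tasks (e.g. a routing COP and a packing COP) with
# conflicting gradients on a two-parameter model.
combined = erase_conflicts([[1.0, 0.0], [-1.0, 1.0]])
print(combined)
```

Gradients that do not conflict pass through unchanged, so in that case the result is the plain sum that vanilla multi-task training would use.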
Related papers
- Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo [90.78001821963008]
A wide range of LM applications require generating text that conforms to syntactic or semantic constraints.
We develop an architecture for controlled LM generation based on sequential Monte Carlo (SMC).
Our system builds on the framework of Lew et al. (2023) and integrates with its language model probabilistic programming language.
arXiv Detail & Related papers (2025-04-17T17:49:40Z) - Large Language Models as Particle Swarm Optimizers [0.0]
In LMPSO, the velocity of each particle is represented as a prompt that generates the next candidate solution.
The proposed LMPSO approach is evaluated across multiple problem domains, including the Traveling Salesman Problem (TSP).
Experimental results demonstrate that LMPSO is particularly effective for solving problems where solutions are represented as structured sequences.
arXiv Detail & Related papers (2025-04-12T15:04:13Z) - Large Language Models for Combinatorial Optimization of Design Structure Matrix [4.513609458468522]
Combinatorial optimization (CO) is essential for improving efficiency and performance in engineering applications.
When it comes to real-world engineering problems, algorithms based on pure mathematical reasoning are limited and incapable of capturing the contextual nuances necessary for optimization.
This study explores the potential of Large Language Models (LLMs) in solving engineering CO problems by leveraging their reasoning power and contextual knowledge.
arXiv Detail & Related papers (2024-11-19T15:39:51Z) - Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System [75.25394449773052]
Large Language Model (LLM) based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving.
Yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods.
We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness.
arXiv Detail & Related papers (2024-10-10T17:00:06Z) - DiffSG: A Generative Solver for Network Optimization with Diffusion Model [75.27274046562806]
Diffusion generative models can consider a broader range of solutions and exhibit stronger generalization by learning parameters.
We propose a new framework, which leverages intrinsic distribution learning of diffusion generative models to learn high-quality solutions.
arXiv Detail & Related papers (2024-08-13T07:56:21Z) - Large Language Model as a Catalyst: A Paradigm Shift in Base Station Siting Optimization [62.16747639440893]
Large language models (LLMs) and their associated technologies advance, particularly in the realms of prompt engineering and agent engineering.
Our proposed framework incorporates retrieval-augmented generation (RAG) to enhance the system's ability to acquire domain-specific knowledge and generate solutions.
arXiv Detail & Related papers (2024-08-07T08:43:32Z) - Solving General Natural-Language-Description Optimization Problems with Large Language Models [34.50671063271608]
We propose a novel framework called OptLLM that augments LLMs with external solvers.
OptLLM accepts user queries in natural language, converts them into mathematical formulations and programming code, and calls the solvers to calculate the results.
Some features of OptLLM framework have been available for trial since June 2023.
arXiv Detail & Related papers (2024-07-09T07:11:10Z) - Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization [15.476478159958416]
We employ a large language model (LLM) to enhance evolutionary search for solving constrained multi-objective optimization problems.
Our aim is to speed up the convergence of the evolutionary population.
arXiv Detail & Related papers (2024-05-09T13:44:04Z) - Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization [15.842155380912002]
This work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural combinatorial optimization.
In particular, we design a powerful yet lightweight instance-conditioned Routing adaptation module for the NCO model.
We develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution.
arXiv Detail & Related papers (2024-05-03T08:00:19Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - SparseLLM: Towards Global Pruning for Pre-trained Language Models [12.057369029549534]
We propose SparseLLM, a novel framework that redefines the global pruning process into manageable, coordinated subproblems.
SparseLLM's approach conceptualizes LLMs as a chain of modular functions and leverages auxiliary variables for problem decomposition.
It demonstrates significant performance improvements, particularly in high-sparsity regimes.
arXiv Detail & Related papers (2024-02-28T00:09:07Z) - CoLLiE: Collaborative Training of Large Language Models in an Efficient
Way [59.09824823710863]
CoLLiE is an efficient library that facilitates collaborative training of large language models.
With its modular design and comprehensive functionality, CoLLiE offers a balanced blend of efficiency, ease of use, and customization.
arXiv Detail & Related papers (2023-12-01T08:02:16Z) - Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) present the potential for achieving superior translation quality.
We propose Cooperative Decoding (CoDec) which treats NMT systems as a pretranslation model and MT-oriented LLMs as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z) - Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration [83.4031923134958]
Corex is a suite of novel general-purpose strategies that transform Large Language Models into autonomous agents.
Inspired by human behaviors, Corex is constituted by diverse collaboration paradigms including Debate, Review, and Retrieve modes.
We demonstrate that orchestrating multiple LLMs to work in concert yields substantially better performance compared to existing methods.
arXiv Detail & Related papers (2023-09-30T07:11:39Z) - Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble where the individual estimator is both accurate and negatively correlated.
arXiv Detail & Related papers (2022-12-14T07:35:20Z) - When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - On the Generalization of Neural Combinatorial Optimization Heuristics [0.7049738935364298]
We show that our proposed meta-learning approach significantly improves the generalization of two state-of-the-art models.
We formalize solving a CO problem over a given instance distribution as a separate learning task.
We investigate meta-learning techniques to learn a model on a variety of tasks, in order to optimize its capacity to adapt to new tasks.
arXiv Detail & Related papers (2022-06-01T22:39:35Z) - Multi-objective Pointer Network for Combinatorial Optimization [10.286195356515355]
Multi-objective combinatorial optimization problems (MOCOPs) exist in various real applications.
Deep reinforcement learning (DRL) methods have been proposed to generate approximate optimal solutions to the optimization problems.
This study proposes a single-model deep reinforcement learning framework, called multi-objective Pointer Network (MOPN).
arXiv Detail & Related papers (2022-04-25T14:02:34Z) - Pareto Set Learning for Neural Multi-objective Combinatorial Optimization [6.091096843566857]
Multiobjective combinatorial optimization (MOCO) problems can be found in many real-world applications.
We develop a learning-based approach to approximate the whole Pareto set for a given MOCO problem without further search procedure.
Our proposed method significantly outperforms some other methods on the multiobjective traveling salesman problem, multiobjective vehicle routing problem, and multiobjective knapsack problem in terms of solution quality, speed, and model efficiency.
arXiv Detail & Related papers (2022-03-29T09:26:22Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Reversible Action Design for Combinatorial Optimization with Reinforcement Learning [35.50454156611722]
Reinforcement learning (RL) has recently emerged as a new framework to tackle these problems.
We propose a general RL framework that not only exhibits state-of-the-art empirical performance but also generalizes to a wide variety of COPs.
arXiv Detail & Related papers (2021-02-14T18:05:42Z) - Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.