Related papers: Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement

URL: http://arxiv.org/abs/2502.02573v1
Date: Tue, 04 Feb 2025 18:47:31 GMT
Title: Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement
Authors: Soheil Abbasloo
Abstract summary: Large Language Models (LLMs) have demonstrated impressive capabilities across numerous fields. This paper explores the proficiency of LLMs in handling Sequential Optimization Problems (SOPs). We introduce WorldGen, a dynamic framework for generating unseen SOPs with controllable complexities. Inspired by the influential framework of Hegelian Dialectics, we propose ACE, demonstrating how the performance of LLMs in SOP contexts can be significantly improved.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across numerous fields, presenting an opportunity to revolutionize optimization problem-solving, a crucial, ubiquitous, and complex domain. This paper explores the proficiency of LLMs in handling Sequential Optimization Problems (SOPs). We introduce WorldGen, a dynamic framework for generating unseen SOPs with controllable complexities, to evaluate LLM performance. Our initial observations reveal that while LLMs perform well on simple SOPs, their performance significantly degrades with increased complexity. Motivated by this, we revisit philosophical hypotheses on reasoning to enhance LLM performance. Inspired by the influential framework of Hegelian Dialectics, we propose ACE, demonstrating how the performance of LLMs in SOP contexts can be significantly improved without any retraining or further fine-tuning.

Related papers

OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems [19.586884180343038]
OPT-BENCH is a benchmark designed to evaluate Large Language Models (LLMs) on large-scale search space optimization problems. OPT-Agent emulates human reasoning when tackling complex problems by generating, validating, and iteratively improving solutions through historical feedback.
arXiv Detail & Related papers (2025-06-12T14:46:41Z)
Large Language Models as Particle Swarm Optimizers [0.0]
In LMPSO, the velocity of each particle is represented as a prompt that generates the next candidate solution. The proposed LMPSO approach is evaluated across multiple problem domains, including the Traveling Salesman Problem (TSP). Experimental results demonstrate that LMPSO is particularly effective for solving problems where solutions are represented as structured sequences.
arXiv Detail & Related papers (2025-04-12T15:04:13Z)
Process-based Self-Rewarding Language Models [47.119444722849025]
Large Language Models have demonstrated outstanding performance across various downstream tasks and have been widely applied in multiple scenarios. Human-annotated preference data is used for training to further improve LLMs' performance, which is constrained by the upper limit of human performance. We propose the Process-based Self-Rewarding pipeline for language models, which introduces long-thought reasoning, step-wise LLM-as-a-Judge, and step-wise preference optimization.
arXiv Detail & Related papers (2025-03-05T18:58:44Z)
Can Large Language Models Be Trusted as Black-Box Evolutionary Optimizers for Combinatorial Problems? [8.082897040940447]
Large Language Models (LLMs) offer a game-changing solution with their extensive knowledge and could democratize the optimization paradigm. It is therefore imperative to evaluate the suitability of LLMs as an evolutionary mechanism (EVO).
arXiv Detail & Related papers (2025-01-25T05:19:19Z)
EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty. We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications. Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding the LLM decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving planning capabilities of large language models (LLMs). We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios. We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z)
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs). We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043]
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans. We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems. Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate, and also significantly decreases the interaction steps of agents.
arXiv Detail & Related papers (2024-05-23T08:33:19Z)
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of Large Language Models. It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop. Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
Exploring the True Potential: Evaluating the Black-box Optimization Capability of Large Language Models [32.859634302766146]
Large language models (LLMs) have demonstrated exceptional performance in natural language processing tasks. This paper endeavors to offer deep insights into the potential of LLMs in optimization. Our findings reveal both the limitations and advantages of LLMs in optimization.
arXiv Detail & Related papers (2024-04-09T13:17:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.