EHOP: A Dataset of Everyday NP-Hard Optimization Problems
- URL: http://arxiv.org/abs/2502.13776v1
- Date: Wed, 19 Feb 2025 14:39:59 GMT
- Title: EHOP: A Dataset of Everyday NP-Hard Optimization Problems
- Authors: Alex Duchnowski, Ellie Pavlick, Alexander Koller,
- Abstract summary: Everyday Hard Optimization Problems (EHOP) is a collection of NP-hard optimization problems expressed in natural language.
EHOP includes problem formulations that could be found in computer science textbooks, versions that are dressed up as problems that could arise in real life, and variants of well-known problems with inverted rules.
We find that state-of-the-art LLMs, across multiple prompting strategies, systematically solve textbook problems more accurately than their real-life and inverted counterparts.
- Score: 66.41749917354159
- License:
- Abstract: We introduce the dataset of Everyday Hard Optimization Problems (EHOP), a collection of NP-hard optimization problems expressed in natural language. EHOP includes problem formulations that could be found in computer science textbooks, versions that are dressed up as problems that could arise in real life, and variants of well-known problems with inverted rules. We find that state-of-the-art LLMs, across multiple prompting strategies, systematically solve textbook problems more accurately than their real-life and inverted counterparts. We argue that this constitutes evidence that LLMs adapt solutions seen during training, rather than leveraging reasoning abilities that would enable them to generalize to novel problems.
Related papers
- BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts [59.83547898874152]
We introduce BloomWise, a new prompting technique, inspired by Bloom's taxonomy, to improve the performance of Large Language Models (LLMs)
The decision regarding the need to employ more sophisticated cognitive skills is based on self-evaluation performed by the LLM.
In extensive experiments across 4 popular math reasoning datasets, we have demonstrated the effectiveness of our proposed approach.
arXiv Detail & Related papers (2024-10-05T09:27:52Z) - MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data [20.31528845718877]
Large language models (LLMs) have significantly advanced natural language understanding and demonstrated strong problem-solving abilities.
This paper investigates the mathematical problem-solving capabilities of LLMs using the newly developed "MathOdyssey" dataset.
arXiv Detail & Related papers (2024-06-26T13:02:35Z) - Eliciting Problem Specifications via Large Language Models [4.055489363682198]
Large language models (LLMs) can be utilized to map a problem class into a semi-formal specification.
A cognitive system can then use the problem-space specification to solve multiple instances of problems from the problem class.
arXiv Detail & Related papers (2024-05-20T16:19:02Z) - Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs [2.3020018305241337]
Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models.
We propose a novel approach to distill reasoning abilities from LLMs by leveraging their capacity to explain solutions.
Our experiments demonstrate that learning from explanations enables the Reasoner to more effectively guide program implementation by a Coder.
arXiv Detail & Related papers (2024-04-11T22:19:50Z) - Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models [62.96551299003463]
We propose textbftextitThought Propagation (TP) to enhance the complex reasoning ability of Large Language Models.
TP first prompts LLMs to propose and solve a set of analogous problems that are related to the input one.
TP reuses the results of analogous problems to directly yield a new solution or derive a knowledge-intensive plan for execution to amend the initial solution obtained from scratch.
arXiv Detail & Related papers (2023-10-06T01:40:09Z) - Automatically Correcting Large Language Models: Surveying the landscape
of diverse self-correction strategies [104.32199881187607]
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output.
This paper presents a comprehensive review of this emerging class of techniques.
arXiv Detail & Related papers (2023-08-06T18:38:52Z) - Learning Adaptive Evolutionary Computation for Solving Multi-Objective
Optimization Problems [3.3266268089678257]
This paper proposes a framework that integrates MOEAs with adaptive parameter control using Deep Reinforcement Learning (DRL)
The DRL policy is trained to adaptively set the values that dictate the intensity and probability of mutation for solutions during optimization.
We show the learned policy is transferable, i.e., the policy trained on a simple benchmark problem can be directly applied to solve the complex warehouse optimization problem.
arXiv Detail & Related papers (2022-11-01T22:08:34Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across non-family problems.
We show that for weakly-ized objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - Learning Hard Optimization Problems: A Data Generation Perspective [44.4747903763245]
This paper demonstrates the volatility of the training data to the ability to approximate it.
It proposes a method for producing (exact or approximate) solutions to optimization problems that are amenable to supervised tasks.
arXiv Detail & Related papers (2021-06-04T17:03:44Z) - Total Deep Variation for Linear Inverse Problems [71.90933869570914]
We propose a novel learnable general-purpose regularizer exploiting recent architectural design patterns from deep learning.
We show state-of-the-art performance for classical image restoration and medical image reconstruction problems.
arXiv Detail & Related papers (2020-01-14T19:01:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.