Related papers: Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models

Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models

URL: http://arxiv.org/abs/2404.19055v1
Date: Mon, 29 Apr 2024 18:51:17 GMT
Title: Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models
Authors: Houjun Liu,
Abstract summary: We formalize a planning-based approach to perform multi-step problem solving with language models. We demonstrate a superior success rate of 89.4% on the Game of 24 task as compared to existing approaches.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: While language models (LMs) offer significant capability in zero-shot reasoning tasks across a wide range of domains, they do not perform satisfactorily in problems which requires multi-step reasoning. Previous approaches to mitigate this involves breaking a larger, multi-step task into sub-tasks and asking the language model to generate proposals ("thoughts") for each sub-task and using exhaustive planning approaches such as DFS to compose a solution. In this work, we leverage this idea to introduce two new contributions: first, we formalize a planning-based approach to perform multi-step problem solving with LMs via Partially Observable Markov Decision Processes (POMDPs), with the LM's own reflections about the value of a state used as a search heuristic; second, leveraging the online POMDP solver POMCP, we demonstrate a superior success rate of 89.4% on the Game of 24 task as compared to existing approaches while also offering better anytime performance characteristics than fixed tree-search which is used previously. Taken together, these contributions allow modern LMs to decompose and solve larger-scale reasoning tasks more effectively.

Related papers

Large Language Models as Common-Sense Heuristics [0.9093413254392775]
Large Language Models (LLMs) possess parametrised knowledge across a wide range of topics, enabling them to leverage the natural language descriptions of planning tasks in their solutions. We introduce a novel planning method, which leverages the parametrised knowledge of LLMs by using their output as a for Hill-Climbing Search. Our method outperforms the task success rate of similar systems within a common household environment by 22 percentage points, with consistently executable plans.
arXiv Detail & Related papers (2025-01-31T00:26:38Z)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks. We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model. Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Meta Reasoning for Large Language Models [58.87183757029041]
We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs) MRP guides LLMs to dynamically select and apply different reasoning methods based on the specific requirements of each task. We evaluate the effectiveness of MRP through comprehensive benchmarks.
arXiv Detail & Related papers (2024-06-17T16:14:11Z)
Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions [48.251724997889184]
We develop a benchmark called Problems with Missing and Contradictory conditions (PMC) We introduce two novel metrics to evaluate the performance of few-shot prompting methods in these scenarios. We propose a novel few-shot prompting method called SMT-LIB Prompting (SLP), which utilizes the SMT-LIB language to model the problems instead of solving them directly.
arXiv Detail & Related papers (2024-06-07T16:24:12Z)
Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning [68.83624133567213]
We show that most prevalent MLLMs can be easily fooled by the introduction of a presupposition into the question. We also propose a simple yet effective method, Active Deduction (AD), to encourage the model to actively perform composite deduction.
arXiv Detail & Related papers (2024-04-19T15:53:27Z)
Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models [62.96551299003463]
We propose textbftextitThought Propagation (TP) to enhance the complex reasoning ability of Large Language Models. TP first prompts LLMs to propose and solve a set of analogous problems that are related to the input one. TP reuses the results of analogous problems to directly yield a new solution or derive a knowledge-intensive plan for execution to amend the initial solution obtained from scratch.
arXiv Detail & Related papers (2023-10-06T01:40:09Z)
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs) We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.