Related papers: Tree of Thoughts: Deliberate Problem Solving with Large Language Models

URL: http://arxiv.org/abs/2305.10601v2
Date: Sun, 3 Dec 2023 22:50:35 GMT
Title: Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
Abstract summary: We introduce a new framework for language model inference, Tree of Thoughts (ToT). ToT generalizes over the popular Chain of Thought approach to prompting language models. Our experiments show that ToT significantly enhances language models' problem-solving abilities.
Score: 52.31950122881687
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: https://github.com/princeton-nlp/tree-of-thought-llm.
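The search loop described in the abstract (propose several candidate thoughts, score them, keep the promising ones, and continue) can be illustrated on Game of 24. The following is a minimal sketch and not the authors' implementation: the language model is replaced by a deterministic enumerator (`propose`) and a crude distance heuristic (`evaluate`), and the names `propose`, `evaluate`, `solve`, and `beam_width` are all hypothetical. For the actual prompts and search code, see the repository linked above.

```python
from fractions import Fraction
from itertools import combinations

TARGET = Fraction(24)

def propose(state):
    """Stand-in for the LM's thought generator (assumption: exhaustive
    enumeration). Each thought combines two remaining numbers into one."""
    nums, steps = state
    children = []
    for i, j in combinations(range(len(nums)), 2):
        a, b = nums[i], nums[j]
        rest = tuple(n for k, n in enumerate(nums) if k not in (i, j))
        candidates = [(a + b, f"{a} + {b}"), (a * b, f"{a} * {b}"),
                      (a - b, f"{a} - {b}"), (b - a, f"{b} - {a}")]
        if b != 0:
            candidates.append((a / b, f"{a} / {b}"))
        if a != 0:
            candidates.append((b / a, f"{b} / {a}"))
        for value, expr in candidates:
            children.append((rest + (value,), steps + (f"{expr} = {value}",)))
    return children

def evaluate(state):
    """Stand-in for the LM's self-evaluation (assumption: a crude heuristic).
    Higher is better: how close the nearest remaining number is to 24."""
    nums, _ = state
    return -float(min(abs(n - TARGET) for n in nums))

def solve(numbers, beam_width=5):
    """Breadth-first search over thoughts, keeping the best `beam_width`
    partial solutions at every depth. Returns the steps, or None."""
    frontier = [(tuple(Fraction(n) for n in numbers), ())]
    for _ in range(len(numbers) - 1):
        children = [c for s in frontier for c in propose(s)]
        children.sort(key=evaluate, reverse=True)
        frontier = children[:beam_width]
    for nums, steps in frontier:
        if nums == (TARGET,):
            return list(steps)
    return None

if __name__ == "__main__":
    # A beam wide enough to hold every state makes the search exhaustive.
    print(solve([4, 9, 10, 13], beam_width=10_000))
```

With a narrow beam the heuristic may prune the only path to a solution, so a `None` result then does not prove the puzzle is unsolvable. This is where a stronger evaluator, such as the language model's own judgment in ToT, is expected to matter.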

Related papers

TypedThinker: Typed Thinking Improves Large Language Model Reasoning [44.8904486513791]
We propose TypedThinker, a framework that enhances Large Language Models' problem-solving abilities. TypedThinker addresses two key challenges: selecting appropriate reasoning types for given problems and effectively implementing specific reasoning types. Experimental results demonstrate significant improvements over baseline models, with accuracy increases of 3.4% for Mistral 7B and 16.7% for LLaMA3 8B.
arXiv Detail & Related papers (2024-10-02T18:54:45Z)
BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense [0.04096453902709291]
This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. The dataset comprises multi-choice questions that challenge models to think "outside of the box". Our best method achieves an overall accuracy of 85 percent on the sentence puzzles subtask.
arXiv Detail & Related papers (2024-06-07T14:01:56Z)
Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models [0.0]
We formalize a planning-based approach to perform multi-step problem solving with language models. We demonstrate a superior success rate of 89.4% on the Game of 24 task as compared to existing approaches.
arXiv Detail & Related papers (2024-04-29T18:51:17Z)
Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models [48.43678591317425]
Boosting of Thoughts (BoT) is an automated prompting framework for problem solving with Large Language Models. We show that BoT consistently achieves higher or comparable problem-solving rates than other advanced prompting approaches.
arXiv Detail & Related papers (2024-02-17T00:13:36Z)
Code-Switched Language Identification is Harder Than You Think [69.63439391717691]
Code switching is a common phenomenon in written and spoken communication. We look at the application of building CS corpora. We make the task more realistic by scaling it to more languages. We reformulate the task as a sentence-level multi-label tagging problem to make it more tractable.
arXiv Detail & Related papers (2024-02-02T15:38:47Z)
How FaR Are Large Language Models From Agents with Theory-of-Mind? [69.41586417697732]
We propose a new evaluation paradigm for large language models (LLMs): Thinking for Doing (T4D). T4D requires models to connect inferences about others' mental states to actions in social scenarios. We introduce a zero-shot prompting framework, Foresee and Reflect (FaR), which provides a reasoning structure that encourages LLMs to anticipate future challenges.
arXiv Detail & Related papers (2023-10-04T06:47:58Z)
SCREWS: A Modular Framework for Reasoning with Revisions [58.698199183147935]
We present SCREWS, a modular framework for reasoning with revisions. We show that SCREWS unifies several previous approaches under a common framework. We evaluate our framework with state-of-the-art LLMs on a diverse set of reasoning tasks.
arXiv Detail & Related papers (2023-09-20T15:59:54Z)
Making Large Language Models Better Reasoners with Step-Aware Verifier [49.16750018427259]
DIVERSE (Diverse Verifier on Reasoning Step) is a novel approach that further enhances the reasoning capability of language models. We evaluate DIVERSE on the latest language model code-davinci and show that it achieves new state-of-the-art results on six of eight reasoning benchmarks.
arXiv Detail & Related papers (2022-06-06T03:38:36Z)
Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z)
Learning to Generalize for Sequential Decision Making [19.075378799280728]
We introduce a teacher-student imitation learning methodology and a means of converting a reinforcement learning model into a natural language understanding model. We show that models can learn faster and generalize more, leveraging both the imitation learning and the reformulation.
arXiv Detail & Related papers (2020-10-05T18:00:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
