Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context
Reasoning with Language Models
- URL: http://arxiv.org/abs/2306.06891v1
- Date: Mon, 12 Jun 2023 06:34:16 GMT
- Title: Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context
Reasoning with Language Models
- Authors: Soochan Lee and Gunhee Kim
- Abstract summary: We propose a new inference framework called Recursion of Thought (RoT)
RoT introduces several special tokens that the models can output to trigger context-related operations.
Experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability to solve problems.
- Score: 58.41943058963672
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating intermediate steps, or Chain of Thought (CoT), is an effective way
to significantly improve language models' (LM) multi-step reasoning capability.
However, the CoT lengths can grow rapidly with the problem complexity, easily
exceeding the maximum context size. Instead of increasing the context limit,
which has already been heavily investigated, we explore an orthogonal
direction: making LMs divide a problem into multiple contexts. We propose a new
inference framework, called Recursion of Thought (RoT), which introduces
several special tokens that the models can output to trigger context-related
operations. Extensive experiments with multiple architectures including GPT-3
show that RoT dramatically improves LMs' inference capability to solve
problems, whose solution consists of hundreds of thousands of tokens.
Related papers
- Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution [59.39066657300045]
Chain-of-Thought (CoT) prompting enhances the reasoning of large language models (LLMs) by decomposing problems into sequential steps.
We propose Syzygy of Thoughts (SoT)-a novel framework that extends CoT by introducing auxiliary, interrelated reasoning paths.
SoT captures deeper logical dependencies, enabling more robust and structured problem-solving.
arXiv Detail & Related papers (2025-04-13T13:35:41Z) - Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability [18.54202114336492]
We investigate the approximation, learnability, and generalization properties of multi-round auto-regressive models.
We show that Transformers with finite context windows are universal approximators for steps of Turing-computable functions.
We extend PAC learning to sequence generation and demonstrate that multi-round generation is learnable even when the sequence length exceeds the model's context window.
arXiv Detail & Related papers (2025-03-05T02:50:55Z) - Multidimensional Consistency Improves Reasoning in Language Models [21.989335720239467]
We introduce a framework for testing models for answer consistency across multiple input variations.
We induce variations in (i) order of shots in prompt, (ii) problem phrasing, and (iii) languages used.
Our framework consistently enhances mathematical reasoning performance on both monolingual dataset GSM8K and multilingual dataset MGSM, especially for smaller models.
arXiv Detail & Related papers (2025-03-04T14:41:05Z) - Supervised Chain of Thought [5.389461633686935]
Chain of Thought (CoT) prompting offers a promising approach to solving complex reasoning tasks.
One-prompt-for-all approach poses significant challenges for models to generate the correct reasoning steps.
We show how task-specific supervision is essential for navigating the prompt space accurately and achieving optimal performance.
arXiv Detail & Related papers (2024-10-18T06:25:27Z) - Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective [6.911107705494142]
A common practice in large language model (LLM) usage is to sample a solution for the entire task within the model's context window.
Previous works have shown that subtask decomposition within the model's context is beneficial for solving such tasks.
arXiv Detail & Related papers (2024-09-26T16:34:35Z) - Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks.
We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture.
Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z) - ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs)
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z) - Structure Guided Prompt: Instructing Large Language Model in Multi-Step
Reasoning by Exploring Graph Structure of the Text [44.81698187939784]
This paper introduces Structure Guided Prompt, a framework designed to improve the multi-step reasoning capabilities of Large Language Models (LLMs)
Our experiments show that this framework significantly enhances the reasoning capabilities of LLMs, enabling them to excel in a broader spectrum of natural language scenarios.
arXiv Detail & Related papers (2024-02-20T22:56:23Z) - Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning [74.90592233107712]
We propose a Direct-Indirect Reasoning (DIR) method, which considers Direct Reasoning (DR) and Indirect Reasoning (IR) as multiple parallel reasoning paths that are merged to derive the final answer.
Our DIR method is simple yet effective and can be straightforwardly integrated with existing variants of CoT methods.
arXiv Detail & Related papers (2024-02-06T03:41:12Z) - Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models [62.96551299003463]
We propose textbftextitThought Propagation (TP) to enhance the complex reasoning ability of Large Language Models.
TP first prompts LLMs to propose and solve a set of analogous problems that are related to the input one.
TP reuses the results of analogous problems to directly yield a new solution or derive a knowledge-intensive plan for execution to amend the initial solution obtained from scratch.
arXiv Detail & Related papers (2023-10-06T01:40:09Z) - Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z) - Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations.
We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.