Boosting Logical Reasoning in Large Language Models through a New
Framework: The Graph of Thought
- URL: http://arxiv.org/abs/2308.08614v1
- Date: Wed, 16 Aug 2023 18:13:27 GMT
- Title: Boosting Logical Reasoning in Large Language Models through a New
Framework: The Graph of Thought
- Authors: Bin Lei, Pei-Hung Lin, Chunhua Liao, Caiwen Ding
- Abstract summary: Our paper unveils a pioneering prompting technique, dubbed \textit{Graph of Thoughts (GoT)}.
Our method outperformed GPT-4, achieving accuracy improvements of $89.7\%$, $86\%$, and $56\%$ on each respective task.
When juxtaposed with the state-of-the-art prompting method, \textit{Tree of Thought (ToT)}, our approach registered average accuracy boosts of $23\%$, $24\%$, and $15\%$.
- Score: 7.356034193515096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in large-scale models, such as GPT-4, have showcased
remarkable capabilities in addressing standard queries. However, when facing
complex problems that require multi-step logical reasoning, their accuracy
dramatically decreases. Current research has explored the realm of
\textit{prompting engineering} to bolster the inferential capacities of these
models. Our paper unveils a pioneering prompting technique, dubbed
\textit{Graph of Thoughts (GoT)}. Through testing on a trio of escalating
challenges: the 24-point game, resolution of high-degree polynomial equations,
and derivation of formulas for recursive sequences, our method outperformed
GPT-4, achieving accuracy improvements of $89.7\%$, $86\%$, and $56\%$ for each
respective task. Moreover, when juxtaposed with the state-of-the-art (SOTA)
prompting method, \textit{Tree of Thought (ToT)}, our approach registered an
average accuracy boost of $23\%$, $24\%$, and $15\%$.
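The 24-point game benchmark is concrete enough to illustrate why a graph, rather than a tree, of intermediate thoughts helps: the same intermediate state can be reached along different operation orders, so merging duplicate states prunes the search. The sketch below is a plain state-space search over multisets of remaining numbers, not the paper's LLM-prompting pipeline:

```python
from itertools import combinations
from fractions import Fraction

def solve24(nums, target=24):
    """Search the graph of intermediate states of the 24-point game.

    Each node is a multiset of remaining values; each edge applies one
    arithmetic operation to a pair. The same node can be reached along
    different paths, so we memoize visited nodes -- the graph structure
    that the GoT framing exploits.
    """
    start = tuple(sorted(Fraction(n) for n in nums))
    seen = set()

    def expand(state):
        if state in seen:               # duplicate node: already explored
            return False
        seen.add(state)
        if len(state) == 1:
            return state[0] == target
        for i, j in combinations(range(len(state)), 2):
            a, b = state[i], state[j]
            rest = [state[k] for k in range(len(state)) if k not in (i, j)]
            results = {a + b, a - b, b - a, a * b}
            if b != 0:
                results.add(a / b)
            if a != 0:
                results.add(b / a)
            for r in results:
                if expand(tuple(sorted(rest + [r]))):
                    return True
        return False

    return expand(start)
```

Memoizing `seen` states is the graph-structured step; a pure tree search would re-expand every duplicate state it reaches.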
Related papers
- Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models [3.207886496235499]
We study the process through which reasoning models trained with reinforcement learning on verifiable rewards (RLVR) learn to solve new problems. We find that RLVR drives performance in two main ways: (1) by compressing pass@$k$ into pass@1 and (2) via "capability gain", in which models learn to solve new problems that they previously could not solve even at high $k$.
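For reference, the pass@$k$ quantity in this summary is usually computed with the standard unbiased estimator over $n$ attempts of which $c$ are correct; the formula below is the conventional one, not specific to this paper:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: the probability that at least one of
    k samples, drawn without replacement from n attempts of which c are
    correct, passes. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct attempt
    return 1.0 - comb(n - c, k) / comb(n, k)
```

"Compressing pass@$k$ into pass@1" then means raising `pass_at_k(n, c, 1)` toward the model's previous `pass_at_k(n, c, k)` for large `k`.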
arXiv Detail & Related papers (2025-06-16T19:03:06Z)
- First Finish Search: Efficient Test-Time Scaling in Large Language Models [20.62274005080048]
First Finish Search (FFS) is a training-free parallel decoding strategy that launches $n$ independent samples and returns as soon as any one completes. FFS achieves $82.23\%$ accuracy on the AIME datasets, a $15\%$ improvement over DeepSeek-R1's standalone accuracy, nearly matching OpenAI's o4-mini performance.
arXiv Detail & Related papers (2025-05-23T17:57:43Z)
- From Continual Learning to SGD and Back: Better Rates for Continual Linear Models [50.11453013647086]
We analyze forgetting, i.e., the loss on previously seen tasks, after $k$ iterations. We develop novel last-iterate upper bounds in the realizable least squares setup. We prove for the first time that randomization alone, with no task repetition, can prevent catastrophic forgetting in sufficiently long task sequences.
arXiv Detail & Related papers (2025-04-06T18:39:45Z)
- Boosting Multimodal Reasoning with Automated Structured Thinking [24.845193791363346]
AStar is a lightweight library of high-level reasoning patterns abstracted from 500 prior samples using Monte Carlo Tree Search. For each test problem, AStar adaptively retrieves the optimal thought cards and seamlessly integrates these external explicit guidelines with the model's internal implicit reasoning capabilities.
arXiv Detail & Related papers (2025-02-04T14:18:29Z)
- On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis [22.641550077885686]
We analyze the computational limits and efficiency criteria of Visual Autoregressive ($\mathsf{VAR}$) models.
We prove that, assuming the Strong Exponential Time Hypothesis ($\mathsf{SETH}$) from fine-grained complexity theory, a sub-quartic time algorithm for $\mathsf{VAR}$ models is impossible.
Our technique sheds light on advancing scalable and efficient image generation in $\mathsf{VAR}$ frameworks.
arXiv Detail & Related papers (2025-01-08T09:34:15Z)
- Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams [48.99818550820575]
We leverage state-of-the-art multi-modal AI models, in particular GPT-4o, to automatically grade handwritten responses to college-level math exams.
Using real student responses to questions in a probability theory exam, we evaluate GPT-4o's alignment with ground-truth scores from human graders using various prompting techniques.
arXiv Detail & Related papers (2024-11-07T22:51:47Z)
- Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning [29.39584492735953]
We identify representation collapse in the model's intermediate layers as a key factor limiting their reasoning capabilities.
We propose Sequential Variance-Covariance Regularization (Seq-VCR), which enhances the entropy of intermediate representations and prevents collapse.
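The variance-covariance idea can be illustrated numerically. The toy penalty below is a simplification in the spirit of Seq-VCR, not the paper's exact loss: it rewards per-feature variance and decorrelated features in a batch of intermediate representations, so it is large precisely when representations collapse:

```python
import numpy as np

def variance_covariance_reg(h, eps=1e-4):
    """Toy variance-covariance penalty on representations h of shape
    [tokens, dim]. Collapsed (near-identical) representations incur a
    large variance term; correlated features incur a covariance term."""
    h = h - h.mean(axis=0)
    std = np.sqrt(h.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, 1.0 - std))   # push per-dim std toward >= 1
    cov = (h.T @ h) / (h.shape[0] - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / h.shape[1]    # penalize cross-feature covariance
    return var_loss + cov_loss
```

A collapsed batch (all rows identical) scores far higher than a batch of diverse representations, which is the behavior a regularizer against collapse needs.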
arXiv Detail & Related papers (2024-11-04T18:14:07Z)
- FLARE: Faithful Logic-Aided Reasoning and Exploration [50.9814063216852]
We introduce a novel approach for traversing the problem space using task decompositions.
We use Large Language Models to plan a solution and soft-formalise the query into facts and predicates using logic programming code.
Our method allows us to compute the faithfulness of the reasoning process w.r.t. the generated code and analyse the steps of the multi-hop search without relying on external solvers.
arXiv Detail & Related papers (2024-10-14T19:39:11Z)
- Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models [22.425339110551743]
We introduce \textit{weak-to-strong search}, framing the alignment of a large language model as a test-time greedy search.
In controlled-sentiment generation and summarization, we use tuned and untuned \texttt{gpt2}s to improve the alignment of large models without additional training.
In a more difficult instruction-following benchmark, we show that reusing off-the-shelf small models can improve the length-controlled win rates of both white-box and black-box large models.
arXiv Detail & Related papers (2024-05-29T16:55:32Z)
- DGoT: Dynamic Graph of Thoughts for Scientific Abstract Generation [4.404836880890741]
We propose a Dynamic Graph of Thought (DGoT) to solve the task of generating scientific paper abstracts.
Our method's cost-effectiveness in abstract generation tasks is only 43.7% to 56.4% of other multi-round query prompt approaches.
arXiv Detail & Related papers (2024-03-26T08:47:23Z)
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models [52.31950122881687]
We introduce a new framework for language model inference, Tree of Thoughts (ToT).
ToT generalizes over the popular Chain of Thought approach to prompting language models.
Our experiments show that ToT significantly enhances language models' problem-solving abilities.
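At its core, ToT reduces to a breadth-limited search over partial "thoughts". In the toy sketch below, `propose` and `score` stand in for LLM-sampled thought candidates and LLM self-evaluation; the real method prompts a model for both:

```python
def tree_of_thoughts(root, propose, score, width=3, depth=2):
    """Minimal BFS-style ToT sketch: at each level, expand every kept
    partial thought with candidate next steps, score the candidates,
    and keep only the best `width` -- deliberate search in place of a
    single left-to-right chain of thought."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for thought in frontier for t in propose(thought)]
        frontier = sorted(candidates, key=score, reverse=True)[:width]
    return max(frontier, key=score)
```

With `width=1` this degenerates to greedy chain-of-thought; widening the frontier is what lets the search recover from locally poor thoughts.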
arXiv Detail & Related papers (2023-05-17T23:16:17Z)
- Progressive-Hint Prompting Improves Reasoning in Large Language Models [63.98629132836499]
This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP).
It enables automatic multiple interactions between users and Large Language Models (LLMs) by using previously generated answers as hints to progressively guide toward the correct answers.
We conducted extensive and comprehensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient.
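The PHP interaction loop can be sketched in a few lines. Here `ask` stands in for an LLM call; previous answers are fed back as hints, and the loop stops once two consecutive answers agree (a simplified stopping rule for illustration):

```python
def progressive_hint(ask, question, max_rounds=4):
    """Sketch of Progressive-Hint Prompting: re-ask the question with
    all previously generated answers appended as hints, returning once
    the answer stabilizes across two consecutive rounds."""
    hints, answer = [], None
    for _ in range(max_rounds):
        prompt = question
        if hints:
            prompt += " (Hint: the answer is near " + ", ".join(hints) + ".)"
        new_answer = ask(prompt)
        if new_answer == answer:    # two consecutive identical answers
            return new_answer
        answer = new_answer
        hints.append(str(answer))
    return answer
```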
arXiv Detail & Related papers (2023-04-19T16:29:48Z)
- Reframing Instructional Prompts to GPTk's Language [72.69833640335519]
We propose reframing techniques for model designers to create effective prompts for language models.
Our results show that reframing improves few-shot learning performance by 14% while reducing sample complexity.
The performance gains are particularly important on large language models, such as GPT3 where tuning models or prompts on large datasets is not feasible.
arXiv Detail & Related papers (2021-09-16T09:44:43Z)
- Improving Robustness and Generality of NLP Models Using Disentangled Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.