Related papers: GraphReason: Enhancing Reasoning Capabilities of Large Language Models through A Graph-Based Verification Approach

URL: http://arxiv.org/abs/2308.09267v4
Date: Sun, 21 Apr 2024 01:45:34 GMT
Title: GraphReason: Enhancing Reasoning Capabilities of Large Language Models through A Graph-Based Verification Approach
Authors: Lang Cao
Abstract summary: Large Language Models (LLMs) have showcased impressive reasoning capabilities. In this paper, we introduce a novel graph-based method to further augment the reasoning capabilities of LLMs.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have showcased impressive reasoning capabilities, particularly when guided by specifically designed prompts in complex reasoning tasks such as math word problems. These models typically solve tasks using a chain-of-thought approach, which not only bolsters their reasoning abilities but also provides valuable insights into their problem-solving process. However, there is still significant room for enhancing the reasoning abilities of LLMs. Some studies suggest that the integration of an LLM output verifier can boost reasoning accuracy without necessitating additional model training. In this paper, we follow these studies and introduce a novel graph-based method to further augment the reasoning capabilities of LLMs. We posit that multiple solutions to a reasoning task, generated by an LLM, can be represented as a reasoning graph due to the logical connections between intermediate steps from different reasoning paths. Therefore, we propose the Reasoning Graph Verifier (GraphReason) to analyze and verify the solutions generated by LLMs. By evaluating these graphs, models can yield more accurate and reliable results. Our experimental results show that our graph-based verification method not only significantly enhances the reasoning abilities of LLMs but also outperforms existing verifier methods in terms of improving these models' reasoning performance.

Related papers

Learning to Reason: Training LLMs with GPT-OSS or DeepSeek R1 Reasoning Traces [2.0789230137053014]
Test-time scaling has enabled a new class of Large Language Models (LLMs) that are able to reason through complex problems. We compare the performance of medium-sized LLMs on Math problems after post-training on two kinds of reasoning traces.
arXiv Detail & Related papers (2025-11-24T17:26:58Z)
Mapping the Minds of LLMs: A Graph-Based Analysis of Reasoning LLM [11.181783720439563]
Large Language Models (LLMs) display sophisticated reasoning abilities via extended Chain-of-Thought (CoT) generation. RLMs often demonstrate counterintuitive and unstable behaviors, such as performance degradation under few-shot prompting. We introduce a unified graph-based analytical framework for better modeling the reasoning processes of RLMs.
arXiv Detail & Related papers (2025-05-20T03:54:57Z)
Learn to Think: Bootstrapping LLM Reasoning Capability Through Graph Representation Learning [19.75678229122211]
Large Language Models (LLMs) have achieved remarkable success across various domains. They still face significant challenges, including high computational costs for training and limitations in solving complex reasoning problems. We propose a novel framework that leverages graph learning to enable more flexible and adaptive reasoning capabilities.
arXiv Detail & Related papers (2025-05-09T02:51:22Z)
Guiding Reasoning in Small Language Models with LLM Assistance [23.3038074903744]
The suitability of Small Language Models (SLMs) for tasks demanding deep, multi-step logical deduction is in doubt. This paper introduces a framework called Small Reasons, Large Hints, which selectively augments SLM reasoning with targeted guidance from large language models. Our experiments on mathematical reasoning datasets demonstrate that targeted external scaffolding significantly improves performance.
arXiv Detail & Related papers (2025-04-14T06:32:45Z)
ReasonGraph: Visualisation of Reasoning Paths [28.906801344540458]
ReasonGraph is a web-based platform for visualizing and analyzing Large Language Models (LLMs) reasoning processes. It supports both sequential and tree-based reasoning methods while integrating with major LLM providers and over fifty state-of-the-art models.
arXiv Detail & Related papers (2025-03-06T00:03:55Z)
Rewarding Graph Reasoning Process makes LLMs more Generalized Reasoners [30.195361623027313]
Process Reward Models (PRMs) have demonstrated exceptional promise in enhancing reasoning by providing step-wise feedback. We introduce GraphSILO, the largest dataset for graph reasoning problems with fine-grained step-wise labels. We train GraphPRM, the first PRM designed for graph reasoning problems, and evaluate its effectiveness in two key settings.
arXiv Detail & Related papers (2025-03-02T10:39:40Z)
Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning [73.2950349728376]
Large language models (LLMs) have demonstrated remarkable success across a wide range of tasks. However, they still encounter challenges in reasoning tasks that require understanding and inferring relationships between pieces of information. This challenge is particularly pronounced in tasks involving multi-step processes, such as logical reasoning and multi-hop question answering. We propose Reasoning with Graphs (RwG) by first constructing explicit graphs from the context.
arXiv Detail & Related papers (2025-01-14T05:18:20Z)
Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning [52.83539473110143]
We introduce a novel structure-oriented analysis method to help Large Language Models (LLMs) better understand a question. To further improve the reliability in complex question-answering tasks, we propose a multi-agent reasoning system, Structure-oriented Autonomous Reasoning Agents (SARA). Extensive experiments verify the effectiveness of the proposed reasoning system. Surprisingly, in some cases, the system even surpasses few-shot methods.
arXiv Detail & Related papers (2024-10-18T05:30:33Z)
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data [53.433309883370974]
This work explores the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance Large Language Models' reasoning capabilities. Our experiments, conducted on two established natural language reasoning tasks, demonstrate that supervised fine-tuning with synthetic graph-based reasoning data effectively enhances LLMs' reasoning performance without compromising their effectiveness on other standard evaluation benchmarks.
arXiv Detail & Related papers (2024-09-19T03:39:09Z)
Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs [12.48241058167222]
Large Language Models (LLMs) have demonstrated remarkable efficiency in tackling various tasks based on human instructions. But studies reveal that they often struggle with tasks requiring reasoning, such as math or physics. This raises questions about whether LLMs truly comprehend embedded knowledge or merely learn to replicate the token distribution without a true understanding of the content. We propose Deconfounded Causal Adaptation (DCA), a novel parameter-efficient fine-tuning (PEFT) method to enhance the model's reasoning capabilities.
arXiv Detail & Related papers (2024-09-04T13:17:09Z)
Revisiting the Graph Reasoning Ability of Large Language Models: Case Studies in Translation, Connectivity and Shortest Path [53.71787069694794]
We focus on the graph reasoning ability of Large Language Models (LLMs). We revisit the ability of LLMs on three fundamental graph tasks: graph description translation, graph connectivity, and the shortest-path problem. Our findings suggest that LLMs can fail to understand graph structures through text descriptions and exhibit varying performance for all these fundamental tasks.
arXiv Detail & Related papers (2024-08-18T16:26:39Z)
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter? [36.14795256060537]
First, we develop GridPuzzle, an evaluation dataset comprising 274 grid-based puzzles with different complexities. Second, we propose a new error taxonomy derived from manual analysis of reasoning chains from LLMs including GPT-4, Claude-3, Gemini, Mistral, and Llama-2. Third, we develop an LLM-based framework for large-scale subjective evaluation (i.e., identifying errors) and an objective metric, PuzzleEval, to evaluate the correctness of reasoning chains.
arXiv Detail & Related papers (2024-07-20T07:43:07Z)
Can LLM Graph Reasoning Generalize beyond Pattern Memorization? [46.93972334344908]
We evaluate whether large language models (LLMs) can go beyond semantic, numeric, structural, and reasoning patterns in the synthetic training data and improve utility on real-world graph-based tasks. We find that while post-training alignment is most promising for real-world tasks, empowering LLM graph reasoning to go beyond pattern memorization remains an open research question.
arXiv Detail & Related papers (2024-06-23T02:59:15Z)
LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models [63.14196038655506]
We introduce LogicAsker, a novel approach for evaluating and enhancing the logical reasoning capabilities of large language models (LLMs). Our methodology reveals significant gaps in LLMs' learning of logical rules, with identified reasoning failures ranging from 29% to 90% across different models. We leverage these findings to construct targeted demonstration examples and fine-tune data, notably enhancing logical reasoning in models like GPT-4o by up to 5%.
arXiv Detail & Related papers (2024-01-01T13:53:53Z)
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning [73.77088902676306]
We take a closer look at the self-verification abilities of large language models (LLMs) in the context of logical reasoning. Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods.
arXiv Detail & Related papers (2023-11-14T07:13:10Z)
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.