Related papers: Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners

URL: http://arxiv.org/abs/2405.18915v1
Date: Wed, 29 May 2024 09:17:46 GMT
Title: Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners
Authors: Jiachun Li, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao
Abstract summary: Large language models (LLMs) suffer from serious unfaithful chain-of-thought (CoT) issues. We first study the CoT faithfulness issue at the granularity of CoT steps and identify two reasoning paradigms. We then conduct a joint analysis of the causal relevance among the context, CoT, and answer during reasoning.
Score: 19.40385041079461
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) suffer from serious unfaithful chain-of-thought (CoT) issues. Previous work attempts to measure and explain it but lacks in-depth analysis within CoTs and does not consider the interactions among all reasoning components jointly. In this paper, we first study the CoT faithfulness issue at the granularity of CoT steps, identify two reasoning paradigms: centralized reasoning and distributed reasoning, and find their relationship with faithfulness. Subsequently, we conduct a joint analysis of the causal relevance among the context, CoT, and answer during reasoning. The result proves that, when the LLM predicts answers, it can recall correct information missing in the CoT from the context, leading to unfaithfulness issues. Finally, we propose the inferential bridging method to mitigate this issue, in which we use the attribution method to recall information as hints for CoT generation and filter out noisy CoTs based on their semantic consistency and attribution scores. Extensive experiments demonstrate that our approach effectively alleviates the unfaithful CoT problem.

Related papers

Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models [9.282278040339138]
The Chain-of-Thought (CoT) technique has proven effective in improving the performance of large language models (LLMs) on complex reasoning tasks. We make a preliminary observation that the monotonicity of token probability distributions may be correlated with the gains achieved through CoT reasoning. We propose two indicators based on the token probability distribution to assess CoT effectiveness across different tasks.
arXiv Detail & Related papers (2025-06-06T11:53:27Z)
Data Fusion for Partial Identification of Causal Effects [62.56890808004615]
We propose a novel partial identification framework that enables researchers to answer key questions: Is the causal effect positive or negative? How severe must assumption violations be to overturn this conclusion? We apply our framework to the Project STAR study, which investigates the effect of classroom size on students' third-grade standardized test performance.
arXiv Detail & Related papers (2025-05-30T07:13:01Z)
Rewarding Curse: Analyze and Mitigate Reward Modeling Issues for LLM Reasoning [17.6082037230676]
Chain-of-thought (CoT) prompting demonstrates varying performance under different reasoning tasks. Previous work attempts to evaluate it but falls short in providing an in-depth analysis of patterns that influence the CoT. We identify key factors that influence CoT effectiveness on performance improvement, including problem difficulty, information gain, and information flow.
arXiv Detail & Related papers (2025-03-07T07:20:24Z)
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning [55.52872152909785]
Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs). We show that CoT gives strong performance benefits primarily on tasks involving math or logic, with much smaller gains on other types of tasks.
arXiv Detail & Related papers (2024-09-18T17:55:00Z)
Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems. We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning [48.51969964676017]
Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance for large language models. We propose a Read-and-Control approach for controlling the accuracy of CoT.
arXiv Detail & Related papers (2024-06-18T04:07:13Z)
Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering [59.495717939664246]
Large language models have manifested remarkable capabilities by leveraging chain-of-thought (CoT) reasoning techniques to solve intricate questions. We propose a novel approach called the selective filtering reasoner (SelF-Reasoner) that assesses the entailment relationship between the question and the candidate reasoning chain. SelF-Reasoner improves the fine-tuned T5 baseline consistently over the ScienceQA, ECQA, and LastLetter tasks.
arXiv Detail & Related papers (2024-03-28T06:28:35Z)
ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs). Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts. We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z)
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning [21.951313919964484]
Large language models exhibit high-level commonsense reasoning abilities. CoT-like methods lead to a considerable number of originally correct answers turning wrong. We use attribution tracing and causal tracing methods to probe the internal working mechanism of the model.
arXiv Detail & Related papers (2024-02-28T14:09:02Z)
Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning [10.80973695116047]
Knowledge tracing plays a crucial role in computer-aided education and intelligent tutoring systems. Current approaches have explored psychological influences to achieve more explainable predictions. We propose RCKT, a novel response influence-based counterfactual knowledge tracing framework.
arXiv Detail & Related papers (2023-12-01T11:27:08Z)
Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present DisTIB (Transmitted Information Bottleneck for Disentangled representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z)
Towards Better Chain-of-Thought Prompting Strategies: A Survey [60.75420407216108]
Chain-of-Thought (CoT) shows impressive strength when used as a prompting strategy for large language models (LLMs). In recent years, the prominent effect of CoT prompting has attracted emerging research. This survey could provide an overall reference on related research.
arXiv Detail & Related papers (2023-10-08T01:16:55Z)
Stress Testing Chain-of-Thought Prompting for Large Language Models [0.16317061277456998]
This report examines the effectiveness of Chain-of-Thought (CoT) prompting in improving the multi-step reasoning abilities of large language models (LLMs). We analyze the impact of three types of CoT prompt perturbations, namely CoT order, CoT values, and CoT operators, on the performance of GPT-3 on various tasks.
arXiv Detail & Related papers (2023-09-28T17:21:33Z)
Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering [17.672572064705445]
Large language models (LLMs) equipped with Chain-of-Thought (CoT) have shown impressive reasoning ability in various downstream tasks. We propose a framework called Knowledge-Driven Chain-of-Thought (KD-CoT) to verify and modify reasoning traces in CoT via interaction with external knowledge.
arXiv Detail & Related papers (2023-08-25T09:23:55Z)
Measuring Faithfulness in Chain-of-Thought Reasoning [19.074147845029355]
Large language models (LLMs) perform better when they produce step-by-step, "Chain-of-Thought" (CoT) reasoning before answering a question. It is unclear if the stated reasoning is a faithful explanation of the model's actual reasoning (i.e., its process for answering the question). We investigate hypotheses for how CoT reasoning may be unfaithful, by examining how the model predictions change when we intervene on the CoT.
arXiv Detail & Related papers (2023-07-17T01:08:39Z)
Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters [82.84696222087396]
Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs). We show that CoT reasoning is possible even with invalid demonstrations.
arXiv Detail & Related papers (2022-12-20T05:20:54Z)
SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction. Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.