TRUE: A Trustworthy Unified Explanation Framework for Large Language Model Reasoning
- URL: http://arxiv.org/abs/2602.18905v1
- Date: Sat, 21 Feb 2026 17:00:54 GMT
- Title: TRUE: A Trustworthy Unified Explanation Framework for Large Language Model Reasoning
- Authors: Yujiao Yang,
- Abstract summary: Large language models (LLMs) have demonstrated strong capabilities in complex reasoning tasks, yet their decision-making processes remain difficult to interpret. We propose the Trustworthy Unified Explanation Framework (TRUE), which integrates executable reasoning verification, feasible-region directed acyclic graph (DAG) modeling, and causal failure mode analysis.
- Score: 0.2538209532048867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated strong capabilities in complex reasoning tasks, yet their decision-making processes remain difficult to interpret. Existing explanation methods often lack trustworthy structural insight and are limited to single-instance analysis, failing to reveal reasoning stability and systematic failure mechanisms. To address these limitations, we propose the Trustworthy Unified Explanation Framework (TRUE), which integrates executable reasoning verification, feasible-region directed acyclic graph (DAG) modeling, and causal failure mode analysis. At the instance level, we redefine reasoning traces as executable process specifications and introduce blind execution verification to assess operational validity. At the local structural level, we construct feasible-region DAGs via structure-consistent perturbations, enabling explicit characterization of reasoning stability and the executable region in the local input space. At the class level, we introduce a causal failure mode analysis method that identifies recurring structural failure patterns and quantifies their causal influence using Shapley values. Extensive experiments across multiple reasoning benchmarks demonstrate that the proposed framework provides multi-level, verifiable explanations, including executable reasoning structures for individual instances, feasible-region representations for neighboring inputs, and interpretable failure modes with quantified importance at the class level. These results establish a unified and principled paradigm for improving the interpretability and reliability of LLM reasoning systems.
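The abstract names Shapley values as the tool for quantifying the causal influence of class-level failure modes, but gives no implementation details. As a minimal sketch, assuming each instance is tagged with binary failure-mode indicators and taking the fraction of explained failures as the characteristic function (the mode names, the characteristic function, and all identifiers below are hypothetical, not from the paper), exact Shapley attribution could look like:

```python
from itertools import combinations
from math import factorial

# Hypothetical failure-mode tags; in the paper's setting these would be
# recurring structural failure patterns discovered from reasoning traces.
MODES = ["skipped_step", "wrong_operand", "unit_mismatch"]

def coalition_error_rate(active_modes, instances):
    """Characteristic function v(S): fraction of instances that failed
    and exhibit at least one mode in the active coalition S.
    `instances` is a list of (modes_present, failed) pairs."""
    active = set(active_modes)
    hits = sum(1 for present, failed in instances if failed and present & active)
    return hits / len(instances)

def shapley_values(modes, instances):
    """Exact Shapley value of each mode under v = coalition_error_rate."""
    n = len(modes)
    phi = {m: 0.0 for m in modes}
    for m in modes:
        others = [x for x in modes if x != m]
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                marginal = (coalition_error_rate(set(subset) | {m}, instances)
                            - coalition_error_rate(subset, instances))
                phi[m] += weight * marginal
    return phi

# Toy usage: (modes observed in the trace, whether the instance failed).
data = [({"skipped_step"}, True),
        ({"wrong_operand", "unit_mismatch"}, True),
        ({"unit_mismatch"}, False)]
print(shapley_values(MODES, data))
```

Exact enumeration is exponential in the number of modes, so beyond a handful of failure modes a real implementation would presumably switch to Monte Carlo sampling over permutations.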
Related papers
- X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes [11.988348978958376]
Large language models (LLMs) achieve promising performance, yet their ability to reason remains poorly understood. We present X-Ray, an explainable reasoning analysis system that maps the LLM reasoning capability using calibrated, formally verified probes. We evaluate state-of-the-art LLMs on problems ranging from junior-level to advanced in mathematics, physics, and chemistry.
arXiv Detail & Related papers (2026-03-05T15:34:22Z)
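The X-RAY entry above describes its probes as calibrated without naming a metric. As one standard notion of calibration (an assumption here, not a detail confirmed by the paper), expected calibration error measures the gap between a probe's stated confidence and its empirical accuracy across confidence bins:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """One common ECE formulation: the bin-weight-averaged gap between
    mean accuracy and mean confidence within equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index per prediction; clip keeps confidence == 1.0 in the top bin.
    ids = np.clip(np.digitize(confidences, edges) - 1, 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

# Toy usage: probe confidences vs. whether the probed claim actually held.
print(expected_calibration_error([0.9, 0.8, 0.6, 0.3], [1, 1, 0, 0]))
```

A well-calibrated probe drives this value toward zero; the paper may of course use a different calibration criterion.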
- Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models [57.42778606399764]
Diffusion language models (dLLMs) offer a promising, non-autoregressive paradigm for text generation. Current reinforcement learning approaches often rely on sparse, outcome-based rewards. We argue that this stems from a fundamental mismatch with the natural structure of reasoning.
arXiv Detail & Related papers (2025-10-02T00:34:15Z)
- Structuring Reasoning for Complex Rules Beyond Flat Representations [37.11501169845084]
We propose a novel framework inspired by expert human reasoning processes. The Dynamic Adjudication template (DAT) structures the inference mechanism into three methodical stages. DAT consistently outperforms conventional Chain-of-Thought (CoT) approaches in complex rule-based tasks.
arXiv Detail & Related papers (2025-10-01T04:10:13Z)
- Implicit Reasoning in Large Language Models: A Comprehensive Survey [67.53966514728383]
Large Language Models (LLMs) have demonstrated strong generalization across a wide range of tasks. Recent studies have shifted attention from explicit chain-of-thought prompting toward implicit reasoning. This survey introduces a taxonomy centered on execution paradigms, shifting the focus from representational forms to computational strategies.
arXiv Detail & Related papers (2025-09-02T14:16:02Z)
- Explainable Chain-of-Thought Reasoning: An Empirical Analysis on State-Aware Reasoning Dynamics [69.00587226225232]
We introduce a state-aware transition framework that abstracts CoT trajectories into structured latent dynamics. To characterize the global structure of reasoning, we model their progression as a Markov chain (see the sketch after this entry). This abstraction supports a range of analyses, including semantic role identification, temporal pattern visualization, and consistency evaluation.
arXiv Detail & Related papers (2025-08-29T18:53:31Z)
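The state-aware entry above models CoT progression as a Markov chain. The abstract does not say how the chain is estimated, but the usual empirical construction, sketched here with hand-picked semantic-role labels standing in for the paper's learned latent states, simply row-normalizes transition counts:

```python
import numpy as np

# Assumed step labels for illustration only; the paper abstracts CoT
# trajectories into learned latent states, which these do not reproduce.
STATES = ["restate", "derive", "verify", "answer"]
IDX = {s: i for i, s in enumerate(STATES)}

def transition_matrix(trajectories):
    """Row-normalized empirical transition counts over labeled CoT steps."""
    counts = np.zeros((len(STATES), len(STATES)))
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            counts[IDX[a], IDX[b]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Leave rows for unvisited states as zeros rather than dividing by zero.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Toy usage: two trajectories of per-step labels.
trajs = [["restate", "derive", "derive", "verify", "answer"],
         ["restate", "derive", "answer"]]
print(transition_matrix(trajs).round(2))
```

Inspecting such a matrix supports the analyses the entry mentions: dominant transitions expose temporal patterns, and high-entropy rows flag unstable reasoning roles.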
- CTRLS: Chain-of-Thought Reasoning via Latent State-Transition [57.51370433303236]
Chain-of-thought (CoT) reasoning enables large language models to break down complex problems into interpretable intermediate steps. We introduce CTRLS, a framework that formulates CoT reasoning as a Markov decision process (MDP) with latent state transitions. We show improvements in reasoning accuracy, diversity, and exploration efficiency across benchmark reasoning tasks.
arXiv Detail & Related papers (2025-07-10T21:32:18Z)
- On the Eligibility of LLMs for Counterfactual Reasoning: A Decompositional Study [15.617243755155686]
Counterfactual reasoning has emerged as a crucial technique for generalizing the reasoning capabilities of large language models. We propose a decompositional strategy that breaks counterfactual generation down into stages, from causality construction to reasoning over counterfactual interventions.
arXiv Detail & Related papers (2025-05-17T04:59:32Z)
- Structured Prompting and Feedback-Guided Reasoning with LLMs for Data Interpretation [0.0]
Large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding and task generalization. This paper introduces the STROT Framework, a method for structured prompting and feedback-driven transformation logic generation.
arXiv Detail & Related papers (2025-05-03T00:05:01Z)
- The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning [56.574829311863446]
Chain-of-Thought (CoT) prompting has been widely recognized for its ability to enhance reasoning capabilities in large language models (LLMs). We demonstrate that CoT and its reasoning variants consistently underperform direct answering across varying model scales and benchmark complexities. Our analysis uncovers a fundamental hybrid mechanism of explicit-implicit reasoning driving CoT's performance in pattern-based ICL.
arXiv Detail & Related papers (2025-04-07T13:51:06Z)
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)