Related papers: Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation

Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation

URL: http://arxiv.org/abs/2602.14469v1
Date: Mon, 16 Feb 2026 05:13:06 GMT
Title: Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation
Authors: Guangyue Peng, Zongchao Chen, Wen Luo, Yuntao Wen, Wei Li, Ruixiang Feng, Ran Le, Chen Yang, Zhenwei An, Yang Song, Tao Zhang, Houfeng Wang,
Abstract summary: We formalize the phenomenon through a three-level measurement hierarchy.<n>We analyze semantic suppression, the intuitive mitigation strategy that instructs models to ignore the answer.<n>We propose Structural Skeleton-guided Reasoning (SSR), a two-phase approach that first generates an answer-invariant functional skeleton structure.
Score: 27.571918867764932
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reverse Chain-of-Thought Generation (RCG) synthesizes reasoning traces from query-answer pairs, but runs the risk of producing post-hoc rationalizations: when models can see the answer during generation, the answer serves as a cognitive anchor that shapes the entire explanation. We formalize this phenomenon through a three-level measurement hierarchy: lexical, entropic, and probabilistic anchoring, each captures surface artifacts, entropy dynamics, and latent answer dependence, respectively. We analyze semantic suppression, the intuitive mitigation strategy that instructs models to ignore the answer, to find out its counterproduction: while it reduces lexical overlap, it paradoxically increases entropic and probabilistic anchoring. Drawing on Ironic Process Theory from cognitive psychology, we attribute this failure to active monitoring of the forbidden answer, which inadvertently deepens dependence on it. To break this cycle, we propose Structural Skeleton-guided Reasoning (SSR), a two-phase approach that first generates an answer-invariant functional skeleton structure, then uses this skeleton to guide full trace generation. By redirecting the information flow to structural planning rather than answer monitoring, SSR consistently reduces anchoring across all three levels. We further introduce Distilled SSR (SSR-D), which fine-tunes models on teacher-generated SSR traces to ensure reliable structural adherence. Experiments across open-ended reasoning benchmarks demonstrate that SSR-D achieves up to 10% improvement over suppression baselines while preserving out-of-distribution (OOD) generalization.

Related papers

Recursive Think-Answer Process for LLMs and VLMs [54.52289112197118]
We propose an efficient Recursive Think-Answer Process (R-TAP)<n>R-TAP enables models to engage in iterative reasoning cycles and generate more accurate answers.<n>We show that R-TAP-enhanced models consistently outperform conventional single-pass methods.
arXiv Detail & Related papers (2026-03-02T17:20:10Z)
ReBeCA: Unveiling Interpretable Behavior Hierarchy behind the Iterative Self-Reflection of Language Models with Causal Analysis [35.12196884025294]
We introduce textbftextttReBeCA (self-textbftexttReflection textbftextttBehavior explained through textbftextttBehavior), a framework that unveils the interpretable behavioral hierarchy governing the self-reflection outcome.<n>By modeling self-reflection trajectories as causal graphs, ReBeCA isolates genuine determinants of performance.
arXiv Detail & Related papers (2026-02-06T04:00:57Z)
Is my model "mind blurting"? Interpreting the dynamics of reasoning tokens with Recurrence Quantification Analysis (RQA) [1.593065406609169]
We propose Recurrence Quantification Analysis (RQA) as a non-textual alternative for analysing model's reasoning chains at test time.<n>RQA captures signals not reflected by response length, but also substantially improves prediction of task complexity by 8%.
arXiv Detail & Related papers (2026-02-05T23:48:23Z)
APR: Penalizing Structural Redundancy in Large Reasoning Models via Anchor-based Process Rewards [61.52322047892064]
Test-Time Scaling (TTS) has significantly enhanced the capabilities of Large Reasoning Models (LRMs)<n>We observe that LRMs frequently conduct repetitive self-verification without revision even after obtaining the final answer during the reasoning process.<n>We propose Anchor-based Process Reward (APR), a structure-aware reward shaping method that localizes the reasoning anchor and penalizes exclusively the post-anchor AST.
arXiv Detail & Related papers (2026-01-31T14:53:20Z)
Structured Reasoning for Large Language Models [59.215789462977206]
We propose Structured Reasoning (SCR), a framework that decouples reasoning trajectories into explicit, evaluable, and trainable components.<n>SCR substantially improves reasoning efficiency and self-verification.<n>Compared with existing reasoning paradigms, it reduces output token length by up to 50%.
arXiv Detail & Related papers (2026-01-12T04:04:01Z)
Consistency Is Not Always Correct: Towards Understanding the Role of Exploration in Post-Training Reasoning [75.79451512757844]
Foundation models exhibit broad knowledge but limited task-specific reasoning.<n> RLVR and inference scaling motivate post-training strategies such as RLVR and inference scaling.<n>We show that RLVR induces a squeezing effect, reducing reasoning entropy and forgetting some correct paths.
arXiv Detail & Related papers (2025-11-10T18:25:26Z)
From Reasoning to Answer: Empirical, Attention-Based and Mechanistic Insights into Distilled DeepSeek R1 Models [48.01707022738742]
We conduct a three-stage investigation into the interplay between reasoning and answer generation in three distilled DeepSeek R1 models.<n>We demonstrate that including explicit reasoning consistently improves answer quality across diverse domains.<n>Our results show that perturbations to key reasoning tokens can reliably alter the final answers.
arXiv Detail & Related papers (2025-09-28T06:32:21Z)
How do Transformers Learn Implicit Reasoning? [67.02072851088637]
We study how implicit multi-hop reasoning emerges by training transformers from scratch in a controlled symbolic environment.<n>We find that training with atomic triples is not necessary but accelerates learning, and that second-hop generalization relies on query-level exposure to specific compositional structures.
arXiv Detail & Related papers (2025-05-29T17:02:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.