DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs
- URL: http://arxiv.org/abs/2601.03559v1
- Date: Wed, 07 Jan 2026 03:58:42 GMT
- Title: DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs
- Authors: Shidong Cao, Hongzhan Lin, Yuxuan Gu, Ziyang Luo, Jing Ma
- Abstract summary: Chain-of-Thought (CoT) reasoning improves multi-step mathematical problem solving in large language models. We propose DiffCoT, a diffusion-styled CoT framework that reformulates CoT reasoning as an iterative denoising process.
- Score: 27.185334200898623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chain-of-Thought (CoT) reasoning improves multi-step mathematical problem solving in large language models but remains vulnerable to exposure bias and error accumulation, as early mistakes propagate irreversibly through autoregressive decoding. In this work, we propose DiffCoT, a diffusion-styled CoT framework that reformulates CoT reasoning as an iterative denoising process. DiffCoT integrates diffusion principles at the reasoning-step level via a sliding-window mechanism, enabling unified generation and retrospective correction of intermediate steps while preserving token-level autoregression. To maintain causal consistency, we further introduce a causal diffusion noise schedule that respects the temporal structure of reasoning chains. Extensive experiments on three multi-step CoT reasoning benchmarks across diverse model backbones demonstrate that DiffCoT consistently outperforms existing CoT preference optimization methods, yielding improved robustness and error-correction capability in CoT reasoning.
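To make the mechanism concrete, here is a minimal, hedged Python sketch of a DiffCoT-style refinement loop as we read the abstract: a sliding window passes over the reasoning steps, steps inside the window are masked according to a causal noise schedule (later steps are perturbed more), and a model regenerates them conditioned on the untouched prefix. `llm_rewrite`, `causal_noise_schedule`, and all parameters are illustrative assumptions, not the paper's published interface or training objective.

```python
# Hedged toy sketch of a DiffCoT-style sliding-window refinement loop.
# `llm_rewrite` is a hypothetical stand-in for an autoregressive LLM call;
# the paper's actual model interface and objective are not reproduced here.
import random
from typing import Callable, List

def causal_noise_schedule(size: int, t: float) -> List[float]:
    # Later steps in the window receive higher noise levels, so earlier
    # (more settled) steps are revised less aggressively.
    return [t * (k + 1) / size for k in range(size)]

def diffcot_refine(steps: List[str],
                   llm_rewrite: Callable[[List[str], List[str]], List[str]],
                   rounds: int = 3, window_size: int = 2,
                   seed: int = 0) -> List[str]:
    rng = random.Random(seed)
    for r in range(rounds):
        t = 1.0 - r / rounds  # anneal the noise level toward zero
        for start in range(len(steps) - window_size + 1):
            noise = causal_noise_schedule(window_size, t)
            window = steps[start:start + window_size]
            # Mask steps with probability given by the causal schedule.
            masked = [s if rng.random() > z else "<MASK>"
                      for s, z in zip(window, noise)]
            # Regenerate the window conditioned on the clean prefix;
            # token-level autoregression would live inside this call.
            steps[start:start + window_size] = llm_rewrite(steps[:start], masked)
    return steps

def toy_rewrite(prefix: List[str], masked: List[str]) -> List[str]:
    # Stand-in for the LLM: fills masked slots with a placeholder revision.
    return [m if m != "<MASK>" else "(regenerated step)" for m in masked]

print(diffcot_refine(["Let x = 3.", "Then 2x = 6.", "So the answer is 6."],
                     toy_rewrite))
```

In the actual method the model would be trained to denoise masked steps; the stub above only shows the control flow of unified generation and retrospective correction.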
Related papers
- ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought [49.203970812338916]
Explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing reasoning processes into latent space. We propose Rendered CoT-Guided variational Latent Reasoning (ReGuLaR).
arXiv Detail & Related papers (2026-01-30T17:08:06Z)
- Breaking the Bottlenecks: Scalable Diffusion Models for 3D Molecular Generation [0.0]
Diffusion models have emerged as a powerful class of generative models for molecular design. Their use remains constrained by long sampling trajectories, variance in the reverse process, and limited structural awareness in denoising dynamics. The Directly Denoising Diffusion Model mitigates these inefficiencies by replacing reverse MCMC updates with a deterministic denoising step. A hedged sketch of the deterministic-versus-stochastic contrast follows this entry.
arXiv Detail & Related papers (2026-01-13T20:09:44Z)
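The snippet's key contrast, replacing stochastic reverse updates with a deterministic denoising step, can be illustrated with a generic DDIM-style update, where `eta = 0` yields the deterministic map. This is a standard construction, not the exact update rule of the paper's Directly Denoising Diffusion Model.

```python
# Generic DDIM-style reverse step: eta = 0 is deterministic (no per-step
# sampling noise), eta = 1 recovers DDPM-like stochastic sampling.
# This illustrates the contrast only; it is not the paper's DDDM update.
import numpy as np

def reverse_step(x_t, eps_pred, alpha_t, alpha_prev, eta=0.0, rng=None):
    # Estimate the clean sample from the current noise prediction.
    x0_hat = (x_t - np.sqrt(1.0 - alpha_t) * eps_pred) / np.sqrt(alpha_t)
    # Standard deviation of the injected noise; zero when eta = 0.
    sigma = eta * np.sqrt((1.0 - alpha_prev) / (1.0 - alpha_t)
                          * (1.0 - alpha_t / alpha_prev))
    direction = np.sqrt(max(1.0 - alpha_prev - sigma**2, 0.0)) * eps_pred
    noise = 0.0
    if eta > 0.0:
        rng = rng or np.random.default_rng()
        noise = sigma * rng.standard_normal(x_t.shape)
    return np.sqrt(alpha_prev) * x0_hat + direction + noise

rng = np.random.default_rng(0)
x_T = rng.standard_normal(4)
x_prev = reverse_step(x_T, eps_pred=0.1 * x_T, alpha_t=0.5, alpha_prev=0.8)
```

Removing the noise term eliminates per-step sampling variance, which is one way shorter, more stable sampling trajectories can be obtained.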
- EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation [18.606842425858]
Chain-of-Thought (CoT) prompting has significantly enhanced the mathematical reasoning capabilities of Large Language Models. Existing fine-tuning datasets frequently suffer from the "answer right but reasoning wrong" problem. This paper proposes EntroCoT, a unified framework for automatically identifying and refining low-quality CoT supervision traces. A toy sketch of entropy-guided segmentation follows this entry.
arXiv Detail & Related papers (2026-01-07T10:02:27Z)
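As a rough illustration of entropy-guided segmentation, the sketch below computes per-token predictive entropy and cuts a trace where entropy spikes above an adaptive mean-plus-k-sigma threshold. The threshold rule and all names are assumptions for illustration; EntroCoT's published criterion may differ.

```python
# Hedged sketch: split a CoT trace where per-token predictive entropy
# spikes. The adaptive mean + k*std threshold is an illustrative
# assumption, not EntroCoT's published segmentation criterion.
import numpy as np

def token_entropy(logits: np.ndarray) -> np.ndarray:
    # logits: (seq_len, vocab) -> entropy of the softmax at each position.
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def segment_by_entropy(tokens, logits, k=1.0):
    h = token_entropy(logits)
    cut = h > h.mean() + k * h.std()       # adaptive uncertainty threshold
    segments, current = [], []
    for tok, is_cut in zip(tokens, cut):
        current.append(tok)
        if is_cut:                          # close a segment at a spike
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments

rng = np.random.default_rng(0)
toks = ["First,", "add", "3", "and", "4.", "Then", "multiply", "by", "2."]
print(segment_by_entropy(toks, rng.standard_normal((len(toks), 50))))
```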
- Decoupling the Effect of Chain-of-Thought Reasoning: A Human Label Variation Perspective [60.45433515408158]
We show that long Chain-of-Thought (CoT) serves as a decisive decision-maker for the top option but fails to function as a granular distribution calibrator for ambiguous tasks. We observe a distinct "decoupled mechanism": while CoT improves distributional alignment, final accuracy is dictated by CoT content.
arXiv Detail & Related papers (2026-01-06T16:26:40Z)
- DAPS++: Rethinking Diffusion Inverse Problems with Decoupled Posterior Annealing [5.215481191227242]
We introduce DAPS++, which allows the likelihood term to guide inference more directly while maintaining numerical stability. DAPS++ achieves high computational efficiency and robust reconstruction performance across diverse image restoration tasks.
arXiv Detail & Related papers (2025-11-21T08:28:36Z)
- Think Consistently, Reason Efficiently: Energy-Based Calibration for Implicit Chain-of-Thought [33.267497114389734]
Large Language Models (LLMs) have demonstrated strong reasoning capabilities through Chain-of-Thought (CoT) prompting. CoT methods rely on discrete token-level reasoning processes that are prone to error propagation and limited by vocabulary. We propose EBM-CoT, an Energy-Based Chain-of-Thought framework that refines latent thought representations through an energy-based model. A toy sketch of energy-based latent refinement follows this entry.
arXiv Detail & Related papers (2025-11-10T14:10:58Z)
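A toy version of energy-based latent refinement: move a latent thought vector downhill on an energy surface by gradient descent. The quadratic energy below is a stand-in for the learned energy model the abstract describes, which is not reproduced here.

```python
# Toy sketch of energy-based latent refinement. EBM-CoT learns its energy
# model; the quadratic energy here is only an illustrative stand-in.
import numpy as np

def refine_latent(z, energy_grad, steps=20, lr=0.1):
    # Gradient descent on the energy: lower energy = more consistent thought.
    for _ in range(steps):
        z = z - lr * energy_grad(z)
    return z

# Stand-in energy E(z) = 0.5 * ||z - mu||^2, with gradient (z - mu);
# mu plays the role of a "consistent thought" the EBM prefers.
mu = np.ones(8)
z0 = np.random.default_rng(0).standard_normal(8)
z_star = refine_latent(z0, lambda z: z - mu)
print(np.linalg.norm(z_star - mu))  # close to 0 after refinement
```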
- ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation [74.37307916314407]
We propose a framework dubbed ConciseHint, which continuously encourages the reasoning model to speak concisely. Experiments on state-of-the-art LRMs, including the DeepSeek-R1 and Qwen-3 series, demonstrate that our method effectively produces concise reasoning. A minimal sketch of continuous hint injection follows this entry.
arXiv Detail & Related papers (2025-06-23T16:20:44Z)
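The "continuous hint" idea can be sketched as periodically splicing a brevity hint back into the decoding context. `generate_tokens` below is a hypothetical incremental-decoding stub, not ConciseHint's actual API, and the fixed injection interval is an assumption.

```python
# Minimal sketch of continuous hint injection during generation.
# `generate_tokens(context, n)` is a hypothetical stand-in for incremental
# LLM decoding; ConciseHint's real interface is not reproduced here.
def hinted_generate(generate_tokens, prompt, hint=" Be concise.",
                    interval=32, max_tokens=256):
    context, produced = prompt, []
    while len(produced) < max_tokens:
        chunk = generate_tokens(context, n=interval)
        produced.extend(chunk)
        if not chunk:                                # model stopped early
            break
        context = context + "".join(chunk) + hint   # re-inject the hint
    return "".join(produced)

def toy_generate(context, n):
    # Stand-in decoder: emits up to n filler tokens.
    return ["tok "] * n

out = hinted_generate(toy_generate, "Q: 2+2? Reason briefly.", max_tokens=64)
```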
- SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought [37.53215651690168]
Chain of Thought (CoT) prompting improves the reasoning performance of large language models (LLMs) by encouraging step-by-step thinking. While promising, CoT-based approaches often require costly pretraining and lack a principled framework for how reasoning should evolve. We propose SCOUT, a lightweight fine-tuning framework that enables Flow-CoT-style reasoning without the need for pretraining.
arXiv Detail & Related papers (2025-05-30T03:43:24Z)
- The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning [56.574829311863446]
Chain-of-Thought (CoT) prompting has been widely recognized for its ability to enhance reasoning capabilities in large language models (LLMs). We demonstrate that CoT and its reasoning variants consistently underperform direct answering across varying model scales and benchmark complexities. Our analysis uncovers a fundamental hybrid mechanism of explicit-implicit reasoning driving CoT's performance in pattern-based ICL.
arXiv Detail & Related papers (2025-04-07T13:51:06Z)
- Rethinking Chain-of-Thought from the Perspective of Self-Training [10.722453877596998]
Chain-of-thought (CoT) reasoning has emerged as an effective approach for activating latent capabilities in LLMs. We propose a novel CoT framework to improve reasoning performance. Our framework integrates two key components: (i) a task-specific prompt module that optimizes the initial reasoning process, and (ii) an adaptive reasoning module that dynamically refines the reasoning process.
arXiv Detail & Related papers (2024-12-14T13:12:50Z)
- Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
- Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs [63.36637269634553]
We introduce a novel approach where LLMs are fine-tuned to generate a sequence of Diverse Chains of Thought (DCoT) within a single inference step. We show that fine-tuning on DCoT improves performance over the CoT baseline across model families and scales. Our work is also significant because both quantitative analyses and manual evaluations reveal that the observed gains stem from the models' ability to refine an initial reasoning chain. An illustrative DCoT-style training-target construction follows this entry.
arXiv Detail & Related papers (2024-07-03T15:01:18Z)
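The within-inference refinement described above suggests training targets that pack several diverse chains into one sequence. Below is an illustrative construction; the delimiter format is an assumption, not the paper's exact template.

```python
# Illustrative construction of a DCoT-style training target: several
# diverse chains concatenated into one sequence, so the model learns to
# produce (and implicitly refine) multiple chains in a single inference
# pass. The [CoT i] delimiters are assumptions, not the paper's format.
def build_dcot_target(question, chains, answer):
    parts = [f"Q: {question}"]
    for i, chain in enumerate(chains, 1):
        parts.append(f"[CoT {i}] " + " ".join(chain))
    parts.append(f"A: {answer}")
    return "\n".join(parts)

example = build_dcot_target(
    "What is 17 * 6?",
    [["17*6 = 17*5 + 17 = 85 + 17 = 102."],
     ["17*6 = 10*6 + 7*6 = 60 + 42 = 102."]],
    "102",
)
print(example)
```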