EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation
- URL: http://arxiv.org/abs/2601.03769v1
- Date: Wed, 07 Jan 2026 10:02:27 GMT
- Title: EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation
- Authors: Zihang Li, Yuhang Wang, Yikun Zong, Wenhan Yu, Xiaokun Yuan, Runhan Jiang, Zirui Liu, Tong Yang, Arthur Jiang
- Abstract summary: Chain-of-Thought (CoT) prompting has significantly enhanced the mathematical reasoning capabilities of Large Language Models. Existing fine-tuning datasets frequently suffer from the "answer right but reasoning wrong" problem. This paper proposes EntroCoT, a unified framework for automatically identifying and refining low-quality CoT supervision traces.
- Score: 18.606842425858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chain-of-Thought (CoT) prompting has significantly enhanced the mathematical reasoning capabilities of Large Language Models. We find that existing fine-tuning datasets frequently suffer from the "answer right but reasoning wrong" problem, where correct final answers are derived from hallucinated, redundant, or logically invalid intermediate steps. This paper proposes EntroCoT, a unified framework for automatically identifying and refining low-quality CoT supervision traces. EntroCoT first proposes an entropy-based mechanism to segment the reasoning trace into multiple steps at uncertain junctures, and then introduces a Monte Carlo rollout-based mechanism to evaluate the marginal contribution of each step. By accurately filtering deceptive reasoning samples, EntroCoT constructs a high-quality dataset where every intermediate step in each reasoning trace facilitates the final answer. Extensive experiments on mathematical benchmarks demonstrate that fine-tuning on the subset constructed by EntroCoT consistently outperforms the baselines of full-dataset supervision.
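The abstract describes two mechanisms: segmenting a reasoning trace into steps at high-entropy (uncertain) junctures, and scoring each step's marginal contribution via Monte Carlo rollouts. The paper's actual algorithm is not reproduced here; the following is a rough Python sketch under stated assumptions, where per-token entropies are given externally and a hypothetical `rollout_success` oracle stands in for sampling continuations from an LLM:

```python
import math


def token_entropy(dist):
    """Shannon entropy of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def segment_by_entropy(tokens, entropies, threshold):
    """Split a reasoning trace into steps, ending a step at each
    token whose predictive entropy exceeds the threshold."""
    steps, current = [], []
    for tok, h in zip(tokens, entropies):
        current.append(tok)
        if h >= threshold:  # uncertain juncture -> step boundary
            steps.append(current)
            current = []
    if current:
        steps.append(current)
    return steps


def marginal_contribution(steps, rollout_success, n_rollouts=8):
    """Estimate each step's marginal value as the change in
    rollout success rate when the step is appended to the prefix.

    rollout_success(prefix_steps) -> bool is a hypothetical oracle:
    whether one rollout from this prefix reaches the correct answer.
    """
    def success_rate(prefix):
        return sum(bool(rollout_success(prefix))
                   for _ in range(n_rollouts)) / n_rollouts

    contribs = []
    for i in range(len(steps)):
        before = success_rate(steps[:i])
        after = success_rate(steps[:i + 1])
        contribs.append(after - before)
    return contribs
```

Under this sketch, steps with a non-positive estimated contribution would be the candidates for filtering as redundant or deceptive, leaving traces in which every retained step measurably helps reach the final answer.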
Related papers
- GHS-TDA: A Synergistic Reasoning Framework Integrating Global Hypothesis Space with Topological Data Analysis [27.271992201673083]
Chain-of-Thought (CoT) has been shown to significantly improve the reasoning accuracy of large language models (LLMs). Existing CoT methods suffer from two fundamental limitations.
arXiv Detail & Related papers (2026-02-10T14:00:30Z) - CoT-Seg: Rethinking Segmentation with Chain-of-Thought Reasoning and Self-Correction [50.67483317563736]
This paper aims to explore a system that can think step-by-step, look up information if needed, generate results, self-evaluate its own results, and refine the results. We introduce CoT-Seg, a training-free framework that rethinks reasoning segmentation by combining chain-of-thought reasoning with self-correction.
arXiv Detail & Related papers (2026-01-24T11:41:54Z) - DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs [27.185334200898623]
Chain-of-Thought (CoT) reasoning improves multi-step mathematical problem solving in large language models. We propose DiffCoT, a diffusion-styled CoT framework that reformulates CoT reasoning as an iterative denoising process.
arXiv Detail & Related papers (2026-01-07T03:58:42Z) - Decoupling the Effect of Chain-of-Thought Reasoning: A Human Label Variation Perspective [60.45433515408158]
We show that long Chain-of-Thought (CoT) serves as a decisive decision-maker for the top option but fails to function as a granular distribution calibrator for ambiguous tasks. We observe a distinct "decoupled mechanism": while CoT improves distributional alignment, final accuracy is dictated by CoT content.
arXiv Detail & Related papers (2026-01-06T16:26:40Z) - SIM-CoT: Supervised Implicit Chain-of-Thought [108.30049193668083]
Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models. We identify a core latent instability issue when scaling the computational budget of implicit CoT. We propose SIM-CoT, a plug-and-play training module that introduces step-level supervision to stabilize and enrich the latent reasoning space.
arXiv Detail & Related papers (2025-09-24T17:01:32Z) - SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning [45.78228118909098]
Chain-of-Thought (CoT) prompting enhances the math reasoning capability of large language models (LLMs) by a large margin. We present SalaMAnder (Shapley-based Mathematical Expression Attribution and Metric), a theoretically grounded methodology.
arXiv Detail & Related papers (2025-09-20T07:38:58Z) - On the Diagram of Thought [20.805936414171892]
Large Language Models (LLMs) excel at many tasks but often falter on complex problems that require structured, multi-step reasoning. We introduce the Diagram of Thought (DoT), a new framework that enables a single LLM to build and navigate a mental map of its reasoning.
arXiv Detail & Related papers (2024-09-16T07:01:41Z) - Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z) - Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs [63.36637269634553]
We introduce a novel approach where LLMs are fine-tuned to generate a sequence of Diverse Chains of Thought (DCoT) within a single inference step. We show that fine-tuning on DCoT improves performance over the CoT baseline across model families and scales. Our work is also significant because both quantitative analyses and manual evaluations reveal that the observed gains stem from the models' ability to refine an initial reasoning chain.
arXiv Detail & Related papers (2024-07-03T15:01:18Z) - ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs).
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z) - Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models [68.05046964022844]
Large language models (LLMs) have unveiled remarkable reasoning capabilities by exploiting chain-of-thought (CoT) prompting.
We propose GeM-CoT, a Generalizable CoT prompting mechanism in Mixed-task scenarios where the type of input questions is unknown.
With this technical design, GeM-CoT simultaneously enjoys superior generalization capabilities and remarkable performance on 10 public reasoning tasks and 23 BBH tasks.
arXiv Detail & Related papers (2023-10-10T15:10:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.