The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
- URL: http://arxiv.org/abs/2601.06002v2
- Date: Tue, 13 Jan 2026 18:21:01 GMT
- Title: The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
- Authors: Qiguang Chen, Yantao Du, Ziniu Li, Jinhao Liu, Songyao Duan, Jiarui Guo, Minghao Liu, Jiaheng Liu, Tong Yang, Ge Zhang, Libo Qin, Wanxiang Che, Wenhao Huang
- Abstract summary: We show that effective and learnable Long CoT trajectories feature stable molecular-like structures in a unified view. We introduce Effective Semantic Isomers and show that only bonds promoting fast entropy convergence support stable Long CoT learning. We present Mole-Syn, a distribution-transfer-graph method that guides synthesis of effective Long CoT structures.
- Score: 76.05038073223152
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models (LLMs) often fail to learn effective long chain-of-thought (Long CoT) reasoning by imitating humans or non-Long-CoT LLMs. To understand this, we propose that effective and learnable Long CoT trajectories feature stable molecular-like structures in a unified view, which are formed by three interaction types: Deep-Reasoning (covalent-like), Self-Reflection (hydrogen-bond-like), and Self-Exploration (van der Waals-like). Analysis of distilled trajectories reveals that these structures emerge from Long CoT fine-tuning, not keyword imitation. We introduce Effective Semantic Isomers and show that only bonds promoting fast entropy convergence support stable Long CoT learning, while structural competition impairs training. Drawing on these findings, we present Mole-Syn, a distribution-transfer-graph method that guides synthesis of effective Long CoT structures, boosting performance and RL stability across benchmarks.
Related papers
- Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning [51.673503054645415]
Biomolecular mechanisms require multi-step reasoning across molecular interactions, signaling cascades, and metabolic pathways. Existing approaches often exacerbate these issues: reasoning steps may deviate from biological facts or fail to capture long mechanistic dependencies. We propose a Knowledge-Augmented Long-CoT Reasoning framework that integrates LLMs with knowledge graph-based multi-hop reasoning chains.
arXiv Detail & Related papers (2025-11-11T09:26:32Z)
- Round-trip Reinforcement Learning: Self-Consistent Training for Better Chemical LLMs [51.29260537017623]
Large Language Models (LLMs) are emerging as versatile foundation models for computational chemistry. These models often lack round-trip consistency. We introduce Round-Trip Reinforcement Learning (RTRL), a novel framework that trains a model to improve its consistency.
arXiv Detail & Related papers (2025-10-01T23:58:58Z)
- Long-Short Alignment for Effective Long-Context Modeling in LLMs [32.13785291956956]
Large language models (LLMs) have exhibited impressive performance and surprising emergent properties. Length generalization -- the ability to generalize to sequences longer than those seen during training -- is a classical and fundamental problem. We highlight the critical role of long-short alignment -- the consistency of output distributions across sequences of varying lengths.
arXiv Detail & Related papers (2025-06-13T13:25:39Z)
- What Makes a Good Reasoning Chain? Uncovering Structural Patterns in Long Chain-of-Thought Reasoning [45.660562905010934]
We present LCoT2Tree, an automated framework that converts sequential LCoTs into hierarchical tree structures. Using graph neural networks (GNNs), we reveal that structural patterns extracted by LCoT2Tree serve as stronger predictors of final performance. Our results underscore the critical role of internal structures of reasoning chains, positioning LCoT2Tree as a powerful tool for diagnosing, interpreting, and improving reasoning in LLMs.
arXiv Detail & Related papers (2025-05-28T09:12:31Z)
- Enhancing Long-Chain Reasoning Distillation through Error-Aware Self-Reflection [64.73809794561305]
errOr-aware self-ReflectION (ORION) is a framework that refines teacher CoTs through an Error-Aware Reflection process. Experiments on multiple mathematical reasoning benchmarks demonstrate that ORION consistently improves performance by more than 2% over all baselines.
arXiv Detail & Related papers (2025-05-28T08:57:03Z)
- Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation [22.875285119636235]
The R1 distillation scheme has emerged as a promising approach for training cost-effective models with enhanced reasoning abilities. This study examines the universality of distillation data and identifies key components that enable the efficient transfer of long-chain reasoning capabilities. We propose DLCoT (Deconstructing Long Chain-of-Thought), a distillation data enhancement framework.
arXiv Detail & Related papers (2025-03-20T17:46:38Z)
- Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering [59.34894142132706]
Existing work finds that the capability of long CoT reasoning can be efficiently elicited by tuning on only a few examples. This motivates us to investigate whether long CoT reasoning is a general capability for LLMs. We propose GLoRE, a novel representation engineering method to unleash the general long CoT reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-03-14T11:30:37Z)
- When More is Less: Understanding Chain-of-Thought Length in LLMs [51.631483479081645]
Large Language Models (LLMs) employ Chain-of-Thought (CoT) reasoning to deconstruct complex problems. While longer CoTs are often presumed superior, this paper argues that longer is not always better.
arXiv Detail & Related papers (2025-02-11T05:28:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.