Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models
- URL: http://arxiv.org/abs/2602.00770v1
- Date: Sat, 31 Jan 2026 15:23:33 GMT
- Title: Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models
- Authors: Siyuan Zhang, Jialian Li, Yichi Zhang, Xiao Yang, Yinpeng Dong, Hang Su
- Abstract summary: We introduce a representational perspective to investigate the dynamics of the model's internal states. We discover that post-training yields only limited improvement in static initial representation quality.
- Score: 50.39102836928242
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large Language Models have achieved remarkable performance on reasoning tasks, motivating research into how this ability evolves during training. Prior work has primarily analyzed this evolution via explicit generation outcomes, treating the reasoning process as a black box and obscuring internal changes. To address this opacity, we introduce a representational perspective to investigate the dynamics of the model's internal states. Through comprehensive experiments across models at various training stages, we discover that post-training yields only limited improvement in static initial representation quality. Furthermore, we reveal that, distinct from non-reasoning tasks, reasoning involves a significant continuous distributional shift in representations during generation. Comparative analysis indicates that post-training empowers models to drive this transition toward a better distribution for task solving. To clarify the relationship between internal states and external outputs, statistical analysis confirms a high correlation between generation correctness and the final representations, while counterfactual experiments identify the semantics of the generated tokens, rather than additional computation during inference or intrinsic parameter differences, as the dominant driver of the transition. Collectively, we offer a novel understanding of the reasoning process and the effect of training on reasoning enhancement, providing valuable insights for future model analysis and optimization.
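The abstract's central observation, a continuous distributional shift in internal representations during generation, can be illustrated as a per-step drift metric over hidden states. The following is a minimal NumPy sketch under stated assumptions, not the paper's actual analysis: the `representation_drift` helper is hypothetical, and the toy trajectory stands in for per-token last-layer hidden states extracted from a real model.

```python
import numpy as np

def representation_drift(hidden_states):
    """Cosine distance between consecutive per-step representations.

    hidden_states: array of shape (num_steps, hidden_dim), e.g. the
    last-layer hidden state recorded at each generated token.
    Returns an array of num_steps - 1 drift values; larger values
    indicate a faster shift in the representation distribution.
    """
    h = np.asarray(hidden_states, dtype=float)
    # Normalize each step's representation to unit length.
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    h = h / np.clip(norms, 1e-12, None)
    # Cosine distance between step t and step t + 1.
    return 1.0 - np.sum(h[:-1] * h[1:], axis=1)

# Toy trajectory: unit vectors rotating steadily in 2-D, so every
# consecutive pair is separated by the same angle and the drift is
# constant across steps.
steps = 8
angles = np.linspace(0.0, np.pi / 2, steps)
trajectory = np.stack([np.cos(angles), np.sin(angles)], axis=1)
drift = representation_drift(trajectory)
```

In practice one would populate `hidden_states` from a model forward pass and compare the resulting drift curves across training stages, which is the kind of comparison the abstract describes.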
Related papers
- Schoenfeld's Anatomy of Mathematical Reasoning by Language Models [56.656180566692946]
We adopt Schoenfeld's Episode Theory as an inductive, intermediate-scale lens and introduce ThinkARM (Anatomy of Reasoning in Models). ThinkARM explicitly abstracts reasoning traces into functional reasoning steps such as Analysis, Explore, Implement, Verify, etc. We show that episode-level representations make reasoning steps explicit, enabling systematic analysis of how reasoning is structured, stabilized, and altered in modern language models.
arXiv Detail & Related papers (2025-12-23T02:44:25Z)
- Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought [64.43689151961054]
We theoretically analyze the training dynamics of a simplified two-layer transformer on the directed graph reachability problem. Our analysis reveals that during training with continuous thought, the index-matching logit first increases and then remains bounded under mild assumptions.
arXiv Detail & Related papers (2025-09-27T15:23:46Z)
- Towards Theoretical Understanding of Transformer Test-Time Computing: Investigation on In-Context Linear Regression [16.51420987738846]
Using more test-time computation during language model inference, such as generating more intermediate thoughts or sampling multiple candidate answers, has proven effective. This paper takes an initial step toward bridging the gap between practical language model inference and theoretical transformer analysis.
arXiv Detail & Related papers (2025-08-11T03:05:36Z)
- CTRLS: Chain-of-Thought Reasoning via Latent State-Transition [57.51370433303236]
Chain-of-thought (CoT) reasoning enables large language models to break down complex problems into interpretable intermediate steps. We introduce CTRLS, a framework that formulates CoT reasoning as a Markov decision process (MDP) with latent state transitions. We show improvements in reasoning accuracy, diversity, and exploration efficiency across benchmark reasoning tasks.
arXiv Detail & Related papers (2025-07-10T21:32:18Z)
- More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models [43.465268635499754]
Test-time compute has empowered large language models to generate extended reasoning chains. As generations become longer, models tend to drift away from image-grounded content and rely more heavily on language priors.
arXiv Detail & Related papers (2025-05-23T05:08:40Z)
- Counterfactual Generative Modeling with Variational Causal Inference [1.9287470458589586]
We present a novel variational Bayesian causal inference framework to handle counterfactual generative modeling tasks. In experiments, we demonstrate the advantage of our framework compared to state-of-the-art models in counterfactual generative modeling.
arXiv Detail & Related papers (2024-10-16T16:44:12Z)
- Calibrating Reasoning in Language Models with Internal Consistency [18.24350001344488]
Large language models (LLMs) have demonstrated impressive capabilities in various reasoning tasks. LLMs often generate text with obvious mistakes and contradictions. In this work, we investigate reasoning in LLMs through the lens of internal representations.
arXiv Detail & Related papers (2024-05-29T02:44:12Z)
- Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how such a paradigm should be done in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
- Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.