State over Tokens: Characterizing the Role of Reasoning Tokens
- URL: http://arxiv.org/abs/2512.12777v1
- Date: Sun, 14 Dec 2025 17:30:34 GMT
- Title: State over Tokens: Characterizing the Role of Reasoning Tokens
- Authors: Mosh Levy, Zohar Elyoseph, Shauli Ravfogel, Yoav Goldberg
- Abstract summary: Large Language Models (LLMs) can generate reasoning tokens before their final answer to boost performance on complex tasks. We argue that to truly understand the process LLMs carry out, research must move beyond reading the reasoning tokens as text and focus on decoding them as state.
- Score: 37.09286375762863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) can generate reasoning tokens before their final answer to boost performance on complex tasks. While these sequences seem like human thought processes, empirical evidence reveals that they are not a faithful explanation of the model's actual reasoning process. To address this gap between appearance and function, we introduce the State over Tokens (SoT) conceptual framework. SoT reframes reasoning tokens not as a linguistic narrative, but as an externalized computational state -- the sole persistent information carrier across the model's stateless generation cycles. This explains how the tokens can drive correct reasoning without being a faithful explanation when read as text and surfaces previously overlooked research questions on these tokens. We argue that to truly understand the process LLMs carry out, research must move beyond reading the reasoning tokens as text and focus on decoding them as state.
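As a rough illustration of the SoT framing (a minimal sketch, not code from the paper): each decoding step is a stateless function of the tokens produced so far, so the growing token sequence is the only information that persists between steps. The toy `dummy_model`, its 10-token vocabulary, and the greedy token choice below are assumptions chosen only for brevity.

```python
import random

def dummy_model(tokens):
    # Stand-in for an LLM forward pass over a 10-token toy vocabulary.
    # It is a pure function of the token sequence: nothing else survives
    # from one call to the next, mirroring stateless generation cycles.
    random.seed(sum(tokens) + len(tokens))
    return [random.random() for _ in range(10)]

def generate(model, prompt_tokens, max_steps=20, eos=0):
    tokens = list(prompt_tokens)          # the externalized "state" in SoT terms
    for _ in range(max_steps):
        scores = model(tokens)            # recomputed from the tokens alone
        next_token = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_token)         # the only carrier of information forward
        if next_token == eos:
            break
    return tokens

print(generate(dummy_model, [5, 3, 7]))
```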
Related papers
- Latent Reasoning with Supervised Thinking States [60.09942890192309]
Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs. We propose Thinking States, a method that performs reasoning while the input is being processed. We show Thinking States leads to stronger reasoning behavior than CoT, successfully extrapolating to longer sequences than seen during training.
arXiv Detail & Related papers (2026-02-09T07:12:41Z) - Reflection Pretraining Enables Token-Level Self-Correction in Biological Sequence Models [82.79223371188756]
Chain-of-Thought (CoT) prompting has advanced task-solving capabilities in natural language processing with large language models. Applying CoT to non-natural language domains, such as protein and RNA language models, is not yet possible. We introduce reflection pretraining, for the first time in a biological sequence model, which enables the model to engage in intermediate reasoning.
arXiv Detail & Related papers (2025-12-24T05:25:17Z) - MARCOS: Deep Thinking by Markov Chain of Continuous Thoughts [82.46857666702924]
We present a new paradigm for reasoning in large language models (LLMs). Instead of autoregressively generating tokens, we model reasoning as a hidden Markov chain of continuous, high-dimensional "thoughts". For the first time, MARCOS achieves performance comparable to token-based CoT, even surpassing it by 4.7% on GSM8K with up to 15.7x speedup in inference.
arXiv Detail & Related papers (2025-09-29T16:44:22Z) - A circuit for predicting hierarchical structure in-context in Large Language Models [19.35678318316516]
Large Language Models (LLMs) excel at in-context learning, the ability to use information provided as context to improve prediction of future tokens. In this study, we design a synthetic in-context learning task, where tokens are repeated with hierarchical dependencies. We find adaptive induction heads that support prediction by learning what to attend to in-context.
arXiv Detail & Related papers (2025-09-25T20:20:23Z) - Mini-Omni-Reasoner: Token-Level Thinking-in-Speaking in Large Speech Models [80.75260664100644]
Mini-Omni-Reasoner is a framework that enables reasoning within speech via a novel "Thinking-in-Speaking" formulation. It interleaves silent reasoning tokens with spoken response tokens at the token level. It achieves a +19.1% gain in arithmetic reasoning and +6.4% in contextual understanding, with shorter outputs and zero decoding latency.
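For intuition only, a single output stream in that "Thinking-in-Speaking" style could tag each position as silent reasoning or spoken output; the `interleave` helper and the fixed `think_ratio` below are illustrative assumptions, not the paper's implementation.

```python
def interleave(reasoning_tokens, response_tokens, think_ratio=2):
    """Emit up to `think_ratio` silent reasoning tokens before each spoken token.

    Toy sketch: ("think", ...) positions are never vocalized, while
    ("speak", ...) positions would be sent on to the audio/response head.
    """
    stream, reasoning = [], iter(reasoning_tokens)
    for spoken in response_tokens:
        for _ in range(think_ratio):
            tok = next(reasoning, None)
            if tok is not None:
                stream.append(("think", tok))
        stream.append(("speak", spoken))
    return stream

print(interleave(["carry the 1", "so 27 + 15 = 42"], ["The", "answer", "is", "42."]))
```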
arXiv Detail & Related papers (2025-08-18T15:14:04Z) - Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning [53.57895922042783]
Large Language Models (LLMs) excel at reasoning and planning when trained on chain-of-thought (CoT) data. We propose a hybrid representation of the reasoning process, where we partially abstract away the initial reasoning steps using latent discrete tokens.
arXiv Detail & Related papers (2025-02-05T15:33:00Z) - Training Large Language Models to Reason in a Continuous Latent Space [71.0274000348354]
We introduce a new paradigm called Coconut (Chain of Continuous Thought) to explore the potential of reasoning beyond language. Instead of decoding the model's last hidden state into words, we feed it back to the model as the next input embedding directly in the continuous space. This latent reasoning paradigm enables an advanced reasoning pattern, where continuous thoughts can encode multiple alternative next steps.
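A minimal sketch of that feedback loop, under the assumption that it reduces to "append the last hidden state as the next input embedding"; the tiny random linear map stands in for a real transformer and this is not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16
W = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)

def toy_forward(embeddings):
    # Stand-in forward pass: returns a "last hidden state" for the sequence.
    return np.tanh(embeddings.mean(axis=0) @ W)

def latent_reasoning(prompt_embeddings, num_thoughts=4):
    embeddings = list(prompt_embeddings)
    for _ in range(num_thoughts):
        hidden = toy_forward(np.stack(embeddings))
        embeddings.append(hidden)  # continuous "thought": fed back, never decoded to a token
    return np.stack(embeddings)

prompt = [rng.normal(size=DIM) for _ in range(3)]
print(latent_reasoning(prompt).shape)  # (3 prompt embeddings + 4 latent thoughts, DIM)
```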
arXiv Detail & Related papers (2024-12-09T18:55:56Z) - Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking [34.55545753125674]
We present Quiet-STaR, a generalization of the Self-Taught Reasoner.
LMs learn to generate rationales at each token to explain future text.
We find zero-shot improvements on GSM8K and CommonsenseQA.
arXiv Detail & Related papers (2024-03-14T17:58:16Z)