Related papers: Latent Reasoning with Supervised Thinking States

Latent Reasoning with Supervised Thinking States

URL: http://arxiv.org/abs/2602.08332v1
Date: Mon, 09 Feb 2026 07:12:41 GMT
Title: Latent Reasoning with Supervised Thinking States
Authors: Ido Amos, Avi Caciularu, Mor Geva, Amir Globerson, Jonathan Herzig, Lior Shani, Idan Szpektor,
Abstract summary: Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs.<n>We propose Thinking States, a method that performs reasoning em while the input is processing.<n>We show Thinking States leads to stronger reasoning behavior than CoT, successfully extrapolating to longer sequences than seen during training.
Score: 60.09942890192309
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs due to the generation of long rationales. We propose Thinking States, a method that performs reasoning {\em while} the input is processing. Specifically, Thinking States generates sequences of thinking tokens every few input tokens, transforms the thoughts back into embedding space, and adds them to the following input tokens. This has two key advantages. First, it captures the recurrent nature of CoT, but where the thought tokens are generated as input is processing. Second, since the thoughts are represented as tokens, they can be learned from natural language supervision, and using teacher-forcing, which is parallelizable. Empirically, Thinking States outperforms other latent reasoning methods on multiple reasoning tasks, narrowing the gap to CoT on math problems, and matching its performance on 2-Hop QA with improved latency. On state-tracking tasks, we show Thinking States leads to stronger reasoning behavior than CoT, successfully extrapolating to longer sequences than seen during training.

Related papers

LogitsCoder: Towards Efficient Chain-of-Thought Path Search via Logits Preference Decoding for Code Generation [86.08600027874662]
We propose LogitsCoder, a novel framework that enhances chain-of-thought reasoning through lightweight, logit-level control mechanisms for code generation.<n>We show that LogitsCoder produces more efficient and higher-quality reasoning chains, leading to superior code generation performance compared to baseline methods.
arXiv Detail & Related papers (2026-02-15T08:52:19Z)
State over Tokens: Characterizing the Role of Reasoning Tokens [37.09286375762863]
Large Language Models (LLMs) can generate reasoning tokens before their final answer to boost performance on complex tasks.<n>We argue that to truly understand the process that LLMs do, research must move beyond reading the reasoning tokens as text and focus on decoding them as state.
arXiv Detail & Related papers (2025-12-14T17:30:34Z)
MARCOS: Deep Thinking by Markov Chain of Continuous Thoughts [82.46857666702924]
We present a new paradigm for reasoning in large language models (LLMs)<n>Instead of autoregressively generating tokens, we model reasoning as a hidden Markov chain of continuous, high-dimensional "thoughts"<n>For the first time, MARCOS achieves performance comparable to token-based CoT, even surpassing it by 4.7% on GSM8K with up to 15.7x speedup in inference.
arXiv Detail & Related papers (2025-09-29T16:44:22Z)
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space [62.54887038032942]
We introduce Soft Thinking, a training-free method that emulates human-like "soft" reasoning by generating soft, abstract concept tokens.<n>These concept tokens are created by the probability-weighted mixture of token embeddings, which form the continuous concept space.<n>In essence, each generated concept token encapsulates multiple meanings from related discrete tokens, implicitly exploring various reasoning paths to converge.
arXiv Detail & Related papers (2025-05-21T17:29:15Z)
Reasoning Models Can Be Effective Without Thinking [45.411955744222524]
We find that bypassing the thinking process via simple prompting, denoted as NoThinking, can be surprisingly effective.<n>Our method outperforms a range of baselines with similar latency using Thinking, and is comparable to Thinking with significantly longer latency (up to 9x)
arXiv Detail & Related papers (2025-04-14T04:08:16Z)
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning [53.57895922042783]
Large Language Models (LLMs) excel at reasoning and planning when trained on chainof-thought (CoT) data.<n>We propose a hybrid representation of the reasoning process, where we partially abstract away the initial reasoning steps using latent discrete tokens.
arXiv Detail & Related papers (2025-02-05T15:33:00Z)
Training Large Language Models to Reason in a Continuous Latent Space [71.0274000348354]
We introduce a new paradigm called Coconut (Chain of Continuous Thought) to explore the potential of reasoning beyond language.<n>Instead of decoding this state into words, we feed it back to the model as the next input embedding directly in the continuous space.<n>This latent reasoning paradigm enables an advanced reasoning pattern, where continuous thoughts can encode multiple alternative next steps.
arXiv Detail & Related papers (2024-12-09T18:55:56Z)
Markov Chain of Thought for Efficient Mathematical Reasoning [10.678633785012691]
Chain of Thought (CoT) of multi-step benefits from the logical structure of the reasoning steps and task-specific actions.<n>We conceptualize the standard multi-step CoT as a novel Markov Chain of Thought (MCoT)<n>Our MCoT aims to compress previous reasoning steps into a simplified question, enabling efficient next-step inference.
arXiv Detail & Related papers (2024-10-23T07:53:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.