Do Latent Tokens Think? A Causal and Adversarial Analysis of Chain-of-Continuous-Thought
- URL: http://arxiv.org/abs/2512.21711v1
- Date: Thu, 25 Dec 2025 15:14:53 GMT
- Title: Do Latent Tokens Think? A Causal and Adversarial Analysis of Chain-of-Continuous-Thought
- Authors: Yuyi Zhang, Boyu Tang, Tianjie Ju, Sufeng Duan, Gongshen Liu
- Abstract summary: We focus on Chain-of-Continuous-Thought (COCONUT), which claims better efficiency and stability than explicit Chain-of-Thought (CoT). Unlike CoT tokens, COCONUT tokens show minimal sensitivity to steering and lack reasoning-critical information. Results on MMLU and HotpotQA demonstrate that COCONUT consistently exploits dataset artifacts, inflating benchmark performance without true reasoning.
- Score: 16.907732581097417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Latent tokens are gaining attention for enhancing reasoning in large language models (LLMs), yet their internal mechanisms remain unclear. This paper examines the problem from a reliability perspective, uncovering fundamental weaknesses: latent tokens function as uninterpretable placeholders rather than encoding faithful reasoning. While resistant to perturbation, they promote shortcut usage over genuine reasoning. We focus on Chain-of-Continuous-Thought (COCONUT), which claims better efficiency and stability than explicit Chain-of-Thought (CoT) while maintaining performance. We investigate this through two complementary approaches. First, steering experiments perturb specific token subsets, namely COCONUT and explicit CoT. Unlike CoT tokens, COCONUT tokens show minimal sensitivity to steering and lack reasoning-critical information. Second, shortcut experiments evaluate models under biased and out-of-distribution settings. Results on MMLU and HotpotQA demonstrate that COCONUT consistently exploits dataset artifacts, inflating benchmark performance without true reasoning. These findings reposition COCONUT as a pseudo-reasoning mechanism: it generates plausible traces that conceal shortcut dependence rather than faithfully representing reasoning processes.
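The steering experiments described in the abstract can be illustrated with a minimal sketch: perturb the hidden states at a chosen subset of token positions (e.g. explicit CoT tokens vs. COCONUT latent tokens) and measure how far the model output moves. The helper names and the toy mean-pooling "model" below are hypothetical stand-ins, not the paper's actual setup:

```python
def steer(hidden_states, positions, direction, alpha=1.0):
    # Return a copy of the token states with `alpha * direction`
    # added at the chosen token positions (the steered subset).
    steered = [vec[:] for vec in hidden_states]
    for p in positions:
        steered[p] = [x + alpha * d for x, d in zip(steered[p], direction)]
    return steered

def output(hidden_states):
    # Toy stand-in for the model head: mean-pool the token states.
    dim = len(hidden_states[0])
    return [sum(vec[i] for vec in hidden_states) / len(hidden_states)
            for i in range(dim)]

def sensitivity(hidden_states, positions, direction, alpha=1.0):
    # L2 distance between the base output and the steered output:
    # a large value means the perturbed tokens carry information
    # the model actually uses; near zero suggests placeholders.
    base = output(hidden_states)
    steered = output(steer(hidden_states, positions, direction, alpha))
    return sum((a - b) ** 2 for a, b in zip(base, steered)) ** 0.5

# Example: steer two of four token positions along dimension 0.
h = [[0.0, 0.0] for _ in range(4)]
print(sensitivity(h, positions=[1, 2], direction=[1.0, 0.0]))  # 0.5
```

Under this framing, the paper's finding is that the measured sensitivity is high when the steered subset covers explicit CoT tokens but near zero for COCONUT latent tokens.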
Related papers
- Imagination Helps Visual Reasoning, But Not Yet in Latent Space [65.80396132375571]
We investigate the validity of latent reasoning using Causal Mediation Analysis. We show that latent tokens encode limited visual information and exhibit high similarity. We propose a straightforward alternative named CapImagine, which teaches the model to explicitly imagine using text.
arXiv Detail & Related papers (2026-02-26T08:56:23Z) - CoLT: Reasoning with Chain of Latent Tool Calls [31.228763375347608]
Chain-of-Thought (CoT) is a critical technique for enhancing the reasoning ability of Large Language Models (LLMs). We propose CoLT, a novel framework that implements latent reasoning as "tool calls".
arXiv Detail & Related papers (2026-02-04T06:12:53Z) - Chain Of Thought Compression: A Theoretical Analysis [24.613200477865572]
Chain-of-Thought (CoT) has unlocked advanced reasoning abilities of Large Language Models. However, CoT incurs prohibitive computational costs due to the generation of extra tokens. Recent studies show that compressing reasoning steps into latent states, or implicit CoT compression, offers a token-efficient alternative.
arXiv Detail & Related papers (2026-01-29T11:42:03Z) - SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens [43.78883511257627]
The computational overhead of Chain-of-Thought (CoT) reasoning hinders its mass deployment in efficiency-critical applications. We propose a novel semantically-aligned implicit CoT framework termed SemCoT.
arXiv Detail & Related papers (2025-10-28T20:11:54Z) - Latent Reasoning in LLMs as a Vocabulary-Space Superposition [80.01651003144282]
Large language models (LLMs) demonstrate strong reasoning abilities with chain-of-thought prompting, but explicit reasoning introduces substantial computational overhead. Recent work on latent reasoning reduces this cost by reasoning in latent space without explicit supervision, but performance drops significantly. To address this, we restrict the latent space to the column space of the LLM vocabulary, treating latent reasoning as a superposition over vocabulary probabilities. Once latent reasoning concludes, it collapses into an eigenstate of explicit reasoning to yield the final answer. Latent-SFT sets a new state of the art on GSM8k, matching explicit CoT.
arXiv Detail & Related papers (2025-10-17T10:51:20Z) - One Token Embedding Is Enough to Deadlock Your Large Reasoning Model [91.48868589442837]
We present the Deadlock Attack, a resource exhaustion method that hijacks an LRM's generative control flow. Our method achieves a 100% attack success rate across four advanced LRMs.
arXiv Detail & Related papers (2025-10-12T07:42:57Z) - MARCOS: Deep Thinking by Markov Chain of Continuous Thoughts [82.46857666702924]
We present a new paradigm for reasoning in large language models (LLMs). Instead of autoregressively generating tokens, we model reasoning as a hidden Markov chain of continuous, high-dimensional "thoughts". For the first time, MARCOS achieves performance comparable to token-based CoT, even surpassing it by 4.7% on GSM8K with up to 15.7x speedup in inference.
arXiv Detail & Related papers (2025-09-29T16:44:22Z) - SIM-CoT: Supervised Implicit Chain-of-Thought [108.30049193668083]
Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models. We identify a core latent instability issue when scaling the computational budget of implicit CoT. We propose SIM-CoT, a plug-and-play training module that introduces step-level supervision to stabilize and enrich the latent reasoning space.
arXiv Detail & Related papers (2025-09-24T17:01:32Z) - Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation? [32.02698064940949]
Chain-of-Thought (CoT) often yields limited gains for soft-reasoning problems. We investigate the dynamics and faithfulness of CoT in soft-reasoning tasks across instruction-tuned, reasoning, and reasoning-distilled models.
arXiv Detail & Related papers (2025-08-27T12:25:29Z) - Unveiling Confirmation Bias in Chain-of-Thought Reasoning [12.150655660758359]
Chain-of-thought (CoT) prompting has been widely adopted to enhance the reasoning capabilities of large language models (LLMs). This work presents a novel perspective to understand CoT behavior through the lens of confirmation bias in cognitive psychology.
arXiv Detail & Related papers (2025-06-14T01:30:17Z) - Efficient Reasoning via Chain of Unconscious Thought [40.82356218832031]
Large Reasoning Models (LRMs) achieve promising performance but compromise token efficiency due to verbose reasoning processes. We propose a new reasoning paradigm, termed Chain of Unconscious Thought (CoUT), to improve the token efficiency of LRMs. Our work reveals that models may possess beneficial unconscious thought, enabling improved efficiency without sacrificing performance.
arXiv Detail & Related papers (2025-05-26T09:34:04Z) - Efficient Inference for Large Reasoning Models: A Survey [74.17203483365171]
Large Reasoning Models (LRMs) significantly improve the reasoning ability of Large Language Models (LLMs) by learning to reason. However, their deliberative reasoning process leads to inefficiencies in token usage, memory consumption, and inference time. This survey provides a review of efficient inference methods designed specifically for LRMs, focusing on mitigating token inefficiency while preserving reasoning quality.
arXiv Detail & Related papers (2025-03-29T13:27:46Z)