Training Large Language Models to Reason in a Continuous Latent Space
- URL: http://arxiv.org/abs/2412.06769v3
- Date: Mon, 03 Nov 2025 00:53:34 GMT
- Title: Training Large Language Models to Reason in a Continuous Latent Space
- Authors: Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian
- Abstract summary: We introduce a new paradigm called Coconut (Chain of Continuous Thought) to explore the potential of reasoning beyond language. Instead of decoding the model's last hidden state into words, we feed it back to the model as the next input embedding directly in the continuous space. This latent reasoning paradigm enables an advanced reasoning pattern, where continuous thoughts can encode multiple alternative next steps.
- Score: 71.0274000348354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are typically constrained to reason in the language space, where they express the reasoning process through a chain-of-thought (CoT) to solve complex problems. However, the language space may not always be optimal for reasoning. Most word tokens primarily ensure textual coherence and are not essential for reasoning, while some critical tokens require complex planning and pose challenges to LLMs. To explore the potential of reasoning beyond language, we introduce a new paradigm called Coconut (Chain of Continuous Thought). Coconut utilizes the last hidden state of the LLM as a representation of the reasoning state, termed "continuous thought." Instead of decoding this state into words, we feed it back to the model as the next input embedding directly in the continuous space. This latent reasoning paradigm enables an advanced reasoning pattern, where continuous thoughts can encode multiple alternative next steps, allowing the model to perform a breadth-first search (BFS) rather than committing prematurely to a single deterministic path as in CoT. Coconut outperforms CoT on logical reasoning tasks that require substantial search during planning and achieves a better trade-off between accuracy and efficiency.
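The mechanism described in the abstract, feeding the last hidden state back as the next input embedding, is compact enough to sketch directly. Below is a minimal, inference-only sketch assuming a Hugging Face-style decoder-only causal LM whose hidden size matches its embedding size; the function and parameter names (e.g. `num_thoughts`) are illustrative and not taken from the paper's released code.

```python
import torch

def coconut_latent_steps(model, tokenizer, prompt, num_thoughts=4):
    """Minimal sketch of Coconut-style latent reasoning (inference only).

    After each forward pass, the last hidden state at the final position
    is appended to the input embeddings as a "continuous thought",
    instead of being decoded into a word token.
    """
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    # Ordinary token embeddings for the language-space prompt.
    embeds = model.get_input_embeddings()(input_ids)

    for _ in range(num_thoughts):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        # Final layer, final position: the continuous thought.
        thought = out.hidden_states[-1][:, -1:, :]
        embeds = torch.cat([embeds, thought], dim=1)

    # Decoding can now resume in language space, e.g. via
    # model(inputs_embeds=embeds).logits[:, -1].
    return embeds
```

Note that the paper also trains the model to use these latent steps; the sketch above covers only the inference-time feedback loop.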
Related papers
- Latent Reasoning with Supervised Thinking States [60.09942890192309]
Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs. We propose Thinking States, a method that performs reasoning while the input is being processed. We show Thinking States leads to stronger reasoning behavior than CoT, successfully extrapolating to longer sequences than seen during training.
arXiv Detail & Related papers (2026-02-09T07:12:41Z)
- CoLT: Reasoning with Chain of Latent Tool Calls [31.228763375347608]
Chain-of-Thought (CoT) is a critical technique for enhancing the reasoning ability of Large Language Models (LLMs). We propose CoLT, a novel framework that implements latent reasoning as "tool calls".
arXiv Detail & Related papers (2026-02-04T06:12:53Z)
- Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization [9.193078163792427]
Chain-of-Thought (CoT) empowers Large Language Models (LLMs) to tackle complex problems. Recent latent reasoning approaches attempt to optimize efficiency by performing reasoning within continuous hidden states. We introduce PLaT, a framework that reformulates latent reasoning as planning by fundamentally decoupling reasoning from verbalization.
arXiv Detail & Related papers (2026-01-29T07:38:18Z)
- Neural Chain-of-Thought Search: Searching the Optimal Reasoning Path to Enhance Large Language Models [61.55758048622473]
We introduce Neural Chain-of-Thought Search (NCoTS), a framework that reformulates reasoning as a dynamic search for the optimal thinking strategy. By quantitatively characterizing the solution space, we reveal the existence of sparse superior reasoning paths that are simultaneously more accurate and concise than standard outputs.
arXiv Detail & Related papers (2026-01-16T14:38:18Z)
- Reason from Future: Reverse Thought Chain Enhances LLM Reasoning [18.637191592875155]
We propose a novel reasoning paradigm called Reason from Future (RFF). RFF generates reasoning paths by bidirectional reasoning that combines top-down planning with bottom-up reasoning accumulation. RFF outperforms conventional paradigms, achieving higher accuracy with a smaller search space on complex tasks.
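The summary's bidirectional scheme can be made concrete with a loose sketch: a backward pass that decomposes the goal alternates with a forward pass that accumulates facts, until the two meet. Everything here is illustrative; `llm` stands for a hypothetical text-in/text-out model call, and the prompts and stopping rule are ours, not the paper's.

```python
def reason_from_future(llm, question: str, max_rounds: int = 4) -> str:
    """Illustrative bidirectional loop: top-down planning steps backward
    from the goal while bottom-up reasoning accumulates forward facts."""
    goal = llm(f"State the final fact needed to answer: {question}")
    facts = question
    for _ in range(max_rounds):
        # Top-down: move the goal one step back toward what is known.
        goal = llm(f"What must be established just before concluding: {goal}?")
        # Bottom-up: derive one new fact from what is already known.
        facts = llm(f"Given {facts}, derive one new fact toward: {goal}")
        # Stop once the forward facts meet the backward goal.
        if llm(f"Do these facts establish '{goal}'? {facts} (yes/no)").strip().lower().startswith("yes"):
            break
    return llm(f"Answer '{question}' using: {facts}")
```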
arXiv Detail & Related papers (2025-06-04T08:03:17Z)
- Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space [62.54887038032942]
We introduce Soft Thinking, a training-free method that emulates human-like "soft" reasoning by generating soft, abstract concept tokens. These concept tokens are created as probability-weighted mixtures of token embeddings, which together form a continuous concept space. In essence, each generated concept token encapsulates multiple meanings from related discrete tokens, implicitly exploring various reasoning paths before converging.
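The concept-token construction above reduces to a single matrix product. A minimal sketch, assuming access to the LM head's next-token logits and the model's input embedding matrix; the function name is ours:

```python
import torch

def concept_token(logits: torch.Tensor, embed_matrix: torch.Tensor) -> torch.Tensor:
    """Probability-weighted mixture of token embeddings.

    logits:       [vocab] next-token logits from the LM head
    embed_matrix: [vocab, hidden] input token embedding matrix
    returns:      [hidden] a point in the continuous concept space
    """
    probs = torch.softmax(logits, dim=-1)  # distribution over next tokens
    return probs @ embed_matrix            # expected embedding under that distribution
```

Because no single token is sampled, ambiguity in the next-token distribution is carried forward rather than collapsed, which is how multiple reasoning paths can be explored implicitly.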
arXiv Detail & Related papers (2025-05-21T17:29:15Z)
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching [60.04718679054704]
We introduce Sketch-of-Thought (SoT), a novel prompting framework that combines cognitive-inspired reasoning paradigms with linguistic constraints to minimize token usage.
SoT achieves token reductions of 76% with negligible accuracy impact.
arXiv Detail & Related papers (2025-03-07T06:57:17Z)
- SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs [48.28847964704554]
Chain-of-Thought (CoT) reasoning enables Large Language Models (LLMs) to solve complex reasoning tasks.
We propose a novel approach for continuous-space reasoning that does not require modifying the underlying LLM.
arXiv Detail & Related papers (2025-02-17T18:52:29Z)
- Markov Chain of Thought for Efficient Mathematical Reasoning [10.678633785012691]
Multi-step Chain of Thought (CoT) benefits from the logical structure of the reasoning steps and task-specific actions.
We conceptualize standard multi-step CoT as a novel Markov Chain of Thought (MCoT).
Our MCoT aims to compress previous reasoning steps into a simplified question, enabling efficient next-step inference.
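The compression idea lends itself to a short sketch: each iteration conditions only on a simplified question, never on the accumulated reasoning transcript. `llm` is a hypothetical text-in/text-out call and the prompts are illustrative:

```python
def mcot_solve(llm, question: str, max_steps: int = 8) -> str:
    """Markov Chain of Thought-style loop: the only state carried
    between steps is a compressed, simplified question."""
    state = question
    for _ in range(max_steps):
        # One reasoning step conditioned ONLY on the current state
        # (the Markov property: no full history is kept).
        step = llm(f"Take one reasoning step toward solving: {state}")
        if "ANSWER:" in step:
            return step.split("ANSWER:", 1)[1].strip()
        # Fold the new step back into a simpler question.
        state = llm(
            f"Restate the problem as a simpler question given this progress:\n{state}\n{step}"
        )
    return state
```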
arXiv Detail & Related papers (2024-10-23T07:53:29Z)
- FLARE: Faithful Logic-Aided Reasoning and Exploration [50.9814063216852]
We introduce a novel approach for traversing the problem space using task decompositions.
We use Large Language Models to plan a solution and soft-formalise the query into facts and predicates using logic programming code.
Our method allows us to compute the faithfulness of the reasoning process w.r.t. the generated code and analyse the steps of the multi-hop search without relying on external solvers.
arXiv Detail & Related papers (2024-10-14T19:39:11Z)
- LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models [52.03659714625452]
Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks.
But can they really "reason" over natural language?
This question has been receiving significant research attention, and many reasoning skills, such as commonsense, numerical, and qualitative reasoning, have been studied.
arXiv Detail & Related papers (2024-04-23T21:08:49Z)
- Implicit Chain of Thought Reasoning via Knowledge Distillation [58.80851216530288]
Instead of explicitly producing the chain of thought reasoning steps, we use the language model's internal hidden states to perform implicit reasoning.
We find that this approach enables solving tasks previously not solvable without explicit chain-of-thought, at a speed comparable to no chain-of-thought.
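One way to read this setup is as hidden-state distillation: a teacher's hidden states along its explicit chain of thought supervise the student's states across layers, so the student reasons "vertically" in a single pass. A loose sketch under that assumption; the one-to-one pairing of student layers to teacher CoT steps is illustrative:

```python
import torch.nn.functional as F

def implicit_cot_distill_loss(student_hidden, teacher_hidden):
    """Align student layer-wise hidden states with teacher hidden states
    taken at its explicit CoT steps (illustrative 1:1 pairing).

    student_hidden: list of [batch, hidden] tensors, one per student layer
    teacher_hidden: list of [batch, hidden] tensors, one per CoT step
    """
    assert len(student_hidden) == len(teacher_hidden)
    return sum(F.mse_loss(s, t) for s, t in zip(student_hidden, teacher_hidden))
```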
arXiv Detail & Related papers (2023-11-02T17:59:49Z)
- Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism [19.590120229602103]
Large language models (LLMs) take advantage of step-by-step reasoning instructions, e.g., chain-of-thought (CoT) prompting.
In this study, we inspect the step-by-step reasoning ability of LLMs with a focus on negation.
arXiv Detail & Related papers (2023-10-23T12:40:41Z)
- Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic [19.476840373850653]
Large language models exhibit hallucinations because their reasoning procedures are unconstrained by logical principles.
We propose LoT (Logical Thoughts), a self-improvement prompting framework that leverages principles rooted in symbolic logic.
Experimental evaluations conducted on language tasks in diverse domains, including arithmetic, commonsense, symbolic, causal inference, and social problems, demonstrate the efficacy of logic-enhanced reasoning.
arXiv Detail & Related papers (2023-09-23T11:21:12Z)
- Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years.
Despite numerous successful applications, the underlying mechanism of such in-context capabilities still remains unclear.
In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.