Related papers: Improving Latent Reasoning in LLMs via Soft Concept Mixing

URL: http://arxiv.org/abs/2511.16885v1
Date: Fri, 21 Nov 2025 01:43:28 GMT
Title: Improving Latent Reasoning in LLMs via Soft Concept Mixing
Authors: Kang Wang, Xiangyu Duan, Tianyi Du,
Abstract summary: Large language models (LLMs) typically reason by generating discrete tokens. We propose Soft Concept Mixing (SCM), a soft concept aware training scheme. SCM exposes the model to soft representations during training.
Score: 5.230565644173722
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Unlike human reasoning in abstract conceptual spaces, large language models (LLMs) typically reason by generating discrete tokens, which potentially limits their expressive power. The recent work Soft Thinking has shown that LLMs' latent reasoning via soft concepts is a promising direction, but LLMs are trained on discrete tokens. To reduce this gap between the soft concepts in reasoning and the discrete tokens in training, we propose Soft Concept Mixing (SCM), a soft concept aware training scheme that directly exposes the model to soft representations during training. Specifically, SCM constructs a soft concept vector by forming a probability-weighted average of embeddings. Then, this vector is mixed into the model's hidden states, which embody rich contextual information. Finally, the entire latent reasoning process is optimized with Reinforcement Learning (RL). Experiments on five reasoning benchmarks demonstrate that SCM improves the reasoning performance of LLMs while maintaining stable training dynamics.
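
The first two steps described in the abstract (a probability-weighted average of embeddings, then mixing into a hidden state) can be illustrated with a minimal NumPy sketch. The abstract does not specify the mixing rule, so the additive form and the coefficient `alpha` below are assumptions made for illustration, as are all function names, the use of next-token logits, and the toy dimensions. The RL optimization step is not shown.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def soft_concept_vector(logits, embedding_matrix):
    """Probability-weighted average of token embeddings.

    logits: shape (vocab_size,), assumed here to be next-token logits.
    embedding_matrix: shape (vocab_size, dim), one row per token.
    """
    probs = softmax(logits)
    return probs @ embedding_matrix

def mix_into_hidden(hidden_state, concept, alpha=0.1):
    """Hypothetical mixing rule (an assumption, not taken from the paper):
    add the soft concept vector to the hidden state, scaled by alpha."""
    return hidden_state + alpha * concept

# Toy inputs; all sizes are arbitrary.
rng = np.random.default_rng(0)
vocab_size, dim = 8, 4
embedding_matrix = rng.normal(size=(vocab_size, dim))
logits = rng.normal(size=vocab_size)
hidden_state = rng.normal(size=dim)

concept = soft_concept_vector(logits, embedding_matrix)
mixed = mix_into_hidden(hidden_state, concept)
print(concept.shape, mixed.shape)
```

When the distribution places all of its mass on one token, the soft concept vector reduces to that token's ordinary embedding, so the discrete case is a special case of the soft one.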

Related papers

Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning [62.680551162054975]
We introduce an end-to-end framework where LLMs learn to self-regulate the granularity of the reasoning steps through dynamic summarization. We apply reinforcement learning to incentivize this capability further, uncovering a critical insight: the accuracy gap between the highly efficient Fold mode and the exhaustive Unfold mode progressively narrows. Our Accordion-Thinker demonstrates that with learned self-compression, LLMs can tackle complex reasoning tasks with minimal dependency token overhead.
arXiv Detail & Related papers (2026-02-03T08:34:20Z)
Concept Component Analysis: A Principled Approach for Concept Extraction in LLMs [51.378834857406325]
Mechanistic interpretability seeks to mitigate the issues through extracts from large language models. Sparse autoencoders (SAEs) have emerged as a popular approach for extracting interpretable and monosemantic concepts. We show that SAEs suffer from a fundamental theoretical ambiguity: the well-defined correspondence between LLM representations and human-interpretable concepts remains unclear.
arXiv Detail & Related papers (2026-01-28T09:27:05Z)
Multi-Path Collaborative Reasoning via Reinforcement Learning [54.8518809800168]
Chain-of-Thought (CoT) reasoning has significantly advanced the problem-solving capabilities of Large Language Models (LLMs). Recent methods attempt to address this by generating soft abstract tokens to enable reasoning in a continuous semantic space. We propose Multi-Path Perception Policy Optimization (M3PO), a novel reinforcement learning framework that explicitly injects collective insights into the reasoning process.
arXiv Detail & Related papers (2025-12-01T10:05:46Z)
LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning [30.62691333490551]
Large Language Models (LLMs) demonstrate their reasoning ability through chain-of-thought generation. We propose LaDiR, a novel reasoning framework that unifies the expressiveness of continuous latent representation. LaDiR consistently improves accuracy, diversity, and interpretability over existing autoregressive, diffusion-based, and latent reasoning methods.
arXiv Detail & Related papers (2025-10-06T08:15:03Z)
LLMs are Single-threaded Reasoners: Demystifying the Working Mechanism of Soft Thinking [25.468889616586363]
We investigate the Soft Thinking capabilities of large language models (LLMs). Contrary to the prevailing belief that Soft Thinking supports parallel exploration of diverse reasoning paths, our findings reveal that LLMs behave as single-threaded reasoners. Our experiments demonstrate that randomness, particularly with the Gumbel-max trick, can alleviate the limitations of vanilla approaches.
arXiv Detail & Related papers (2025-08-05T13:38:33Z)
Hybrid Latent Reasoning via Reinforcement Learning [50.6763762323985]
We explore latent reasoning by leveraging the capabilities of large language models (LLMs) via reinforcement learning (RL). We introduce hybrid reasoning policy optimization (HRPO), an RL-based hybrid latent reasoning approach that integrates prior hidden states into sampled tokens with a learnable gating mechanism. HRPO-trained LLMs remain interpretable and exhibit intriguing behaviors like cross-lingual patterns and shorter completion lengths.
arXiv Detail & Related papers (2025-05-24T01:26:16Z)
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space [62.54887038032942]
We introduce Soft Thinking, a training-free method that emulates human-like "soft" reasoning by generating soft, abstract concept tokens. These concept tokens are created by the probability-weighted mixture of token embeddings, which form the continuous concept space. In essence, each generated concept token encapsulates multiple meanings from related discrete tokens, implicitly exploring various reasoning paths to converge.
arXiv Detail & Related papers (2025-05-21T17:29:15Z)
SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs [48.28847964704554]
Chain-of-Thought (CoT) reasoning enables Large Language Models (LLMs) to solve complex reasoning tasks. We propose a novel approach for continuous-space reasoning that does not require modifying the LLM.
arXiv Detail & Related papers (2025-02-17T18:52:29Z)
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains. The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications. We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
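
One entry in the list above mentions the Gumbel-max trick as a source of randomness for Soft Thinking. The summary does not say how that paper applies it, so the following is only a minimal NumPy sketch of the standard trick itself: adding independent Gumbel(0, 1) noise to the logits and taking the argmax yields a sample from the softmax distribution. The function name and the toy logits are illustrative assumptions.

```python
import numpy as np

def gumbel_max_sample(logits, rng, num_samples=1):
    """Draw token indices from softmax(logits) via the Gumbel-max trick."""
    # Add independent Gumbel(0, 1) noise to every logit, then take the argmax.
    noise = rng.gumbel(size=(num_samples, logits.shape[0]))
    return np.argmax(logits + noise, axis=1)

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.0, -1.0])  # toy values, chosen arbitrarily
samples = gumbel_max_sample(logits, rng, num_samples=20000)

# Empirical frequencies should approach softmax(logits) as samples grow.
frequencies = np.bincount(samples, minlength=logits.shape[0]) / samples.size
target = np.exp(logits - logits.max())
target /= target.sum()
print(frequencies)
print(target)
```

Comparing the two printed arrays is a quick sanity check that the noisy argmax follows the same distribution as ordinary softmax sampling.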

This list is automatically generated from the titles and abstracts of the papers on this site.