Related papers: Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

URL: http://arxiv.org/abs/2601.08808v1
Date: Tue, 13 Jan 2026 18:48:00 GMT
Title: Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
Authors: Yao Tang, Li Dong, Yaru Hao, Qingxiu Dong, Furu Wei, Jiatao Gu
Abstract summary: Large language models often solve complex reasoning tasks more effectively with Chain-of-Thought (CoT). Humans, by contrast, often reason softly by maintaining a distribution over plausible next steps. We propose Multiplex Thinking, a soft reasoning mechanism that samples K candidate tokens and aggregates their embeddings into a single continuous multiplex token. Multiplex Thinking is self-adaptive: when the model is confident, the multiplex token is nearly discrete and behaves like standard CoT.
Score: 87.51901436392427
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models often solve complex reasoning tasks more effectively with Chain-of-Thought (CoT), but at the cost of long, low-bandwidth token sequences. Humans, by contrast, often reason softly by maintaining a distribution over plausible next steps. Motivated by this, we propose Multiplex Thinking, a stochastic soft reasoning mechanism that, at each thinking step, samples K candidate tokens and aggregates their embeddings into a single continuous multiplex token. This preserves the vocabulary embedding prior and the sampling dynamics of standard discrete generation, while inducing a tractable probability distribution over multiplex rollouts. Consequently, multiplex trajectories can be directly optimized with on-policy reinforcement learning (RL). Importantly, Multiplex Thinking is self-adaptive: when the model is confident, the multiplex token is nearly discrete and behaves like standard CoT; when it is uncertain, it compactly represents multiple plausible next steps without increasing sequence length. Across challenging math reasoning benchmarks, Multiplex Thinking consistently outperforms strong discrete CoT and RL baselines from Pass@1 through Pass@1024, while producing shorter sequences. The code and checkpoints are available at https://github.com/GMLR-Penn/Multiplex-Thinking.
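
The branch-and-merge step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the K candidates are drawn independently (with replacement) from the softmax of the next-token logits and that their embeddings are merged by a plain average. The function name `multiplex_token`, the toy vocabulary size, and the random embedding table are hypothetical, introduced only for illustration.

```python
import numpy as np

def multiplex_token(logits, embedding, k, rng):
    """Sample k token ids from softmax(logits) and average their embeddings.

    Assumption: independent sampling with replacement and an unweighted mean;
    the paper's actual aggregation rule may differ.
    """
    z = logits - logits.max()              # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()    # softmax over the vocabulary
    ids = rng.choice(len(probs), size=k, replace=True, p=probs)
    return embedding[ids].mean(axis=0), ids

rng = np.random.default_rng(0)
vocab, dim = 5, 4                          # toy sizes, hypothetical
embedding = rng.normal(size=(vocab, dim))  # stand-in for the model's embedding table

confident = np.array([50.0, 0.0, 0.0, 0.0, 0.0])  # one token dominates
uncertain = np.zeros(vocab)                        # uniform over the vocabulary

vec_c, ids_c = multiplex_token(confident, embedding, k=3, rng=rng)
vec_u, ids_u = multiplex_token(uncertain, embedding, k=3, rng=rng)

print("confident ids:", ids_c)  # all samples land on the dominant token
print("uncertain ids:", ids_u)  # samples can spread over several tokens
```

Under these assumptions, a peaked distribution yields K copies of the same token, so the merged vector equals that token's embedding (the nearly discrete case). A flat distribution can yield distinct tokens, whose average represents several candidate next steps in one position without lengthening the sequence.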

Related papers

Continuous Autoregressive Language Models [56.49239051750678]
We introduce Continuous Autoregressive Language Models (CALM). CALM uses a high-fidelity autoencoder to compress a chunk of K tokens into a single continuous vector. We develop a comprehensive likelihood-free framework that enables robust training, evaluation, and controllable sampling.
arXiv Detail & Related papers (2025-10-31T17:58:11Z)
Rethinking Thinking Tokens: LLMs as Improvement Operators [80.12087211785949]
Reasoning training incentivizes LLMs to produce long chains of thought (long CoT), which allows them to explore solution strategies with self-checking. This results in higher accuracy, but inflates context length, token/compute cost, and answer latency. We ask: Can current models leverage their metacognition to provide other combinations on this Pareto frontier? We identify an interesting inference family, Parallel-Distill-Refine (PDR), which performs the following: (i) generate diverse drafts in parallel; (ii) distill them into a bounded, textual workspace; and (iii) refine conditioned on this workspace.
arXiv Detail & Related papers (2025-10-01T17:08:59Z)
MARCOS: Deep Thinking by Markov Chain of Continuous Thoughts [82.46857666702924]
We present a new paradigm for reasoning in large language models (LLMs). Instead of autoregressively generating tokens, we model reasoning as a hidden Markov chain of continuous, high-dimensional "thoughts". For the first time, MARCOS achieves performance comparable to token-based CoT, even surpassing it by 4.7% on GSM8K with up to 15.7x speedup in inference.
arXiv Detail & Related papers (2025-09-29T16:44:22Z)
Soft Tokens, Hard Truths [17.640897774014707]
This work introduces a scalable method to learn continuous CoTs via reinforcement learning (RL). We use "soft" tokens: mixtures of tokens together with noise on the input embedding to provide RL exploration. On math reasoning benchmarks with Llama and Qwen models up to 8B, training with continuous CoTs matches discrete-token CoTs for pass@1 and surpasses them for pass@32.
arXiv Detail & Related papers (2025-09-23T15:43:47Z)
Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling. We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z)
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought [64.43689151961054]
We prove that a two-layer transformer with $D$ steps of continuous CoTs can solve the directed graph reachability problem. In our construction, each continuous thought vector is a superposition state that encodes multiple search frontiers simultaneously.
arXiv Detail & Related papers (2025-05-18T18:36:53Z)
Self-Training Elicits Concise Reasoning in Large Language Models [23.475414693530965]
Chain-of-thought (CoT) reasoning has enabled large language models (LLMs) to utilize additional computation through intermediate tokens. We propose simple fine-tuning methods which leverage self-generated concise reasoning paths. Our method achieves a 30% reduction in output tokens, across five model families on GSM8K and MATH, while maintaining average accuracy.
arXiv Detail & Related papers (2025-02-27T14:14:50Z)
Markov Chain of Thought for Efficient Mathematical Reasoning [10.678633785012691]
Multi-step Chain of Thought (CoT) benefits from the logical structure of the reasoning steps and task-specific actions. We conceptualize the standard multi-step CoT as a novel Markov Chain of Thought (MCoT). Our MCoT aims to compress previous reasoning steps into a simplified question, enabling efficient next-step inference.
arXiv Detail & Related papers (2024-10-23T07:53:29Z)
