SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs
- URL: http://arxiv.org/abs/2502.12134v2
- Date: Tue, 27 May 2025 14:54:51 GMT
- Title: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs
- Authors: Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao
- Abstract summary: Chain-of-Thought (CoT) reasoning enables Large Language Models (LLMs) to solve complex reasoning tasks. We propose a novel approach for continuous-space reasoning that does not require modifying the LLM.
- Score: 48.28847964704554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chain-of-Thought (CoT) reasoning enables Large Language Models (LLMs) to solve complex reasoning tasks by generating intermediate reasoning steps. However, most existing approaches focus on hard token decoding, which constrains reasoning within the discrete vocabulary space and may not always be optimal. While recent efforts explore continuous-space reasoning, they often require full-model fine-tuning and suffer from catastrophic forgetting, limiting their applicability to state-of-the-art LLMs that already perform well in zero-shot settings with a proper instruction. To address this challenge, we propose a novel approach for continuous-space reasoning that does not require modifying the LLM. Specifically, we employ a lightweight fixed assistant model to speculatively generate instance-specific soft thought tokens as the initial chain of thoughts, which are then mapped into the LLM's representation space via a trainable projection module. Experimental results on five reasoning benchmarks demonstrate that our method enhances LLM reasoning performance through supervised, parameter-efficient fine-tuning. Source code is available at https://github.com/xuyige/SoftCoT.
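The pipeline described in the abstract (a fixed assistant model produces soft thought tokens, and a trainable projection module maps them into the LLM's representation space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the dimensions, the function names, the use of a single linear layer as the projection, and the choice to append the soft thoughts after the question embeddings. It is not the authors' implementation, which is available at the repository linked above.

```python
import random

# Hypothetical sizes, chosen only for this illustration.
ASSISTANT_DIM, LLM_DIM, NUM_SOFT_TOKENS = 8, 16, 4


def make_projection(d_in, d_out, seed=0):
    """Create the only trainable part: a linear map (assumed form)."""
    rng = random.Random(seed)
    scale = 1.0 / d_in ** 0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(d_in)]
              for _ in range(d_out)]
    bias = [0.0] * d_out
    return weight, bias


def project(hidden_states, weight, bias):
    """Map assistant hidden states into the LLM's embedding space."""
    return [
        [sum(w * h for w, h in zip(row, vec)) + b
         for row, b in zip(weight, bias)]
        for vec in hidden_states
    ]


def build_llm_inputs(question_embeds, soft_thoughts):
    """Append projected soft thoughts to the question embeddings (assumed order)."""
    return question_embeds + soft_thoughts


# Stand-ins for the frozen models' outputs: random assistant hidden states
# and zero-valued question embeddings.
rng = random.Random(1)
assistant_states = [[rng.gauss(0, 1) for _ in range(ASSISTANT_DIM)]
                    for _ in range(NUM_SOFT_TOKENS)]
question_embeds = [[0.0] * LLM_DIM for _ in range(3)]

weight, bias = make_projection(ASSISTANT_DIM, LLM_DIM)
soft_thoughts = project(assistant_states, weight, bias)
llm_inputs = build_llm_inputs(question_embeds, soft_thoughts)
print(len(llm_inputs), len(llm_inputs[0]))  # prints "7 16"
```

In this sketch only `weight` and `bias` would receive gradient updates during training; the assistant model and the LLM stay frozen, which matches the abstract's claim that the LLM itself is not modified.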
Related papers
- $\nabla$-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space [71.23672814629448]
$\nabla$-Reasoner is an iterative generation framework that integrates differentiable optimization over token logits into the decoding loop. $\nabla$-Reasoner achieves over 20% accuracy improvement on a challenging mathematical reasoning benchmark.
arXiv Detail & Related papers (2026-03-05T08:42:54Z)
- Slot Filling as a Reasoning Task for SpeechLLMs [10.898666440393896]
We propose integration of reasoning into speech large language models (speechLLMs) for the end-to-end slot-filling task. Inspired by the recent development of reasoning LLMs, we use a chain-of-thought framework to decompose the slot-filling task into multiple reasoning steps.
arXiv Detail & Related papers (2025-10-22T07:39:56Z)
- Latent Reasoning in LLMs as a Vocabulary-Space Superposition [80.01651003144282]
Large language models (LLMs) demonstrate strong reasoning abilities with chain-of-thought prompting, but explicit reasoning introduces substantial computational overhead. Recent work on latent reasoning reduces this cost by reasoning in latent space without explicit supervision, but performance drops significantly. To address this, we restrict the latent space to the column space of the LLM vocabulary, treating latent reasoning as a superposition over vocabulary probabilities. Once latent reasoning concludes, it collapses into an eigenstate of explicit reasoning to yield the final answer. Latent-SFT sets a new state of the art on GSM8k, matching explicit
arXiv Detail & Related papers (2025-10-17T10:51:20Z)
- Directional Reasoning Injection for Fine-Tuning MLLMs [51.53222423215055]
Multimodal large language models (MLLMs) are rapidly advancing, yet their reasoning ability often lags behind that of strong text-only counterparts. Existing methods to bridge this gap rely on supervised fine-tuning over large-scale multimodal reasoning data or reinforcement learning. We propose Directional Reasoning Injection for Fine-Tuning (DRIFT) to solve this problem.
arXiv Detail & Related papers (2025-10-16T18:06:46Z)
- Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization [5.674809920704963]
Latent Thought Policy Optimization (LTPO) enhances LLM reasoning entirely at test time. Experiments show that LTPO not only matches or surpasses strong baselines on standard tasks but also demonstrates remarkable robustness where others fail. Most notably, on highly challenging AIME benchmarks where existing latent reasoning baselines collapse to near-zero accuracy, LTPO delivers substantial improvements.
arXiv Detail & Related papers (2025-10-05T12:50:39Z)
- Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster [51.89995713333108]
Chain-of-thought (CoT) distillation allows a large language model (LLM) to guide a small language model (SLM) in reasoning tasks. Existing methods train the SLM to learn the long rationale in one iteration. We propose chunk-wise training (CWT), which uses a search to divide the rationale into internal semantically coherent chunks.
arXiv Detail & Related papers (2025-05-24T11:04:52Z)
- Misaligning Reasoning with Answers -- A Framework for Assessing LLM CoT Robustness [3.9930400744726273]
We design a novel evaluation framework, MATCHA, to investigate the relationship between answer and reasoning. In domains like education and healthcare, reasoning is key for model trustworthiness. Our results show that LLMs exhibit greater vulnerability to input perturbations for multi-step and commonsense tasks than for logical tasks.
arXiv Detail & Related papers (2025-05-23T02:42:16Z)
- Guiding Reasoning in Small Language Models with LLM Assistance [23.3038074903744]
Doubts remain about the suitability of Small Language Models (SLMs) for tasks demanding deep, multi-step logical deduction.
This paper introduces a framework called Small Reasons, Large Hints, which selectively augments SLM reasoning with targeted guidance from large language models.
Our experiments on mathematical reasoning datasets demonstrate that targeted external scaffolding significantly improves performance.
arXiv Detail & Related papers (2025-04-14T06:32:45Z)
- "Well, Keep Thinking": Enhancing LLM Reasoning with Adaptive Injection Decoding [4.008780119020479]
Large language models (LLMs) exhibit strong reasoning abilities, often attributed to few-shot or zero-shot chain-of-thought (CoT) prompting.
We propose a novel decoding strategy that systematically nudges LLMs to continue reasoning, thereby preventing immature reasoning processes.
arXiv Detail & Related papers (2025-03-13T08:46:32Z)
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching [60.04718679054704]
Chain-of-Thought prompting elicits step-by-step problem solving, but often at the cost of excessive verbosity in intermediate outputs. We propose Sketch-of-Thought (SoT), a prompting framework that integrates cognitively inspired reasoning paradigms with linguistic constraints. SoT achieves token reductions of up to 78% with minimal accuracy loss across 15 reasoning datasets.
arXiv Detail & Related papers (2025-03-07T06:57:17Z)
- Self-Training Elicits Concise Reasoning in Large Language Models [23.475414693530965]
Chain-of-thought (CoT) reasoning has enabled large language models (LLMs) to utilize additional computation through intermediate tokens. We propose simple fine-tuning methods which leverage self-generated concise reasoning paths. Our method achieves a 30% reduction in output tokens across five model families on GSM8K and MATH, while maintaining average accuracy.
arXiv Detail & Related papers (2025-02-27T14:14:50Z)
- InductionBench: LLMs Fail in the Simplest Complexity Class [53.70978746199222]
Large language models (LLMs) have shown remarkable improvements in reasoning.
Inductive reasoning, where one infers the underlying rules from observed data, remains less explored.
We introduce InductionBench, a new benchmark designed to evaluate the inductive reasoning ability of LLMs.
arXiv Detail & Related papers (2025-02-20T03:48:00Z)
- CRANE: Reasoning with constrained LLM generation [5.971462597321995]
We propose a reasoning-augmented constrained decoding algorithm, CRANE, which balances correctness of constrained generation with flexibility of unconstrained generation. CRANE significantly outperforms both state-of-the-art constrained decoding strategies and standard unconstrained decoding.
arXiv Detail & Related papers (2025-02-13T08:23:42Z)
- Efficient Reasoning with Hidden Thinking [48.96945580741641]
Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities. We propose Heima (as hidden llama), an efficient reasoning framework that leverages reasoning CoTs in hidden latent space. The Heima model achieves higher generation efficiency while maintaining or even improving zero-shot task accuracy.
arXiv Detail & Related papers (2025-01-31T15:10:29Z)
- Training Large Language Models to Reason in a Continuous Latent Space [84.5618790930725]
We introduce a new paradigm, Coconut (Chain of Continuous Thought), to explore the potential of large language models (LLMs) reasoning in an unrestricted latent space. Experiments show that Coconut can effectively augment the LLM on several reasoning tasks. These findings demonstrate the promise of latent reasoning and offer valuable insights for future research.
arXiv Detail & Related papers (2024-12-09T18:55:56Z)
- Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding [74.31981011985681]
Large language models (LLMs) have shown impressive capabilities, but still struggle with complex reasoning tasks requiring multiple steps.
We introduce LaTent Reasoning Optimization (LaTRO), a principled framework that formulates reasoning as sampling from a latent distribution.
We validate LaTRO through experiments on GSM8K and ARC-Challenge datasets using multiple model architectures.
arXiv Detail & Related papers (2024-11-06T22:02:30Z)
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding the decoding process of LLMs with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
- Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment [32.12998469814097]
A novel causal prompting method based on front-door adjustment is proposed to effectively mitigate the biases of Large Language Models (LLMs). Experimental results show that the proposed causal prompting approach achieves excellent performance across seven natural language processing datasets.
arXiv Detail & Related papers (2024-03-05T07:47:34Z)
- Chain-of-Thought Reasoning Without Prompting [40.92854235219315]
CoT reasoning paths can be elicited from pre-trained language models by simply altering the decoding process.
The presence of a CoT in the decoding path correlates with a higher confidence in the model's decoded answer.
arXiv Detail & Related papers (2024-02-15T18:55:41Z)
- Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning [74.90592233107712]
We propose a Direct-Indirect Reasoning (DIR) method, which considers Direct Reasoning (DR) and Indirect Reasoning (IR) as multiple parallel reasoning paths that are merged to derive the final answer. Our DIR method is simple yet effective and can be straightforwardly integrated with existing variants of CoT methods.
arXiv Detail & Related papers (2024-02-06T03:41:12Z)
- Are LLMs Rigorous Logical Reasoner? Empowering Natural Language Proof Generation with Contrastive Stepwise Decoding [10.421832675327712]
We introduce contrastive decoding to stepwise proof generation, making use of negative reasoning paths to strengthen the model's capacity for logical deduction. Experiments on EntailmentBank underscore the success of our method in augmenting the proof planning abilities of language models.
arXiv Detail & Related papers (2023-11-12T05:12:49Z)
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.