Long or short CoT? Investigating Instance-level Switch of Large Reasoning Models
- URL: http://arxiv.org/abs/2506.04182v1
- Date: Wed, 04 Jun 2025 17:28:38 GMT
- Title: Long or short CoT? Investigating Instance-level Switch of Large Reasoning Models
- Authors: Ruiqi Zhang, Changyi Xiao, Yixin Cao
- Abstract summary: Chain-of-Thought (CoT) prompting has demonstrated strong performance on complex tasks. Long CoT can lead to performance improvements, but its benefits are often marginal relative to its significantly higher token consumption. We propose SwitchCoT, an automatic framework that adaptively chooses between long and short CoT strategies to balance reasoning accuracy and computational efficiency.
- Score: 11.257865157523446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid advancement of large reasoning models, long Chain-of-Thought (CoT) prompting has demonstrated strong performance on complex tasks. However, this often comes with a significant increase in token usage. In this paper, we conduct a comprehensive empirical analysis comparing long and short CoT strategies. Our findings reveal that while long CoT can lead to performance improvements, its benefits are often marginal relative to its significantly higher token consumption. Specifically, long CoT tends to outperform when ample generation budgets are available, whereas short CoT is more effective under tighter budget constraints. These insights underscore the need for a dynamic approach that selects the proper CoT strategy based on task context and resource availability. To address this, we propose SwitchCoT, an automatic framework that adaptively chooses between long and short CoT strategies to balance reasoning accuracy and computational efficiency. Moreover, SwitchCoT is designed to be budget-aware, making it broadly applicable across scenarios with varying resource constraints. Experimental results demonstrate that SwitchCoT can reduce inference costs by up to 50% while maintaining high accuracy. Notably, under limited token budgets, it achieves performance comparable to, or even exceeding, that of using either long or short CoT alone.
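As a rough illustration of the budget-aware switching idea described in the abstract, the sketch below implements a hypothetical instance-level selector in Python: it falls back to short CoT whenever the token budget cannot absorb a long reasoning trace, and otherwise uses a toy difficulty estimate to decide whether long CoT is worth the extra tokens. The prompt prefixes, the `estimate_difficulty` heuristic, and the threshold values are illustrative assumptions; the abstract does not describe SwitchCoT's actual selection mechanism.

```python
# Hypothetical sketch of an instance-level long/short CoT switch.
# The decision rule (budget threshold plus a toy difficulty estimate)
# is an assumption for illustration, not the paper's actual method.

from dataclasses import dataclass

LONG_COT_PROMPT = "Let's reason through this step by step in detail:\n"
SHORT_COT_PROMPT = "Answer concisely with brief reasoning:\n"


@dataclass
class SwitchDecision:
    strategy: str        # "long" or "short"
    prompt_prefix: str   # prefix to prepend to the question


def estimate_difficulty(question: str) -> float:
    """Toy proxy for instance difficulty (placeholder heuristic)."""
    return min(1.0, len(question.split()) / 100.0)


def choose_cot_strategy(question: str, token_budget: int,
                        long_cot_cost: int = 1024,
                        difficulty_threshold: float = 0.5) -> SwitchDecision:
    """Pick long CoT only when the budget can absorb it and the instance
    looks hard enough to benefit; otherwise default to short CoT."""
    if (token_budget >= long_cot_cost
            and estimate_difficulty(question) >= difficulty_threshold):
        return SwitchDecision("long", LONG_COT_PROMPT)
    return SwitchDecision("short", SHORT_COT_PROMPT)


if __name__ == "__main__":
    q = "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"
    print(choose_cot_strategy(q, token_budget=2048))  # ample budget
    print(choose_cot_strategy(q, token_budget=256))   # tight budget -> short CoT
```

In this toy design the budget check comes first, mirroring the abstract's observation that short CoT is preferable under tight budget constraints regardless of instance difficulty.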
Related papers
- R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning [60.37610817226533]
Chain-of-thought (CoT) reasoning encourages step-by-step intermediate reasoning during inference. CoT introduces substantial computational overhead due to its reliance on autoregressive decoding over long token sequences. We present R-Stitch, a token-level, confidence-based hybrid decoding framework that accelerates CoT inference.
arXiv Detail & Related papers (2025-07-23T08:14:36Z) - QFFT, Question-Free Fine-Tuning for Adaptive Reasoning [46.60300066127707]
Question-Free Fine-Tuning (QFFT) is a fine-tuning approach that removes the input question during training and learns exclusively from Long CoT responses. QFFT reduces average response length by more than 50%, while achieving performance comparable to Supervised Fine-Tuning (SFT).
arXiv Detail & Related papers (2025-06-15T14:21:28Z) - Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models [9.282278040339138]
The Chain-of-Thought (CoT) technique has proven effective in improving the performance of large language models (LLMs) on complex reasoning tasks. We make a preliminary observation that the monotonicity of token probability distributions may be correlated with the gains achieved through CoT reasoning. We propose two indicators based on the token probability distribution to assess CoT effectiveness across different tasks.
arXiv Detail & Related papers (2025-06-06T11:53:27Z) - Efficient Long CoT Reasoning in Small Language Models [26.579760423359673]
It is challenging to directly train small language models (SLMs) to produce long chain-of-thought (CoT) reasoning steps. We propose a simple yet effective method to prune unnecessary steps in long CoT, and then employ an on-policy method for the SLM itself to curate valid and useful long CoT training data.
arXiv Detail & Related papers (2025-05-24T00:22:52Z) - AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning [30.265984245328124]
Chain-of-Thought prompting indiscriminately generates lengthy reasoning steps for all queries. AdaCoT (Adaptive Chain-of-Thought) is a novel framework enabling LLMs to adaptively decide when to invoke CoT. A key technical contribution is Selective Loss Masking (SLM), designed to counteract decision boundary collapse.
arXiv Detail & Related papers (2025-05-17T08:27:00Z) - Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering [59.34894142132706]
Existing work finds that the capability of long CoT reasoning can be efficiently elicited by tuning on only a few examples. This motivates us to investigate whether long CoT reasoning is a general capability for LLMs. We propose GLoRE, a novel representation engineering method to unleash the general long CoT reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-03-14T11:30:37Z) - CoT-Valve: Length-Compressible Chain-of-Thought Tuning [50.196317781229496]
We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths. We show that CoT-Valve successfully enables controllability and compressibility of the chain and achieves better performance than prompt-based control.
arXiv Detail & Related papers (2025-02-13T18:52:36Z) - When More is Less: Understanding Chain-of-Thought Length in LLMs [51.631483479081645]
Large Language Models (LLMs) employ Chain-of-Thought (CoT) reasoning to deconstruct complex problems. While longer CoTs are often presumed superior, this paper argues that longer is not always better.
arXiv Detail & Related papers (2025-02-11T05:28:59Z) - C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness [18.073777359647515]
Generating a Chain-of-Thought (CoT) before deriving the answer can improve the reasoning capabilities of large language models (LLMs). However, the generated CoT is much longer than the desired final answer, which results in additional decoding costs. This paper presents a CoT compression framework that uses a compressor to compress an original longer CoT into a shorter one.
arXiv Detail & Related papers (2024-12-16T11:12:45Z) - To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning [55.52872152909785]
Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs). We show that CoT gives strong performance benefits primarily on tasks involving math or logic, with much smaller gains on other types of tasks.
arXiv Detail & Related papers (2024-09-18T17:55:00Z)