When More is Less: Understanding Chain-of-Thought Length in LLMs
- URL: http://arxiv.org/abs/2502.07266v1
- Date: Tue, 11 Feb 2025 05:28:59 GMT
- Title: When More is Less: Understanding Chain-of-Thought Length in LLMs
- Authors: Yuyang Wu, Yifei Wang, Tianqi Du, Stefanie Jegelka, Yisen Wang,
- Abstract summary: Chain-of-thought (CoT) reasoning enhances the multi-step reasoning capabilities of large language models (LLMs)
However, for most models and tasks, does an increase in CoT length consistently lead to improved reasoning accuracy?
In this paper, we observe a nuanced relationship: as the number of reasoning steps increases, performance initially improves but eventually decreases.
- Score: 53.77747102201451
- License:
- Abstract: Chain-of-thought (CoT) reasoning enhances the multi-step reasoning capabilities of large language models (LLMs) by breaking complex tasks into smaller, manageable sub-tasks. Researchers have been exploring ways to guide models to generate more complex CoT processes to improve the reasoning ability of LLMs, such as long CoT and the test-time scaling law. However, for most models and tasks, does an increase in CoT length consistently lead to improved reasoning accuracy? In this paper, we observe a nuanced relationship: as the number of reasoning steps increases, performance initially improves but eventually decreases. To understand this phenomenon, we provide a piece of evidence that longer reasoning processes are increasingly susceptible to noise. We theoretically prove the existence of an optimal CoT length and derive a scaling law for this optimal length based on model capability and task difficulty. Inspired by our theory, we conduct experiments on both synthetic and real world datasets and propose Length-filtered Vote to alleviate the effects of excessively long or short CoTs. Our findings highlight the critical need to calibrate CoT length to align with model capabilities and task demands, offering a principled framework for optimizing multi-step reasoning in LLMs.
Related papers
- Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models [56.37421741507468]
Chain-of-Thought (CoT) reasoning has significantly enhanced the performance of large language models (LLMs)
We propose a method to identify critical reasoning steps using perplexity as a measure of their importance.
arXiv Detail & Related papers (2025-02-18T20:04:51Z) - SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs [48.28847964704554]
Chain-of-Thought (CoT) reasoning enables Large Language Models (LLMs) to solve complex reasoning tasks.
We propose a novel approach for continuous-space reasoning that does not require modifying the underlying LLM.
arXiv Detail & Related papers (2025-02-17T18:52:29Z) - LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! [53.84130385074551]
Large reasoning models (LRMs) tackle complex reasoning problems by following long chain-of-thoughts (Long CoT)
We find that a Large Language model (LLM) can effectively learn Long CoT reasoning through data-efficient supervised fine-tuning (SFT) and parameter-efficient low-rank adaptation (LoRA)
With just 17k long CoT training samples, the Qwen2.5-32B-Instruct model achieves significant improvements on a wide range of math and coding benchmarks.
arXiv Detail & Related papers (2025-02-11T08:48:48Z) - Demystifying Long Chain-of-Thought Reasoning in LLMs [46.352406501403465]
Long chains-of-thought (CoTs) enable strategies like backtracking and error correction.
Reinforcement learning (RL) has emerged as a crucial method for developing these capabilities.
We identify the key factors that enable models to generate long CoT trajectories.
arXiv Detail & Related papers (2025-02-05T17:13:32Z) - O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning [98.3430004984531]
We propose Length-Harmonizing Fine-Tuning (O1-Pruner) to minimize reasoning overhead while maintaining accuracy.
Our code is coming soon at https://github.com/StarDewXXX/O1-Pruner.
arXiv Detail & Related papers (2025-01-22T01:35:11Z) - Markov Chain of Thought for Efficient Mathematical Reasoning [10.678633785012691]
Chain of Thought (CoT) of multi-step benefits from the logical structure of the reasoning steps and task-specific actions.
We conceptualize the standard multi-step CoT as a novel Markov Chain of Thought (MCoT)
arXiv Detail & Related papers (2024-10-23T07:53:29Z) - Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs [37.147529569445396]
Tree-of-thought (ToT) method employs tree-searching to extensively explore the reasoning space and find better reasoning paths that CoT decoding might overlook.
Fine-tuning language models (LLMs) leveraging the search tree constructed by ToT allows CoT to achieve similar or better performance.
This is achieved through Chain of Preference Optimization (CPO), where LLMs are fine-tuned to align each step of the CoT reasoning paths with those of ToT.
arXiv Detail & Related papers (2024-06-13T14:07:02Z) - The Impact of Reasoning Step Length on Large Language Models [40.546685248243534]
Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models.
We investigate the correlation between the effectiveness of CoT and the length of reasoning steps in prompts.
arXiv Detail & Related papers (2024-01-10T04:37:38Z) - Towards Understanding Chain-of-Thought Prompting: An Empirical Study of
What Matters [82.84696222087396]
Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs)
We show that CoT reasoning is possible even with invalid demonstrations.
arXiv Detail & Related papers (2022-12-20T05:20:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.