CoT-Valve: Length-Compressible Chain-of-Thought Tuning
- URL: http://arxiv.org/abs/2502.09601v1
- Date: Thu, 13 Feb 2025 18:52:36 GMT
- Title: CoT-Valve: Length-Compressible Chain-of-Thought Tuning
- Authors: Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang
- Abstract summary: We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths.
We show that CoT-Valve successfully enables controllability and compressibility of the chain and shows better performance than the prompt-based control.
- Score: 50.196317781229496
- Abstract: Chain-of-Thought significantly enhances a model's reasoning capability, but it also comes with a considerable increase in inference costs due to long chains. With the observation that the reasoning path can be easily compressed on easy tasks but is harder to compress on hard tasks, we explore the feasibility of elastically controlling the length of reasoning paths with only one model, thereby reducing the inference overhead of reasoning models dynamically based on task difficulty. We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths. To achieve this, we propose to identify a direction in the parameter space that, when manipulated, can effectively control the length of generated CoT. Moreover, we show that this property is valuable for compressing the reasoning chain. We construct datasets with chains from long to short for the same questions and explore two enhanced strategies for CoT-Valve: (1) a precise length-compressible CoT tuning method, and (2) a progressive chain length compression approach. Our experiments show that CoT-Valve successfully enables controllability and compressibility of the chain and shows better performance than prompt-based control. We applied this method to QwQ-32B-Preview, reducing reasoning chains on GSM8K from 741 to 225 tokens with a minor performance drop (95.07% to 94.92%) and on AIME from 6827 to 4629 tokens, with only one additional incorrect answer.
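The idea of moving along a direction in parameter space can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the direction is obtained, so the sketch assumes it is the difference between two hypothetical checkpoints (one tuned on long chains, one on short chains). The names `valve`, `theta_long`, `theta_short`, and `alpha` are illustrative only.

```python
import numpy as np

def valve(theta_base, delta, alpha):
    """Move every parameter tensor a fraction `alpha` along the direction `delta`."""
    return {name: theta_base[name] + alpha * delta[name] for name in theta_base}

# Toy stand-ins for two checkpoints of the same model (hypothetical shapes and values).
rng = np.random.default_rng(0)
theta_long = {"w": rng.normal(size=(4, 4)), "b": rng.normal(size=4)}
theta_short = {name: p + 0.1 * rng.normal(size=p.shape) for name, p in theta_long.items()}

# Assumed direction: difference between the short-chain and long-chain parameters.
delta = {name: theta_short[name] - theta_long[name] for name in theta_long}

# alpha = 0 recovers the long-chain parameters, alpha = 1 the short-chain ones;
# values in between interpolate along the same direction.
for alpha in (0.0, 0.5, 1.0):
    theta = valve(theta_long, delta, alpha)
    dist = np.linalg.norm(theta["w"] - theta_long["w"])
    print(f"alpha={alpha:.1f}  distance from long-chain weights: {dist:.4f}")
```

In this toy setting a single scalar `alpha` selects a point on the line between the two parameter sets, which is the sense in which one model could expose a length "valve". Whether intermediate values yield intermediate chain lengths is an empirical claim of the paper, not something this sketch demonstrates.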
Related papers
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs [11.583847083770031]
Chain-of-Thought (CoT) has been proven effective in enhancing the reasoning capabilities of large language models (LLMs).
We propose TokenSkip, a simple yet effective approach that enables LLMs to selectively skip less important tokens, allowing for controllable CoT compression.
arXiv Detail & Related papers (2025-02-17T17:37:26Z)
- When More is Less: Understanding Chain-of-Thought Length in LLMs [53.77747102201451]
Chain-of-thought (CoT) reasoning enhances the multi-step reasoning capabilities of large language models (LLMs).
However, for most models and tasks, does an increase in CoT length consistently lead to improved reasoning accuracy?
In this paper, we observe a nuanced relationship: as the number of reasoning steps increases, performance initially improves but eventually decreases.
arXiv Detail & Related papers (2025-02-11T05:28:59Z)
- C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness [18.073777359647515]
Generating Chain-of-Thought (CoT) before deriving the answer can improve the reasoning capabilities of large language models (LLMs).
However, the length of the generated CoT is much longer than the desired final answer, which results in additional decoding costs.
This paper presents a CoT compression framework that involves a compressor to compress an original longer CoT into a shorter CoT.
arXiv Detail & Related papers (2024-12-16T11:12:45Z)
- Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis [82.51626700527837]
Chain-of-Thought (CoT) is an efficient method that enables the reasoning ability of large language models by augmenting the query using examples with multiple intermediate steps.
We show that despite the theoretical success of CoT, it fails to provide an accurate generalization when CoT does.
arXiv Detail & Related papers (2024-10-03T03:12:51Z)
- A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning [48.51969964676017]
Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance for large language models.
We propose a Read-and-Control approach for controlling the accuracy of CoT.
arXiv Detail & Related papers (2024-06-18T04:07:13Z)
- Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering [59.495717939664246]
Large language models have manifested remarkable capabilities by leveraging chain-of-thought (CoT) reasoning techniques to solve intricate questions.
We propose a novel approach called the selective filtering reasoner (SelF-Reasoner) that assesses the entailment relationship between the question and the candidate reasoning chain.
SelF-Reasoner improves the fine-tuned T5 baseline consistently over the ScienceQA, ECQA, and LastLetter tasks.
arXiv Detail & Related papers (2024-03-28T06:28:35Z)
- ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs).
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z)
- Training Chain-of-Thought via Latent-Variable Inference [30.21067593018967]
Large language models (LLMs) solve problems more accurately and interpretably when instructed to work out the answer step by step using a "chain-of-thought" prompt.
Naively combining CoT with supervised tuning requires supervision not just of the correct answers, but also of detailed rationales that lead to those answers.
We propose a fine-tuning strategy that tries to maximize the marginal log-likelihood of generating a correct answer using CoT prompting.
arXiv Detail & Related papers (2023-11-28T17:47:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.