CoT-X: An Adaptive Framework for Cross-Model Chain-of-Thought Transfer and Optimization
- URL: http://arxiv.org/abs/2511.05747v1
- Date: Fri, 07 Nov 2025 22:35:31 GMT
- Title: CoT-X: An Adaptive Framework for Cross-Model Chain-of-Thought Transfer and Optimization
- Authors: Ziqian Bi, Kaijie Chen, Tianyang Wang, Junfeng Hao, Xinyuan Song,
- Abstract summary: Chain-of-Thought (CoT) reasoning enhances the problem-solving ability of large language models (LLMs) but leads to substantial inference overhead. This paper investigates efficient CoT transfer across models of different scales and architectures through an adaptive reasoning summarization framework.
- Score: 5.857877898558651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chain-of-Thought (CoT) reasoning enhances the problem-solving ability of large language models (LLMs) but leads to substantial inference overhead, limiting deployment in resource-constrained settings. This paper investigates efficient CoT transfer across models of different scales and architectures through an adaptive reasoning summarization framework. The proposed method compresses reasoning traces via semantic segmentation with importance scoring, budget-aware dynamic compression, and coherence reconstruction, preserving critical reasoning steps while significantly reducing token usage. Experiments on 7,501 medical examination questions across 10 specialties show up to 40% higher accuracy than truncation under the same token budgets. Evaluations on 64 model pairs from eight LLMs (1.5B-32B parameters, including DeepSeek-R1 and Qwen3) confirm strong cross-model transferability. Furthermore, a Gaussian Process-based Bayesian optimization module reduces evaluation cost by 84% and reveals a power-law relationship between model size and cross-domain robustness. These results demonstrate that reasoning summarization provides a practical path toward efficient CoT transfer, enabling advanced reasoning under tight computational constraints. Code will be released upon publication.
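The contrast the abstract draws between budget-aware compression and plain truncation can be illustrated with a toy sketch. This is not the paper's method: the sentence splitter, the cue-word scoring in `score_step`, the word-count token proxy, and all function names are assumptions made for illustration only, and "coherence reconstruction" is reduced here to restoring the original step order.

```python
import re

# Hypothetical cue words; the paper's importance scoring is not specified here.
CUES = ("therefore", "because", "so", "thus", "hence", "answer")


def split_steps(trace: str) -> list[str]:
    """Split a trace into sentences (a crude stand-in for semantic segmentation)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", trace) if s.strip()]


def score_step(step: str) -> float:
    """Toy importance score: count cue words, plus 1 if the step contains a number."""
    words = re.findall(r"[a-z0-9]+", step.lower())
    cue_hits = sum(w in CUES for w in words)
    has_number = any(w.isdigit() for w in words)
    return cue_hits + (1.0 if has_number else 0.0)


def compress(trace: str, budget: int) -> str:
    """Greedily keep the highest-scoring steps that fit a word budget."""
    steps = split_steps(trace)
    costs = [len(s.split()) for s in steps]  # words as a rough token proxy
    # Highest score first; ties resolved by original position.
    order = sorted(range(len(steps)), key=lambda i: (-score_step(steps[i]), i))
    kept, used = [], 0
    for i in order:
        if used + costs[i] <= budget:
            kept.append(i)
            used += costs[i]
    # Emit the kept steps in their original order so the summary stays readable.
    return " ".join(steps[i] for i in sorted(kept))


def truncate(trace: str, budget: int) -> str:
    """Baseline: keep only the first `budget` words."""
    return " ".join(trace.split()[:budget])


if __name__ == "__main__":
    trace = (
        "The patient is a 60 year old man. He likes tea. "
        "Troponin is elevated because of myocardial injury. "
        "Therefore the answer is myocardial infarction."
    )
    print("compressed:", compress(trace, 13))
    print("truncated: ", truncate(trace, 13))
```

Under the same 13-word budget, the selection-based version keeps the two steps that carry the conclusion, while truncation cuts the trace off mid-sentence before the conclusion is reached. A real system would replace the cue-word heuristic with a learned or model-based importance score and count tokens with the target model's tokenizer.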
Related papers
- Towards Efficient Large Language Reasoning Models via Extreme-Ratio Chain-of-Thought Compression [55.63153956934198]
Chain-of-Thought (CoT) reasoning successfully enhances the reasoning capabilities of Large Language Models (LLMs). Existing CoT compression methods often suffer from a critical loss of logical fidelity at high compression ratios. We propose a novel EXTreme-RAtio Chain-of-Thought Compression framework, termed Extra-CoT, which aggressively reduces the token budget while preserving answer accuracy.
arXiv Detail & Related papers (2026-02-09T06:57:15Z)
- ConMax: Confidence-Maximizing Compression for Efficient Chain-of-Thought Reasoning [46.481679150652205]
Large Reasoning Models generate redundant reasoning paths that inflate computational costs without improving accuracy. In this paper, we introduce ConMax, a novel reinforcement learning framework designed to automatically compress reasoning traces. Experiments across five reasoning datasets demonstrate that ConMax achieves a superior efficiency-performance trade-off.
arXiv Detail & Related papers (2026-01-08T14:22:58Z)
- Correct, Concise and Complete: Multi-stage Training For Adaptive Reasoning [11.179446105672461]
We propose a multi-stage efficient reasoning method that combines supervised fine-tuning and reinforcement learning. Our approach reduces response length by an average of 28% for 8B models and 40% for 32B models. It achieves a superior trade-off compared to more complex state-of-the-art efficient reasoning methods.
arXiv Detail & Related papers (2026-01-06T12:31:51Z)
- RaCoT: Plug-and-Play Contrastive Example Generation Mechanism for Enhanced LLM Reasoning Reliability [12.67288560758937]
We propose RaCoT (Retrieval-aware Contrastive-of-Thought), a novel framework that shifts contrastive thinking to the pre-retrieval stage. RaCoT guides the model to proactively focus on the critical details that determine answer divergence.
arXiv Detail & Related papers (2025-10-26T15:06:44Z)
- Teaching Language Models to Reason with Tools [73.21700643314917]
We present Hint-Engineering, a new data synthesis strategy that strategically injects diverse hints at optimal points within reasoning paths. CoRT significantly enhances efficiency, reducing token usage by approximately 30% for the 32B model and 50% for the 1.5B model.
arXiv Detail & Related papers (2025-10-23T08:41:44Z)
- DeepPrune: Parallel Scaling without Inter-trace Redundancy [53.62015294143274]
Over 80% of parallel reasoning traces yield identical final answers, representing substantial wasted computation. We propose DeepPrune, a novel framework that enables efficient parallel scaling through dynamic pruning. Our work establishes a new standard for efficient parallel reasoning, making high-performance reasoning more efficient.
arXiv Detail & Related papers (2025-10-09T17:24:54Z)
- Think Right: Learning to Mitigate Under-Over Thinking via Adaptive, Attentive Compression [68.69801176669843]
We propose an online post-training RL method that prunes redundant steps and estimates difficulty. TRAAC (Think Right with Adaptive, Attentive Compression) achieves an average absolute accuracy gain of 8.4%. Although our models are trained on math datasets, they show accuracy and efficiency gains on out-of-distribution non-math datasets.
arXiv Detail & Related papers (2025-10-02T02:00:20Z)
- R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning [80.104336426172]
Chain-of-thought (CoT) enhances problem-solving ability of large language models. CoT incurs substantial inference cost due to long autoregressive trajectories. We introduce R-Stitch, a training-free hybrid decoding framework.
arXiv Detail & Related papers (2025-07-23T08:14:36Z)
- ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation [74.37307916314407]
We propose a framework dubbed ConciseHint, which continuously encourages the reasoning model to speak concisely. Experiments on the state-of-the-art LRMs, including DeepSeek-R1 and Qwen-3 series, demonstrate that our method can effectively produce concise reasoning.
arXiv Detail & Related papers (2025-06-23T16:20:44Z)
- TrimR: Verifier-based Training-Free Thinking Compression for Efficient Test-Time Scaling [20.980976778470247]
Large Reasoning Models (LRMs) demonstrate exceptional capability in tackling complex mathematical, logical, and coding tasks. We propose TrimR, a verifier-based, training-free, efficient framework for dynamic Chain-of-Thought (CoT) compression.
arXiv Detail & Related papers (2025-05-22T12:23:30Z)
- Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling. We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z)
- AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning [30.265984245328124]
Chain-of-Thought prompting indiscriminately generates lengthy reasoning steps for all queries. AdaCoT (Adaptive Chain-of-Thought) is a novel framework enabling LLMs to adaptively decide when to invoke CoT. A key technical contribution is Selective Loss Masking (SLM), designed to counteract decision boundary collapse.
arXiv Detail & Related papers (2025-05-17T08:27:00Z)
- Learning Adaptive Parallel Reasoning with Language Models [70.1745752819628]
We propose Adaptive Parallel Reasoning (APR), a novel reasoning framework that enables language models to orchestrate both serialized and parallel computations end-to-end. APR generalizes existing reasoning methods by enabling adaptive multi-threaded inference using spawn() and join() operations. A key innovation is our end-to-end reinforcement learning strategy, optimizing both parent and child inference threads to enhance task success rate without requiring predefined reasoning structures.
arXiv Detail & Related papers (2025-04-21T22:29:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.