Bridging Efficiency and Transparency: Explainable CoT Compression in Multimodal Large Reasoning Models
- URL: http://arxiv.org/abs/2602.09485v1
- Date: Tue, 10 Feb 2026 07:29:50 GMT
- Title: Bridging Efficiency and Transparency: Explainable CoT Compression in Multimodal Large Reasoning Models
- Authors: Yizhi Wang, Linan Yue, Min-Ling Zhang
- Abstract summary: Long chains of thought (Long CoTs) are widely employed in multimodal reasoning models to tackle complex tasks by capturing visual information. Existing approaches face two major challenges: (1) they may compromise the integrity of visual-textual reasoning by removing essential alignment cues, and (2) the compression process lacks explainability. We propose XMCC, an eXplainable Multimodal CoT Compressor that formulates compression as a sequential decision-making process optimized via reinforcement learning.
- Score: 64.34737012956182
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Long chains of thought (Long CoTs) are widely employed in multimodal reasoning models to tackle complex tasks by capturing detailed visual information. However, these Long CoTs are often excessively lengthy and contain redundant reasoning steps, which can hinder inference efficiency. Compressing these Long CoTs is a natural solution, yet existing approaches face two major challenges: (1) they may compromise the integrity of visual-textual reasoning by removing essential alignment cues, and (2) the compression process lacks explainability, making it difficult to discern which information is critical. To address these problems, we propose XMCC, an eXplainable Multimodal CoT Compressor that formulates compression as a sequential decision-making process optimized via reinforcement learning. XMCC can effectively shorten reasoning trajectories while preserving key reasoning steps and answer correctness, and simultaneously generates natural-language explanations for its compression decisions. Extensive experiments on representative multimodal reasoning benchmarks demonstrate that XMCC not only reduces reasoning length but also provides explanations for its compression decisions, validating its effectiveness.
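The abstract frames compression as a sequential decision-making process that also yields an explanation for each decision. The following is a minimal sketch of that framing only, not the XMCC method: the abstract does not specify the states, actions, or reward, so the keep/drop action space, the `Decision` record, the `toy_policy` rule, and the `length_weight` reward term are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Decision:
    step: str
    keep: bool
    explanation: str  # natural-language reason for the keep/drop action


# Hypothetical policy signature: (current step, steps kept so far) -> (keep?, reason)
Policy = Callable[[str, List[str]], Tuple[bool, str]]


def rollout(steps: List[str], policy: Policy) -> Tuple[List[str], List[Decision]]:
    """Visit reasoning steps in order and apply one keep/drop decision per step."""
    kept: List[str] = []
    trace: List[Decision] = []
    for step in steps:
        keep, why = policy(step, kept)
        trace.append(Decision(step, keep, why))
        if keep:
            kept.append(step)
    return kept, trace


def reward(n_kept: int, n_total: int, answer_correct: bool,
           length_weight: float = 0.5) -> float:
    """Assumed reward: correctness bonus minus a penalty on the kept fraction."""
    if n_total == 0:
        raise ValueError("n_total must be positive")
    correctness = 1.0 if answer_correct else 0.0
    return correctness - length_weight * (n_kept / n_total)


def toy_policy(step: str, kept: List[str]) -> Tuple[bool, str]:
    """Hand-written stand-in for a learned policy: drop verbatim repeats."""
    if step in kept:
        return False, "repeats an earlier step"
    return True, "adds new information"


steps = [
    "The chart shows 3 bars.",
    "The tallest bar is 7.",
    "The chart shows 3 bars.",
    "Answer: 7",
]
kept, trace = rollout(steps, toy_policy)
for d in trace:
    print("KEEP" if d.keep else "DROP", "|", d.step, "|", d.explanation)
```

In a reinforcement-learning setup, a scalar such as `reward(len(kept), len(steps), answer_correct)` would be the training signal for the policy. The hand-written rule here only stands in for such a learned policy.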
Related papers
- Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning [39.72119774004103]
Chain-of-Thought (CoT) has substantially empowered Large Language Models (LLMs) to tackle complex reasoning tasks. The verbose nature of explicit reasoning steps incurs prohibitive inference latency and computational costs, limiting real-world deployment. We propose Compress responses for Easy questions and Explore Hard ones (CEEH), a difficulty-aware approach to RL-based efficient reasoning.
arXiv Detail & Related papers (2026-02-26T05:47:30Z)
- Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning [62.680551162054975]
We introduce an end-to-end framework where LLMs learn to self-regulate the granularity of the reasoning steps through dynamic summarization. We apply reinforcement learning to incentivize this capability further, uncovering a critical insight: the accuracy gap between the highly efficient Fold mode and the exhaustive Unfold mode progressively narrows. Our Accordion-Thinker demonstrates that with learned self-compression, LLMs can tackle complex reasoning tasks with minimal dependency token overhead.
arXiv Detail & Related papers (2026-02-03T08:34:20Z)
- DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains [56.708381920156256]
Large Reasoning Models (LRMs) have demonstrated impressive capabilities but suffer from cognitive inefficiencies like "overthinking" simple problems and "underthinking" complex ones. This paper introduces DeepCompress, a novel framework that simultaneously enhances both the accuracy and efficiency of LRMs.
arXiv Detail & Related papers (2025-10-31T12:13:11Z)
- ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation [74.37307916314407]
We propose a framework dubbed ConciseHint, which continuously encourages the reasoning model to speak concisely. Experiments on the state-of-the-art LRMs, including DeepSeek-R1 and Qwen-3 series, demonstrate that our method can effectively produce concise reasoning.
arXiv Detail & Related papers (2025-06-23T16:20:44Z)
- Efficient Long CoT Reasoning in Small Language Models [26.579760423359673]
It is challenging to directly train small language models (SLMs) to produce long chain-of-thought (CoT) reasoning steps. We propose a simple yet effective method to prune unnecessary steps in long CoT, and then employ an on-policy method for the SLM itself to curate valid and useful long CoT training data.
arXiv Detail & Related papers (2025-05-24T00:22:52Z)
- Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling. We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z)
- ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning [1.0416697066889342]
We propose a simple yet effective reinforcement learning method that enables reasoning models to learn their own optimal CoT lengths without manual supervision. ShorterBetter achieves 50%-80% reduction in output lengths in both in-domain and out-of-domain reasoning tasks. Our reasoning trace analysis shows that ShorterBetter refines the structure of the reasoning traces by reducing unnecessary repetition, excessive self-verification, and over-exploration of alternatives.
arXiv Detail & Related papers (2025-04-30T07:04:19Z)
- When More is Less: Understanding Chain-of-Thought Length in LLMs [51.631483479081645]
Large Language Models (LLMs) employ Chain-of-Thought (CoT) reasoning to deconstruct complex problems. Although longer CoTs are often presumed superior, this paper argues that longer is not always better.
arXiv Detail & Related papers (2025-02-11T05:28:59Z)
- Efficient Reasoning with Hidden Thinking [48.96945580741641]
Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities. We propose Heima (as hidden llama), an efficient reasoning framework that leverages reasoning CoTs at hidden latent space. The Heima model achieves higher generation efficiency while maintaining or even improving zero-shot task accuracy.
arXiv Detail & Related papers (2025-01-31T15:10:29Z)
- Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding [14.175444025026508]
Large language models (LLMs) have demonstrated remarkable capabilities in tasks requiring chain-of-thought (CoT) prompting. However, generating the full CoT process results in significantly longer output sequences, leading to increased computational costs and latency during inference. We propose a novel approach to compress the CoT process through semantic alignment, enabling more efficient decoding while preserving the benefits of CoT reasoning.
arXiv Detail & Related papers (2024-09-13T06:29:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.