Related papers: Restoring Pruned Large Language Models via Lost Component Compensation

Restoring Pruned Large Language Models via Lost Component Compensation

URL: http://arxiv.org/abs/2510.21834v1
Date: Wed, 22 Oct 2025 10:08:18 GMT
Title: Restoring Pruned Large Language Models via Lost Component Compensation
Authors: Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Tianjiao Li, Jia Jim Deryl Chua, Lee Onn Mak, Gee Wah Ng, Kezhi Mao,
Abstract summary: Pruning is a widely used technique to reduce the size and inference cost of large language models (LLMs)<n>We propose a targeted restoration strategy for pruned models that restores performance while preserving their low cost and high efficiency.<n>Based on this insight, we introduce RestoreLCC, a plug-and-play method that contrastively probes critical attention heads via activation editing, extracts lost components from activation differences, and finally injects them back into the corresponding pruned heads for compensation and recovery.
Score: 30.687574116750337
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Pruning is a widely used technique to reduce the size and inference cost of large language models (LLMs), but it often causes performance degradation. To mitigate this, existing restoration methods typically employ parameter-efficient fine-tuning (PEFT), such as LoRA, to recover the pruned model's performance. However, most PEFT methods are designed for dense models and overlook the distinct properties of pruned models, often resulting in suboptimal recovery. In this work, we propose a targeted restoration strategy for pruned models that restores performance while preserving their low cost and high efficiency. We observe that pruning-induced information loss is reflected in attention activations, and selectively reintroducing components of this information can significantly recover model performance. Based on this insight, we introduce RestoreLCC (Restoring Pruned LLMs via Lost Component Compensation), a plug-and-play method that contrastively probes critical attention heads via activation editing, extracts lost components from activation differences, and finally injects them back into the corresponding pruned heads for compensation and recovery. RestoreLCC is compatible with structured, semi-structured, and unstructured pruning schemes. Extensive experiments demonstrate that RestoreLCC consistently outperforms state-of-the-art baselines in both general and task-specific performance recovery, without compromising the sparsity or inference efficiency of pruned models.

Related papers

Gradually Compacting Large Language Models for Reasoning Like a Boiling Frog [72.4168434368873]
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, but their substantial size often demands significant computational resources.<n>We propose a gradual compacting method that divides the compression process into multiple fine-grained iterations.<n>This iterative approach-reminiscent of the "boiling frog" effect-enables the model to be progressively compressed without abrupt performance loss.
arXiv Detail & Related papers (2026-02-04T06:56:52Z)
RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration [51.77917733024544]
latent diffusion models (LDMs) have improved the perceptual quality of All-in-One image Restoration (AiOR) methods.<n>LDMs suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications.<n>Visual autoregressive modeling ( VAR) performs scale-space autoregression and achieves comparable performance to that of state-of-the-art diffusion transformers.
arXiv Detail & Related papers (2025-05-23T15:52:26Z)
Any Image Restoration via Efficient Spatial-Frequency Degradation Adaptation [158.37640586809187]
Restoring any degraded image efficiently via just one model has become increasingly significant.<n>Our approach, termed AnyIR, takes a unified path that leverages inherent similarity across various degradations.<n>To fuse the degradation awareness and the contextualized attention, a spatial-frequency parallel fusion strategy is proposed.
arXiv Detail & Related papers (2025-04-19T09:54:46Z)
PASER: Post-Training Data Selection for Efficient Pruned Large Language Model Recovery [11.20326903218271]
Post-training techniques such as instruction tuning are commonly employed to recover model performance.<n>However, some irrelevant instructions may also introduce negative effects to model capacity recovery.<n>We propose textbfPost-training dtextbfAta textbfSelection method for textbfEfficient pruned large language model textbfRecovery (textbfPASER)
arXiv Detail & Related papers (2025-02-18T07:11:08Z)
DReSS: Data-driven Regularized Structured Streamlining for Large Language Models [30.47317140878219]
Large language models (LLMs) have achieved significant progress across various domains, but their increasing scale results in high computational and memory costs.<n>We propose a novel paradigm that first applies regularization, then prunes, and finally finetunes.<n>By leveraging a small amount of data to regularize the components to be pruned, DReSS explicitly transfers the important information to the remaining parts of the model in advance.
arXiv Detail & Related papers (2025-01-29T14:28:11Z)
Lightweight and Post-Training Structured Pruning for On-Device Large Lanaguage Models [11.93284417365518]
We introduce COMP, a lightweight post-training structured pruning method that employs a hybrid-granularity pruning strategy.<n> COMP improves performance by 6.13% on the LLaMA-2-7B model with a 20% pruning ratio compared to LLM-Pruner.
arXiv Detail & Related papers (2025-01-25T16:03:58Z)
UniRestorer: Universal Image Restoration via Adaptively Estimating Image Degradation at Proper Granularity [79.90839080916913]
We present our UniRestorer with improved restoration performance.<n>Specifically, we perform hierarchical clustering on degradation space, and train a multi-granularity mixture-of-experts (MoE) restoration model.<n>In contrast to existing degradation-agnostic and -aware methods, UniRestorer can leverage degradation estimation to benefit degradation specific restoration.
arXiv Detail & Related papers (2024-12-28T14:09:08Z)
Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding [67.57487747508179]
Multiple-in-one image restoration (IR) has made significant progress, aiming to handle all types of single degraded image restoration with a single model. In this paper, we propose a novel multiple-in-one IR model that can effectively restore images with both single and mixed degradations.
arXiv Detail & Related papers (2024-11-25T09:26:34Z)
Reconstruct the Pruned Model without Any Retraining [23.235907813011174]
We introduce the Linear Interpolation-based Adaptive Reconstruction (LIAR) framework, which is both efficient and effective. LIAR does not require back-propagation or retraining and is compatible with various pruning criteria and modules. Our evaluations on benchmarks such as GLUE, SQuAD, WikiText, and common sense reasoning show that LIAR enables a BERT model to maintain 98% accuracy even after removing 50% of its parameters.
arXiv Detail & Related papers (2024-07-18T09:30:44Z)
Boosting Image Restoration via Priors from Pre-trained Models [54.83907596825985]
We learn an additional lightweight module called Pre-Train-Guided Refinement Module (PTG-RM) to refine restoration results of a target restoration network with OSF. PTG-RM effectively enhances restoration performance of various models across different tasks, including low-light enhancement, deraining, deblurring, and denoising.
arXiv Detail & Related papers (2024-03-11T15:11:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.