Revisiting Large Language Model Pruning using Neuron Semantic Attribution
- URL: http://arxiv.org/abs/2503.01542v1
- Date: Mon, 03 Mar 2025 13:52:17 GMT
- Title: Revisiting Large Language Model Pruning using Neuron Semantic Attribution
- Authors: Yizhuo Ding, Xinwei Sun, Yanwei Fu, Guosheng Hu
- Abstract summary: We conduct evaluations on 24 datasets and 4 tasks using popular pruning methods. We surprisingly find a significant performance drop of existing pruning methods in sentiment classification tasks. We propose Neuron Semantic Attribution, which learns to associate each neuron with specific semantics.
- Score: 63.62836612864512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model pruning is vital for accelerating large language models by reducing their size and computational requirements. However, the generalizability of existing pruning methods across diverse datasets and tasks remains unclear. Thus, we conduct extensive evaluations on 24 datasets and 4 tasks using popular pruning methods. Based on these evaluations, we find, and then investigate, that the calibration set greatly affects the performance of pruning methods. In addition, we surprisingly find a significant performance drop of existing pruning methods in sentiment classification tasks. To understand the link between the performance drop and the pruned neurons, we propose Neuron Semantic Attribution, which learns to associate each neuron with specific semantics. This method first makes the unpruned neurons of LLMs explainable.
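To make the idea of attributing semantics to individual neurons concrete, below is a minimal sketch in PyTorch, assuming a Hugging Face GPT-2-style causal LM. It uses a simple static proxy, projecting an FFN neuron's output weights through the unembedding matrix, rather than the learned Neuron Semantic Attribution procedure described in the paper; the helper `top_tokens_for_neuron` and the layer/neuron indices are illustrative assumptions.

```python
# Illustrative sketch only: associate an FFN neuron with vocabulary tokens by
# projecting its output direction through the unembedding matrix. This is a
# common interpretability proxy, NOT the paper's Neuron Semantic Attribution.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any GPT-2-style causal LM is laid out the same way
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def top_tokens_for_neuron(layer: int, neuron: int, k: int = 10):
    """Return the k vocabulary tokens whose unembedding directions best align
    with one FFN neuron's output direction (a rough semantic label for it)."""
    with torch.no_grad():
        # GPT-2's mlp.c_proj weight has shape (d_ff, d_model); row `neuron` is
        # the direction that neuron writes into the residual stream.
        w_out = model.transformer.h[layer].mlp.c_proj.weight[neuron]   # (d_model,)
        logits = model.lm_head.weight @ w_out                          # (vocab_size,)
        top = torch.topk(logits, k).indices
    return [tok.decode([i]) for i in top.tolist()]

# Example: inspect a few neurons in layer 6 before deciding what pruning would remove.
for n in range(3):
    print(f"layer 6, neuron {n}: {top_tokens_for_neuron(6, n)}")
```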
Related papers
- Neuron-Level Knowledge Attribution in Large Language Models [19.472889262384818]
We propose a static method for pinpointing significant neurons.
Compared to seven other methods, our approach demonstrates superior performance across three metrics.
We also apply our methods to analyze six types of knowledge across both attention and feed-forward network layers.
arXiv Detail & Related papers (2023-12-19T13:23:18Z)
- Magnificent Minified Models [0.360953887026184]
This paper concerns itself with the task of taking a large trained neural network and 'compressing' it to be smaller by deleting parameters or entire neurons.
We compare various methods of parameter and neuron selection: dropout-based neuron damage estimation, neuron merging, absolute-value based selection, and random selection.
For neuron-level pruning, retraining from scratch did much better in our experiments.
arXiv Detail & Related papers (2023-06-16T21:00:44Z)
- Computational and Storage Efficient Quadratic Neurons for Deep Neural Networks [10.379191500493503]
This work introduces an efficient quadratic neuron architecture distinguished by its enhanced utilization of second-order computational information.
Experimental results demonstrate that the proposed quadratic neuron structure exhibits superior computational and storage efficiency across various tasks.
arXiv Detail & Related papers (2023-06-10T11:25:31Z)
- What Matters In The Structured Pruning of Generative Language Models? [44.86217321428518]
Auto-regressive large language models such as GPT-3 require enormous computational resources to use.
Traditionally, structured pruning methods are employed to reduce resource usage.
We introduce Globally Unique Movement (GUM) to improve the uniqueness of neurons in pruned models.
arXiv Detail & Related papers (2023-02-07T22:05:55Z)
- Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding [82.46024259137823]
We propose a cross-model comparative loss for a broad range of tasks.
We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks.
arXiv Detail & Related papers (2023-01-10T03:04:27Z)
- SInGE: Sparsity via Integrated Gradients Estimation of Neuron Relevance [37.82255888371488]
We propose a novel integrated gradient pruning criterion, in which the relevance of each neuron is defined as the integral of the gradient variation on a path towards this neuron's removal (see the sketch after this list).
We show through extensive validation on several datasets, architectures, and pruning scenarios that the proposed method, dubbed SInGE, significantly outperforms existing state-of-the-art pruning methods.
arXiv Detail & Related papers (2022-07-08T18:27:42Z)
- Neural Network Pruning Through Constrained Reinforcement Learning [3.2880869992413246]
We propose a general methodology for pruning neural networks.
Our proposed methodology can prune neural networks to respect pre-defined computational budgets.
We demonstrate the effectiveness of our approach via comparison with state-of-the-art methods on standard image classification datasets.
arXiv Detail & Related papers (2021-10-16T11:57:38Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Sparse Training via Boosting Pruning Plasticity with Neuroregeneration [79.78184026678659]
We study the effect of pruning throughout training from the perspective of pruning plasticity.
We design a novel gradual magnitude pruning (GMP) method, named gradual pruning with zero-cost neuroregeneration (GraNet), and its dynamic sparse training (DST) variant, GraNet-ST.
Perhaps most impressively, the latter is the first to boost sparse-to-sparse training performance over various dense-to-sparse methods by a large margin with ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-19T02:09:25Z)
- Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts.
We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z)
- Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning [59.249322621035056]
We propose two new multi-spike learning rules which demonstrate better performance than other baselines on various tasks.
In the feature detection task, we re-examine the ability of unsupervised STDP and present its limitations.
Our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied.
arXiv Detail & Related papers (2020-05-02T06:41:20Z)
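The SInGE entry above defines a neuron's relevance as the integral of the gradient variation along a path toward that neuron's removal. Below is a minimal sketch of such an integrated-gradient relevance score, assuming PyTorch; `model`, `loss_fn`, and `batch` are hypothetical placeholders and `layer` is assumed to be a torch.nn.Linear. It is an illustrative approximation in the spirit of that criterion, not the authors' SInGE implementation.

```python
# Illustrative sketch: approximate each neuron's relevance by integrating the
# loss gradient while a per-neuron gate scales its output from "removed"
# (alpha -> 0) toward "kept" (alpha = 1). NOT the authors' implementation.
import torch

def integrated_gradient_relevance(model, loss_fn, batch, layer, n_steps=8):
    """Riemann-sum approximation of an integrated-gradient relevance score,
    one value per output neuron of `layer` (assumed to be an nn.Linear)."""
    gate = torch.ones(layer.out_features, requires_grad=True)

    def scale_outputs(module, inputs, output):
        return output * gate  # broadcasting scales each output neuron by its gate

    handle = layer.register_forward_hook(scale_outputs)
    relevance = torch.zeros(layer.out_features)
    try:
        for step in range(1, n_steps + 1):
            gate.data.fill_(step / n_steps)        # move along the path 0 -> 1
            gate.grad = None
            model.zero_grad(set_to_none=True)      # keep model grads from piling up
            loss = loss_fn(model(**batch), batch)  # assumed loss/batch signature
            loss.backward()
            relevance += gate.grad.abs().detach() / n_steps
    finally:
        handle.remove()
    return relevance  # lower accumulated relevance -> better candidate for pruning
```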