Inf-CP: A Reliable Channel Pruning based on Channel Influence
- URL: http://arxiv.org/abs/2112.02521v1
- Date: Sun, 5 Dec 2021 09:30:43 GMT
- Title: Inf-CP: A Reliable Channel Pruning based on Channel Influence
- Authors: Bilan Lai, Haoran Xiang, Furao Shen
- Abstract summary: One of the most effective methods of channel pruning is to trim on the basis of the importance of each neuron.
Previous works have proposed to trim by considering the statistics of a single layer or of several successive layers of neurons.
We propose to use ensemble learning to train a model for different batches of data.
- Score: 4.692400531340393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most effective methods of channel pruning is to trim on the basis
of the importance of each neuron. However, measuring the importance of each
neuron is an NP-hard problem. Previous works have proposed to trim by
considering the statistics of a single layer or of several successive layers of
neurons. These works cannot eliminate the effect that different data have on the
model through the reconstruction error, and no existing work proves that the
absolute values of the parameters can be used directly to judge the importance
of the weights. A more reasonable approach is to eliminate the differences
between batches of data so that the influence of each weight can be measured
accurately. In this paper, we propose to use ensemble learning to train a model
on different batches of data and to use the influence function (a classic
technique from robust statistics) to trace the model's predictions back to the
gradients of its training parameters, so that we can determine the
responsibility of each parameter for the prediction, which we call its
"influence". In addition, we theoretically prove that the
back-propagation of the deep network is a first-order Taylor approximation of
the influence function of the weights. We perform extensive experiments to
show that pruning based on the influence function, combined with the idea of
ensemble learning, is much more effective than focusing only on reconstruction
error. Experiments on CIFAR show that influence pruning achieves
state-of-the-art results.
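As a rough illustration of the first-order Taylor view described in the abstract, the sketch below scores every Conv2d output channel by |w · ∂L/∂w| summed over its filter, which approximates the loss change caused by zeroing that channel, and averages the scores over several batches in the spirit of the ensemble-over-batches idea. This is a minimal sketch assuming PyTorch and a standard classification setup, not the authors' implementation; names such as `score_channels` and `val_batches` are hypothetical.

```python
# Minimal sketch (assumes PyTorch, a classification model, and labelled batches).
# Each Conv2d output channel gets a first-order Taylor estimate of the loss change
# caused by removing it, averaged over several batches.
import torch
import torch.nn as nn

def score_channels(model, criterion, batches):
    """Return {layer_name: per-output-channel score}, averaged over the given batches."""
    scores = {}
    for inputs, targets in batches:
        model.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        for name, m in model.named_modules():
            if isinstance(m, nn.Conv2d) and m.weight.grad is not None:
                # |w * dL/dw| summed over each filter: first-order estimate of the
                # loss change when that output channel's weights are set to zero.
                s = (m.weight * m.weight.grad).abs().sum(dim=(1, 2, 3)).detach()
                scores[name] = scores.get(name, torch.zeros_like(s)) + s
    return {k: v / len(batches) for k, v in scores.items()}

# Usage sketch: keep the highest-scoring 70% of channels in each layer.
# scores = score_channels(model, nn.CrossEntropyLoss(), val_batches)
# keep = {k: torch.topk(v, int(0.7 * v.numel())).indices for k, v in scores.items()}
```

Averaging over batches here stands in for the paper's ensemble over batches; the actual Inf-CP criterion built on the influence function may weight or combine the per-batch scores differently.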
Related papers
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA)
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z) - Characterizing the Influence of Graph Elements [24.241010101383505]
The influence function of graph convolution networks (GCNs) can shed light on the effects of removing training nodes/edges from an input graph.
We show that the influence function of an SGC model could be used to estimate the impact of removing training nodes/edges on the test performance of the SGC without re-training the model.
arXiv Detail & Related papers (2022-10-14T01:04:28Z) - If Influence Functions are the Answer, Then What is the Question? [7.873458431535409]
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters.
While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks.
arXiv Detail & Related papers (2022-09-12T16:17:43Z) - Causal Effect Estimation using Variational Information Bottleneck [19.6760527269791]
Causal inference estimates the causal effect in a causal relationship when an intervention is applied.
We propose a method to estimate Causal Effect by using Variational Information Bottleneck (CEVIB)
arXiv Detail & Related papers (2021-10-26T13:46:12Z) - Causal Inference Under Unmeasured Confounding With Negative Controls: A
Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z) - FastIF: Scalable Influence Functions for Efficient Model Interpretation
and Debugging [112.19994766375231]
Influence functions approximate the 'influences' of training data-points for test predictions.
We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time.
Our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.
arXiv Detail & Related papers (2020-12-31T18:02:34Z) - Efficient Estimation of Influence of a Training Instance [56.29080605123304]
We propose an efficient method for estimating the influence of a training instance on a neural network model.
Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance.
We demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.
arXiv Detail & Related papers (2020-12-08T04:31:38Z) - Multi-Stage Influence Function [97.19210942277354]
We develop a multi-stage influence function score to track predictions from a finetuned model all the way back to the pretraining data.
We study two different scenarios with the pretrained embeddings fixed or updated in the finetuning tasks.
arXiv Detail & Related papers (2020-07-17T16:03:11Z) - Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates (see the note after this list).
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
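For background on the influence-function machinery that the abstract and several of the entries above rely on, the classical leave-one-out estimate (the standard formulation from robust statistics, not a result of any specific paper listed here) is roughly:

```latex
\hat{\theta}_{-z} - \hat{\theta} \approx \frac{1}{n} H_{\hat{\theta}}^{-1} \nabla_\theta L(z, \hat{\theta}),
\qquad H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla^2_\theta L(z_i, \hat{\theta}),
\qquad
\mathcal{I}(z, z_{\mathrm{test}}) \approx -\nabla_\theta L(z_{\mathrm{test}}, \hat{\theta})^{\top}
  \big( H_{\hat{\theta}} + \lambda I \big)^{-1} \nabla_\theta L(z, \hat{\theta}).
```

The damping term \lambda I is what "Hessian regularization" usually amounts to in practice. On the scalability point raised by FastIF, a common way to avoid forming the inverse Hessian explicitly (not necessarily FastIF's exact modifications) is a LiSSA-style stochastic approximation of the inverse Hessian-vector product; a hedged PyTorch sketch, with illustrative function names and hyperparameters:

```python
# Hedged sketch: LiSSA-style stochastic inverse Hessian-vector product, the usual
# bottleneck in influence-function computations. `loss_on_batch`, `damping`, `scale`,
# and `steps` are illustrative choices, not taken from any of the papers above.
from itertools import cycle
import torch

def lissa_ihvp(loss_on_batch, params, v, batches, damping=0.01, scale=25.0, steps=100):
    """Approximate (H + damping*I)^{-1} v via h <- v + (I - (H + damping*I)/scale) h."""
    h = [vi.clone() for vi in v]
    batch_iter = cycle(batches)
    for _ in range(steps):
        loss = loss_on_batch(next(batch_iter))
        grads = torch.autograd.grad(loss, params, create_graph=True)
        hv = torch.autograd.grad(grads, params, grad_outputs=h)  # Hessian-vector product H h
        h = [vi + hi - (hvi + damping * hi) / scale
             for vi, hi, hvi in zip(v, h, hv)]
    return [hi / scale for hi in h]  # dividing by scale undoes the scaling of H
```

Pairing this inverse Hessian-vector product of the test-loss gradient with each training example's gradient then yields influence scores without ever materializing the Hessian.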