Inf-CP: A Reliable Channel Pruning based on Channel Influence
- URL: http://arxiv.org/abs/2112.02521v1
- Date: Sun, 5 Dec 2021 09:30:43 GMT
- Title: Inf-CP: A Reliable Channel Pruning based on Channel Influence
- Authors: Bilan Lai, Haoran Xiang, Furao Shen
- Abstract summary: One of the most effective methods of channel pruning is to trim on the basis of the importance of each neuron.
Previous works have proposed to trim by considering the statistics of a single layer or of several successive layers of neurons.
We propose to use ensemble learning to train a model for different batches of data.
- Score: 4.692400531340393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most effective methods of channel pruning is to trim on the basis
of the importance of each neuron. However, measuring the importance of each
neuron is an NP-hard problem. Previous works have proposed to trim by
considering the statistics of a single layer or of several successive layers of
neurons. These works cannot eliminate the effect that different data have on the
model through the reconstruction error, and no existing work proves that the
absolute values of the parameters can be used directly to judge the importance
of the weights. A more reasonable approach is to eliminate the differences
between batches of data so that the influence of each weight can be measured
accurately. In this paper, we propose to use ensemble learning to train a model
on different batches of data and to use the influence function (a classic
technique from robust statistics) to trace the model's predictions back to the
gradients of its training parameters, so that we can determine the
responsibility of each parameter for the prediction, which we call its
"influence". In addition, we theoretically prove that the
back-propagation of the deep network is a first-order Taylor approximation of
the influence function of the weights. We perform extensive experiments to
show that pruning based on the influence function, combined with the idea of
ensemble learning, is much more effective than focusing only on reconstruction
error. Experiments on CIFAR show that influence pruning achieves
state-of-the-art results.
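As a rough illustration of the first-order Taylor view described in the abstract, the sketch below scores every Conv2d output channel by |w · ∂L/∂w| summed over its filter, which approximates the loss change caused by zeroing that channel, and averages the scores over several batches in the spirit of the ensemble-over-batches idea. This is a minimal sketch assuming PyTorch and a standard classification setup, not the authors' implementation; names such as `score_channels` and `val_batches` are hypothetical.

```python
# Minimal sketch (assumes PyTorch, a classification model, and labelled batches).
# Each Conv2d output channel gets a first-order Taylor estimate of the loss change
# caused by removing it, averaged over several batches.
import torch
import torch.nn as nn

def score_channels(model, criterion, batches):
    """Return {layer_name: per-output-channel score}, averaged over the given batches."""
    scores = {}
    for inputs, targets in batches:
        model.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        for name, m in model.named_modules():
            if isinstance(m, nn.Conv2d) and m.weight.grad is not None:
                # |w * dL/dw| summed over each filter: first-order estimate of the
                # loss change when that output channel's weights are set to zero.
                s = (m.weight * m.weight.grad).abs().sum(dim=(1, 2, 3)).detach()
                scores[name] = scores.get(name, torch.zeros_like(s)) + s
    return {k: v / len(batches) for k, v in scores.items()}

# Usage sketch: keep the highest-scoring 70% of channels in each layer.
# scores = score_channels(model, nn.CrossEntropyLoss(), val_batches)
# keep = {k: torch.topk(v, int(0.7 * v.numel())).indices for k, v in scores.items()}
```

Averaging over batches here stands in for the paper's ensemble over batches; the actual Inf-CP criterion built on the influence function may weight or combine the per-batch scores differently.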
Related papers
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA)
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z) - Characterizing the Influence of Graph Elements [24.241010101383505]
The influence function of graph convolution networks (GCNs) can shed light on the effects of removing training nodes/edges from an input graph.
We show that the influence function of an SGC model could be used to estimate the impact of removing training nodes/edges on the test performance of the SGC without re-training the model.
arXiv Detail & Related papers (2022-10-14T01:04:28Z) - If Influence Functions are the Answer, Then What is the Question? [7.873458431535409]
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters.
While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks.
arXiv Detail & Related papers (2022-09-12T16:17:43Z) - Causal Effect Estimation using Variational Information Bottleneck [19.6760527269791]
Causal inference estimates the causal effect in a causal relationship when an intervention is applied.
We propose a method to estimate Causal Effect by using Variational Information Bottleneck (CEVIB)
arXiv Detail & Related papers (2021-10-26T13:46:12Z) - Causal Inference Under Unmeasured Confounding With Negative Controls: A
Minimax Learning Approach [84.29777236590674]
We study the estimation of causal parameters when not all confounders are observed and instead negative controls are available.
Recent work has shown how these can enable identification and efficient estimation via two so-called bridge functions.
arXiv Detail & Related papers (2021-03-25T17:59:19Z) - FastIF: Scalable Influence Functions for Efficient Model Interpretation
and Debugging [112.19994766375231]
Influence functions approximate the 'influences' of training data-points for test predictions.
We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time.
Our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.
arXiv Detail & Related papers (2020-12-31T18:02:34Z) - Efficient Estimation of Influence of a Training Instance [56.29080605123304]
We propose an efficient method for estimating the influence of a training instance on a neural network model.
Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance.
We demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.
arXiv Detail & Related papers (2020-12-08T04:31:38Z) - Multi-Stage Influence Function [97.19210942277354]
We develop a multi-stage influence function score to track predictions from a finetuned model all the way back to the pretraining data.
We study two different scenarios with the pretrained embeddings fixed or updated in the finetuning tasks.
arXiv Detail & Related papers (2020-07-17T16:03:11Z) - Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates (see the note after this list).
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
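For background on the influence-function machinery that the abstract and several of the entries above rely on, the classical leave-one-out estimate (the standard formulation from robust statistics, not a result of any specific paper listed here) is roughly:

```latex
\hat{\theta}_{-z} - \hat{\theta} \approx \frac{1}{n} H_{\hat{\theta}}^{-1} \nabla_\theta L(z, \hat{\theta}),
\qquad H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla^2_\theta L(z_i, \hat{\theta}),
\qquad
\mathcal{I}(z, z_{\mathrm{test}}) \approx -\nabla_\theta L(z_{\mathrm{test}}, \hat{\theta})^{\top}
  \big( H_{\hat{\theta}} + \lambda I \big)^{-1} \nabla_\theta L(z, \hat{\theta}).
```

The damping term \lambda I is what "Hessian regularization" usually amounts to in practice. On the scalability point raised by FastIF, a common way to avoid forming the inverse Hessian explicitly (not necessarily FastIF's exact modifications) is a LiSSA-style stochastic approximation of the inverse Hessian-vector product; a hedged PyTorch sketch, with illustrative function names and hyperparameters:

```python
# Hedged sketch: LiSSA-style stochastic inverse Hessian-vector product, the usual
# bottleneck in influence-function computations. `loss_on_batch`, `damping`, `scale`,
# and `steps` are illustrative choices, not taken from any of the papers above.
from itertools import cycle
import torch

def lissa_ihvp(loss_on_batch, params, v, batches, damping=0.01, scale=25.0, steps=100):
    """Approximate (H + damping*I)^{-1} v via h <- v + (I - (H + damping*I)/scale) h."""
    h = [vi.clone() for vi in v]
    batch_iter = cycle(batches)
    for _ in range(steps):
        loss = loss_on_batch(next(batch_iter))
        grads = torch.autograd.grad(loss, params, create_graph=True)
        hv = torch.autograd.grad(grads, params, grad_outputs=h)  # Hessian-vector product H h
        h = [vi + hi - (hvi + damping * hi) / scale
             for vi, hi, hvi in zip(v, h, hv)]
    return [hi / scale for hi in h]  # dividing by scale undoes the scaling of H
```

Pairing this inverse Hessian-vector product of the test-loss gradient with each training example's gradient then yields influence scores without ever materializing the Hessian.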