FastIF: Scalable Influence Functions for Efficient Model Interpretation
and Debugging
- URL: http://arxiv.org/abs/2012.15781v1
- Date: Thu, 31 Dec 2020 18:02:34 GMT
- Title: FastIF: Scalable Influence Functions for Efficient Model Interpretation
and Debugging
- Authors: Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming
Xiong
- Abstract summary: Influence functions approximate the 'influences' of training data-points for test predictions.
We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time.
Our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors.
- Score: 112.19994766375231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Influence functions approximate the 'influences' of training data-points for
test predictions and have a wide variety of applications. Despite their
popularity, their computational cost does not scale well with model and
training data size. We present FastIF, a set of simple modifications to
influence functions that significantly improves their run-time. We use
k-Nearest Neighbors (kNN) to narrow the search space down to a subset of good
candidate data points, identify the configurations that best balance the
speed-quality trade-off in estimating the inverse Hessian-vector product, and
introduce a fast parallel variant. Our proposed method achieves about 80x
speedup while being highly correlated with the original influence values. With
the availability of the fast influence functions, we demonstrate their
usefulness in four applications. First, we examine whether influential
data-points can 'explain' test time behavior using the framework of
simulatability. Second, we visualize the influence interactions between
training and test data-points. Third, we show that we can correct model errors
by additional fine-tuning on certain influential data-points, improving the
accuracy of a trained MNLI model by 2.6% on the HANS challenge set using a
small number of gradient updates. Finally, we experiment with a
data-augmentation setup where we use influence functions to search for new
data-points unseen during training to improve model performance. Overall, our
fast influence functions can be efficiently applied to large models and
datasets, and our experiments demonstrate the potential of influence functions
in model interpretation and correcting model errors. Code is available at
https://github.com/salesforce/fast-influence-functions
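To make the pipeline described above concrete, below is a minimal PyTorch sketch of the two core ideas: kNN pre-selection of candidate training points in feature space, and a LiSSA-style estimate of the inverse Hessian-vector product used to score those candidates. The helper names, hyperparameters, and data-loading assumptions are illustrative only and do not reflect the authors' released code; see the repository linked above for the actual implementation.

```python
# Sketch of FastIF-style influence scoring (illustrative, not the official API):
#   1) kNN in feature space narrows the search to a few candidate training points,
#   2) LiSSA iterations approximate the inverse Hessian-vector product s_test,
#   3) the influence of each candidate z_i is -s_test . grad L(z_i).
import torch


def knn_candidates(test_feature, train_features, k=100):
    """Indices of the k training points closest to the test point in feature space."""
    dists = torch.cdist(test_feature.unsqueeze(0), train_features).squeeze(0)
    return torch.topk(dists, k=min(k, train_features.size(0)), largest=False).indices


def flat_grad(model, loss_fn, x, y, create_graph=False):
    """Flattened gradient of the loss on one (already batched) example."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])


def lissa_inverse_hvp(model, loss_fn, loader, v, damping=0.01, scale=25.0, steps=100):
    """Approximate (H + damping * I)^{-1} v with the recursive LiSSA scheme."""
    params = [p for p in model.parameters() if p.requires_grad]
    h = v.clone()
    data_iter = iter(loader)
    for _ in range(steps):
        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(loader)
            x, y = next(data_iter)
        g = flat_grad(model, loss_fn, x, y, create_graph=True)
        hvp = torch.autograd.grad(g @ h, params)      # Hessian-vector product H h
        hvp = torch.cat([t.reshape(-1) for t in hvp])
        h = v + (1 - damping) * h - hvp / scale       # recursive LiSSA update
    return h / scale


def influence_scores(model, loss_fn, test_x, test_y, train_set, candidates, hvp_loader):
    """Influence of each candidate training point on the loss at the test point."""
    v = flat_grad(model, loss_fn, test_x, test_y)
    s_test = lissa_inverse_hvp(model, loss_fn, hvp_loader, v)
    scores = {}
    for i in candidates.tolist():
        # Assumes train_set[i] returns a batched (x, y) pair the model accepts.
        x_i, y_i = train_set[i]
        # influence(z_i, z_test) ~ -grad L(z_test)^T (H + damping I)^{-1} grad L(z_i)
        scores[i] = -torch.dot(s_test, flat_grad(model, loss_fn, x_i, y_i)).item()
    return scores
```

In the applications listed in the abstract, the highest-scoring candidates from such a search could then be used for targeted fine-tuning (error correction) or drawn from a pool of unseen data for augmentation.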
Related papers
- Do Influence Functions Work on Large Language Models? [10.463762448166714]
Influence functions aim to quantify the impact of individual training data points on a model's predictions.
We evaluate influence functions across multiple tasks and find that they consistently perform poorly in most settings.
arXiv Detail & Related papers (2024-09-30T06:50:18Z) - Efficient Grammatical Error Correction Via Multi-Task Training and
Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z) - DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and
Diffusion Models [31.65198592956842]
We propose DataInf, an efficient influence approximation method that is practical for large-scale generative AI models.
Our theoretical analysis shows that DataInf is particularly well-suited for parameter-efficient fine-tuning techniques such as LoRA.
In applications to RoBERTa-large, Llama-2-13B-chat, and stable-diffusion-v1.5 models, DataInf identifies the most influential fine-tuning examples more effectively than other approximate influence scores.
arXiv Detail & Related papers (2023-10-02T04:59:19Z) - Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z) - GIF: A General Graph Unlearning Strategy via Influence Function [63.52038638220563]
Graph Influence Function (GIF) is a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an $\epsilon$-mass perturbation in deleted data.
We conduct extensive experiments on four representative GNN models and three benchmark datasets to justify GIF's superiority in terms of unlearning efficacy, model utility, and unlearning efficiency.
arXiv Detail & Related papers (2023-04-06T03:02:54Z) - Characterizing the Influence of Graph Elements [24.241010101383505]
The influence function of graph convolution networks (GCNs) can shed light on the effects of removing training nodes/edges from an input graph.
We show that the influence function of an SGC model could be used to estimate the impact of removing training nodes/edges on the test performance of the SGC without re-training the model.
arXiv Detail & Related papers (2022-10-14T01:04:28Z) - If Influence Functions are the Answer, Then What is the Question? [7.873458431535409]
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters.
While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks.
arXiv Detail & Related papers (2022-09-12T16:17:43Z) - Multi-Stage Influence Function [97.19210942277354]
We develop a multi-stage influence function score to track predictions from a finetuned model all the way back to the pretraining data.
We study two different scenarios with the pretrained embeddings fixed or updated in the finetuning tasks.
arXiv Detail & Related papers (2020-07-17T16:03:11Z) - Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates (a formula sketch follows this list).
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
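For orientation, the influence estimates discussed throughout this list, including the Hessian-regularized variant highlighted in the last entry, follow the standard damped influence-function formulation (sketched here in generic notation, with $\lambda$ as the damping term): $\mathcal{I}(z, z_{\text{test}}) \approx -\nabla_\theta L(z_{\text{test}}, \hat\theta)^{\top} \left(H_{\hat\theta} + \lambda I\right)^{-1} \nabla_\theta L(z, \hat\theta)$, where $H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta)$ is the empirical Hessian of the training loss and $\lambda > 0$ keeps the inverse well-conditioned.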
This list is automatically generated from the titles and abstracts of the papers on this site.