Theoretical and Practical Perspectives on what Influence Functions Do
- URL: http://arxiv.org/abs/2305.16971v1
- Date: Fri, 26 May 2023 14:26:36 GMT
- Title: Theoretical and Practical Perspectives on what Influence Functions Do
- Authors: Andrea Schioppa and Katja Filippova and Ivan Titov and Polina
Zablotskaia
- Abstract summary: Influence functions (IF) have been seen as a technique for explaining model predictions through the lens of the training data.
Recent empirical studies have shown that the existing methods of estimating IF predict the leave-one-out-and-retrain effect poorly.
We show that while most assumptions can be addressed successfully, the parameter divergence poses a clear limitation on the predictive power of IF.
- Score: 45.35457212616306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Influence functions (IF) have been seen as a technique for explaining model
predictions through the lens of the training data. Their utility is assumed to
be in identifying training examples "responsible" for a prediction so that, for
example, correcting a prediction is possible by intervening on those examples
(removing or editing them) and retraining the model. However, recent empirical
studies have shown that the existing methods of estimating IF predict the
leave-one-out-and-retrain effect poorly.
In order to understand the mismatch between the theoretical promise and the
practical results, we analyse five assumptions made by IF methods which are
problematic for modern-scale deep neural networks and which concern convexity,
numeric stability, training trajectory and parameter divergence. This allows us
to clarify what can be expected theoretically from IF. We show that while most
assumptions can be addressed successfully, the parameter divergence poses a
clear limitation on the predictive power of IF: influence fades over training
time even with deterministic training. We illustrate this theoretical result
with BERT and ResNet models.
Another conclusion from the theoretical analysis is that IF are still useful
for model debugging and correcting even though some of the assumptions made in
prior work do not hold: using natural language processing and computer vision
tasks, we verify that mis-predictions can be successfully corrected by taking
only a few fine-tuning steps on influential examples.
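For context on the quantity these assumptions concern, the estimate that most IF methods build on is the classical first-order approximation of the leave-one-out-and-retrain effect (in the form popularised by Koh and Liang, 2017). The block below restates that standard formulation; it is a sketch for orientation, not an equation reproduced from this paper, and it assumes a strictly convex, twice-differentiable training loss so that the Hessian is invertible.

```latex
% Standard influence-function approximation (literature formulation, not
% copied from this paper).
% \hat{\theta}: empirical risk minimiser over n training points z_1,...,z_n;
% H_{\hat{\theta}} = \tfrac{1}{n}\sum_i \nabla^2_\theta \ell(z_i,\hat{\theta}): training-loss Hessian.
% Removing one training point z shifts the optimum by approximately
\[
  \hat{\theta}_{-z} - \hat{\theta} \;\approx\; \tfrac{1}{n}\, H_{\hat{\theta}}^{-1} \nabla_\theta \ell(z, \hat{\theta}),
\]
% so the leave-one-out-and-retrain effect on the loss at a test point z_test is
\[
  \ell(z_{\mathrm{test}}, \hat{\theta}_{-z}) - \ell(z_{\mathrm{test}}, \hat{\theta})
  \;\approx\; \tfrac{1}{n}\,
  \nabla_\theta \ell(z_{\mathrm{test}}, \hat{\theta})^{\top}
  H_{\hat{\theta}}^{-1}\,
  \nabla_\theta \ell(z, \hat{\theta}).
\]
```

Roughly speaking, the convexity and numeric-stability assumptions enter through the inversion of H, while the training-trajectory and parameter-divergence assumptions concern treating the deployed network as the exact minimiser \hat{\theta}.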
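The abstract's practical conclusion, that a mis-prediction can be corrected by taking a few fine-tuning steps on influential examples, can be sketched as follows. This is a minimal illustration only, written in PyTorch with hypothetical model, loss_fn and train_set objects; it uses a plain gradient dot product as a cheap stand-in for a Hessian-based influence estimate and does not reproduce the estimators or hyperparameters used in the paper.

```python
import torch

def flat_grad(model, loss):
    """Gradient of `loss` w.r.t. all trainable parameters, flattened into one vector."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence_scores(model, loss_fn, test_x, test_y, train_set):
    """Score each training example by the alignment of its loss gradient with the
    test-loss gradient (a cheap proxy for a Hessian-based influence estimate).
    `test_y` is assumed to be the gold label of the mis-predicted test point."""
    model.eval()
    g_test = flat_grad(model, loss_fn(model(test_x), test_y)).detach()
    scores = []
    for x, y in train_set:
        g_train = flat_grad(model, loss_fn(model(x), y)).detach()
        scores.append(torch.dot(g_test, g_train).item())
    return scores

def correct_by_finetuning(model, loss_fn, test_x, test_y, train_set,
                          k=8, steps=3, lr=1e-5):
    """Take a few SGD steps on the k training examples whose gradients are most
    aligned with the test-loss gradient; to first order, descending their loss
    also lowers the loss on the mis-predicted test point."""
    scores = influence_scores(model, loss_fn, test_x, test_y, train_set)
    top_k = sorted(range(len(train_set)), key=lambda i: scores[i], reverse=True)[:k]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        for i in top_k:
            x, y = train_set[i]
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model
```

Practical IF estimators differ mainly in how they approximate the Hessian-inverse-vector product (for example with iterative or low-rank methods); the dot-product proxy above skips that step purely for brevity.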
Related papers
- Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models [15.817239008727789]
In this work, we analyze a specific type of causal query called domain counterfactuals, which hypothesizes what a sample would have looked like if it had been generated in a different domain.
We show that recovering the latent Structural Causal Model (SCM) is unnecessary for estimating domain counterfactuals.
We also develop a theoretically grounded practical algorithm that simplifies the modeling process to generative model estimation.
arXiv Detail & Related papers (2023-06-20T04:19:06Z)
- Feature Perturbation Augmentation for Reliable Evaluation of Importance Estimators in Neural Networks [5.439020425819001]
Post-hoc interpretability methods attempt to make the inner workings of deep neural networks more interpretable.
One of the most popular evaluation frameworks is to perturb features deemed important by an interpretability method.
We propose feature perturbation augmentation (FPA), which creates and adds perturbed images during model training.
arXiv Detail & Related papers (2023-03-02T19:05:46Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models fine-tuned on few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate this prediction bias, our analysis shows that models gain performance improvements by capturing non-task-related features.
These observations suggest that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Instance-Based Neural Dependency Parsing [56.63500180843504]
We develop neural models that possess an interpretable inference process for dependency parsing.
Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set.
arXiv Detail & Related papers (2021-09-28T05:30:52Z)
- Hessian-based toolbox for reliable and interpretable machine learning in physics [58.720142291102135]
We present a toolbox for interpretability, reliability, and extrapolation of machine-learning models, agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an agnostic score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- A comprehensive study on the prediction reliability of graph neural networks for virtual screening [0.0]
We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results.
Our results highlight that the correct choice of regularization and inference methods is important for achieving a high success rate.
arXiv Detail & Related papers (2020-03-17T10:13:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.