Feature Perturbation Augmentation for Reliable Evaluation of Importance
Estimators in Neural Networks
- URL: http://arxiv.org/abs/2303.01538v2
- Date: Thu, 23 Nov 2023 08:50:37 GMT
- Title: Feature Perturbation Augmentation for Reliable Evaluation of Importance
Estimators in Neural Networks
- Authors: Lennart Brocki and Neo Christopher Chung
- Abstract summary: Post-hoc interpretability methods attempt to make the inner workings of deep neural networks more interpretable.
One of the most popular evaluation frameworks is to perturb features deemed important by an interpretability method.
We propose feature perturbation augmentation (FPA), which creates and adds perturbed images during model training.
- Score: 5.439020425819001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-hoc explanation methods attempt to make the inner workings of deep
neural networks more interpretable. However, since ground truth is generally
lacking, local post-hoc interpretability methods, which assign importance
scores to input features, are challenging to evaluate. One of the most popular
evaluation frameworks is to perturb features deemed important by an
interpretability method and to measure the change in prediction accuracy.
Intuitively, a large decrease in prediction accuracy would indicate that the
explanation has correctly quantified the importance of features with respect to
the prediction outcome (e.g., logits). However, the change in the prediction
outcome may stem from perturbation artifacts, since perturbed samples in the
test dataset are out of distribution (OOD) compared to the training dataset and
can therefore potentially disturb the model in an unexpected manner. To
overcome this challenge, we propose feature perturbation augmentation (FPA)
which creates and adds perturbed images during the model training. Through
extensive computational experiments, we demonstrate that FPA makes deep neural
networks (DNNs) more robust against perturbations. Furthermore, training DNNs
with FPA demonstrates that the sign of importance scores may explain the model
more meaningfully than has previously been assumed. Overall, FPA is an
intuitive data augmentation technique that improves the evaluation of post-hoc
interpretability methods.
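To make the perturbation-based evaluation concrete, below is a minimal sketch in PyTorch. It is an illustration under assumptions, not the paper's exact protocol: `saliency_fn` stands in for whatever importance estimator is being evaluated, and replacing perturbed pixels with a constant fill value is one common choice among several.

```python
import torch

def perturb_top_k(images, saliency, k_frac, fill_value=0.0):
    """Replace the k_frac most important pixels (per the saliency map) with fill_value."""
    b, c, h, w = images.shape
    k = max(1, int(k_frac * h * w))
    flat = saliency.reshape(b, -1)                 # (B, H*W) importance scores
    top_idx = flat.topk(k, dim=1).indices          # indices of the most important pixels
    mask = torch.ones_like(flat)
    mask.scatter_(1, top_idx, 0.0)                 # 0 where a pixel gets perturbed
    mask = mask.view(b, 1, h, w)
    return images * mask + fill_value * (1 - mask)

def accuracy_under_perturbation(model, loader, saliency_fn, fractions):
    """Accuracy as an increasing fraction of 'important' pixels is perturbed."""
    model.eval()
    accs = []
    for frac in fractions:
        correct, total = 0, 0
        for images, labels in loader:
            sal = saliency_fn(model, images, labels)   # importance estimator under test
            perturbed = perturb_top_k(images, sal, frac)
            with torch.no_grad():
                preds = model(perturbed).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
        accs.append(correct / total)
    # A steep accuracy drop suggests the estimator ranked truly important features first,
    # unless the drop is driven by perturbation artifacts (the OOD issue discussed above).
    return accs
```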
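The FPA idea itself can be sketched as a simple training-time augmentation that perturbs a random fraction of pixels, so that perturbed images are no longer out of distribution at evaluation time. The random-mask and zero-fill choices here are illustrative assumptions rather than the paper's exact recipe.

```python
import torch

class FeaturePerturbationAugment:
    """Randomly perturbs a fraction of pixels in a training image (sketch of the FPA idea)."""

    def __init__(self, max_frac=0.5, fill_value=0.0, p=0.5):
        self.max_frac = max_frac      # largest fraction of pixels to perturb
        self.fill_value = fill_value  # value that replaces perturbed pixels (assumed: 0)
        self.p = p                    # probability that a given image is perturbed at all

    def __call__(self, image):
        # image: (C, H, W) tensor, e.g. from transforms.ToTensor()
        if torch.rand(1).item() > self.p:
            return image
        c, h, w = image.shape
        frac = torch.rand(1).item() * self.max_frac
        keep = (torch.rand(h, w) > frac).float()   # 0 where a pixel is perturbed
        return image * keep + self.fill_value * (1.0 - keep)

# Usage (illustrative): plug into a standard torchvision pipeline, e.g.
# transform = transforms.Compose([transforms.ToTensor(), FeaturePerturbationAugment()])
# and train the DNN as usual, so perturbed inputs are familiar to the model at test time.
```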
Related papers
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions are unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Toward Robust Uncertainty Estimation with Random Activation Functions [3.0586855806896045]
We propose a novel approach for uncertainty quantification via ensembles, called Random Activation Functions (RAFs) Ensemble.
RAFs Ensemble outperforms state-of-the-art ensemble uncertainty quantification methods on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-02-28T13:17:56Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our method is simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- $p$-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations [32.99800144249333]
We introduce $p$-DkNN, a novel inference procedure that takes a trained deep neural network and analyzes the similarity structures of its intermediate hidden representations.
We find that $p$-DkNN forces adaptive attackers crafting adversarial examples, a form of worst-case OOD inputs, to introduce semantically meaningful changes to the inputs.
arXiv Detail & Related papers (2022-07-25T21:42:08Z)
- Fidelity of Interpretability Methods and Perturbation Artifacts in Neural Networks [5.439020425819001]
Post-hoc interpretability methods aim to quantify the importance of input features with respect to the class probabilities.
A popular approach to evaluate interpretability methods is to perturb input features deemed important for a given prediction and observe the decrease in accuracy.
We propose a method for estimating the impact of such artifacts on the fidelity estimation by utilizing model accuracy curves from perturbing input features.
arXiv Detail & Related papers (2022-03-06T10:14:09Z)
- Deconfounding to Explanation Evaluation in Graph Neural Networks [136.73451468551656]
We argue that a distribution shift exists between the full graph and the subgraph, causing the out-of-distribution problem.
We propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction.
arXiv Detail & Related papers (2022-01-21T18:05:00Z)
- Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation [22.5935068122522]
In this paper, we address the prediction accuracy problem of consistency learning methods with novel extensions of the mean-teacher (MT) model.
The accurate prediction by this model allows us to use a challenging combination of network, input data and feature perturbations to improve the consistency learning generalisation.
Results on public benchmarks show that our approach achieves remarkable improvements over the previous SOTA methods in the field.
arXiv Detail & Related papers (2021-11-25T04:30:56Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models marginally outperform conventional neural networks on certain occasions.
Preliminary investigations indicate the potential inherent role of bias due to choices of initialisation, architecture or activation functions.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)