Feature Perturbation Augmentation for Reliable Evaluation of Importance
Estimators in Neural Networks
- URL: http://arxiv.org/abs/2303.01538v2
- Date: Thu, 23 Nov 2023 08:50:37 GMT
- Title: Feature Perturbation Augmentation for Reliable Evaluation of Importance
Estimators in Neural Networks
- Authors: Lennart Brocki and Neo Christopher Chung
- Abstract summary: Post-hoc interpretability methods attempt to make the inner workings of deep neural networks more interpretable.
One of the most popular evaluation frameworks is to perturb features deemed important by an interpretability method.
We propose feature perturbation augmentation (FPA), which creates and adds perturbed images during model training.
- Score: 5.439020425819001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-hoc explanation methods attempt to make the inner workings of deep
neural networks more interpretable. However, since ground truth is generally
lacking, local post-hoc interpretability methods, which assign importance
scores to input features, are challenging to evaluate. One of the most popular
evaluation frameworks is to perturb features deemed important by an
interpretability method and to measure the change in prediction accuracy.
Intuitively, a large decrease in prediction accuracy would indicate that the
explanation has correctly quantified the importance of features with respect to
the prediction outcome (e.g., logits). However, the change in the prediction
outcome may stem from perturbation artifacts, since perturbed samples in the
test dataset are out of distribution (OOD) compared to the training dataset and
can therefore potentially disturb the model in an unexpected manner. To
overcome this challenge, we propose feature perturbation augmentation (FPA)
which creates and adds perturbed images during the model training. Through
extensive computational experiments, we demonstrate that FPA makes deep neural
networks (DNNs) more robust against perturbations. Furthermore, training DNNs
with FPA demonstrates that the sign of importance scores may explain the model
more meaningfully than has previously been assumed. Overall, FPA is an
intuitive data augmentation technique that improves the evaluation of post-hoc
interpretability methods.
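To make the perturbation-based evaluation concrete, below is a minimal sketch in PyTorch. It is an illustration under assumptions, not the paper's exact protocol: `saliency_fn` stands in for whatever importance estimator is being evaluated, and replacing perturbed pixels with a constant fill value is one common choice among several.

```python
import torch

def perturb_top_k(images, saliency, k_frac, fill_value=0.0):
    """Replace the k_frac most important pixels (per the saliency map) with fill_value."""
    b, c, h, w = images.shape
    k = max(1, int(k_frac * h * w))
    flat = saliency.reshape(b, -1)                 # (B, H*W) importance scores
    top_idx = flat.topk(k, dim=1).indices          # indices of the most important pixels
    mask = torch.ones_like(flat)
    mask.scatter_(1, top_idx, 0.0)                 # 0 where a pixel gets perturbed
    mask = mask.view(b, 1, h, w)
    return images * mask + fill_value * (1 - mask)

def accuracy_under_perturbation(model, loader, saliency_fn, fractions):
    """Accuracy as an increasing fraction of 'important' pixels is perturbed."""
    model.eval()
    accs = []
    for frac in fractions:
        correct, total = 0, 0
        for images, labels in loader:
            sal = saliency_fn(model, images, labels)   # importance estimator under test
            perturbed = perturb_top_k(images, sal, frac)
            with torch.no_grad():
                preds = model(perturbed).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
        accs.append(correct / total)
    # A steep accuracy drop suggests the estimator ranked truly important features first,
    # unless the drop is driven by perturbation artifacts (the OOD issue discussed above).
    return accs
```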
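The FPA idea itself can be sketched as a simple training-time augmentation that perturbs a random fraction of pixels, so that perturbed images are no longer out of distribution at evaluation time. The random-mask and zero-fill choices here are illustrative assumptions rather than the paper's exact recipe.

```python
import torch

class FeaturePerturbationAugment:
    """Randomly perturbs a fraction of pixels in a training image (sketch of the FPA idea)."""

    def __init__(self, max_frac=0.5, fill_value=0.0, p=0.5):
        self.max_frac = max_frac      # largest fraction of pixels to perturb
        self.fill_value = fill_value  # value that replaces perturbed pixels (assumed: 0)
        self.p = p                    # probability that a given image is perturbed at all

    def __call__(self, image):
        # image: (C, H, W) tensor, e.g. from transforms.ToTensor()
        if torch.rand(1).item() > self.p:
            return image
        c, h, w = image.shape
        frac = torch.rand(1).item() * self.max_frac
        keep = (torch.rand(h, w) > frac).float()   # 0 where a pixel is perturbed
        return image * keep + self.fill_value * (1.0 - keep)

# Usage (illustrative): plug into a standard torchvision pipeline, e.g.
# transform = transforms.Compose([transforms.ToTensor(), FeaturePerturbationAugment()])
# and train the DNN as usual, so perturbed inputs are familiar to the model at test time.
```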
Related papers
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions are unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Toward Robust Uncertainty Estimation with Random Activation Functions [3.0586855806896045]
We propose a novel approach for uncertainty quantification via ensembles, called Random Activation Functions (RAFs) Ensemble.
RAFs Ensemble outperforms state-of-the-art ensemble uncertainty quantification methods on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-02-28T13:17:56Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our method is simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- $p$-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations [32.99800144249333]
We introduce $p$-DkNN, a novel inference procedure that takes a trained deep neural network and analyzes the similarity structures of its intermediate hidden representations.
We find that $p$-DkNN forces adaptive attackers crafting adversarial examples, a form of worst-case OOD inputs, to introduce semantically meaningful changes to the inputs.
arXiv Detail & Related papers (2022-07-25T21:42:08Z)
- Fidelity of Interpretability Methods and Perturbation Artifacts in Neural Networks [5.439020425819001]
Post-hoc interpretability methods aim to quantify the importance of input features with respect to the class probabilities.
A popular approach to evaluate interpretability methods is to perturb input features deemed important for a given prediction and observe the decrease in accuracy.
We propose a method for estimating the impact of such artifacts on the fidelity estimation by utilizing model accuracy curves from perturbing input features.
arXiv Detail & Related papers (2022-03-06T10:14:09Z)
- Deconfounding to Explanation Evaluation in Graph Neural Networks [136.73451468551656]
We argue that a distribution shift exists between the full graph and the subgraph, causing the out-of-distribution problem.
We propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction.
arXiv Detail & Related papers (2022-01-21T18:05:00Z)
- Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation [22.5935068122522]
In this paper, we address the prediction accuracy problem of consistency learning methods with novel extensions of the mean-teacher (MT) model.
The accurate prediction by this model allows us to use a challenging combination of network, input data and feature perturbations to improve the consistency learning generalisation.
Results on public benchmarks show that our approach achieves remarkable improvements over the previous SOTA methods in the field.
arXiv Detail & Related papers (2021-11-25T04:30:56Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models marginally outperform conventional neural networks on certain occasions.
Preliminary investigations indicate the potential inherent role of bias due to choices of initialisation, architecture or activation functions.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)