Investigating sanity checks for saliency maps with image and text
classification
- URL: http://arxiv.org/abs/2106.07475v1
- Date: Tue, 8 Jun 2021 23:23:42 GMT
- Title: Investigating sanity checks for saliency maps with image and text
classification
- Authors: Narine Kokhlikyan, Vivek Miglani, Bilal Alsallakh, Miguel Martin and
Orion Reblitz-Richardson
- Abstract summary: Saliency maps have been shown to be both useful and misleading for explaining model predictions, especially in the context of images.
We analyze the effects of the input multiplier in certain saliency maps using similarity scores, max-sensitivity and infidelity evaluation metrics.
- Score: 1.836681984330549
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Saliency maps have been shown to be both useful and misleading for explaining
model predictions, especially in the context of images. In this paper, we
perform sanity checks for the text modality and show that the conclusions made for
images do not directly transfer to text. We also analyze the effects of the
input multiplier in certain saliency maps using similarity scores,
max-sensitivity and infidelity evaluation metrics. Our observations reveal that
the input multiplier carries the input's structural patterns into explanation maps,
thus leading to similar results regardless of the choice of model parameters.
We also show that the smoothness of a Neural Network (NN) function can affect
the quality of saliency-based explanations. Our investigations reveal that
replacing ReLUs with Softplus and MaxPool with smoother variants such as
LogSumExp (LSE) can lead to explanations that are more reliable based on the
infidelity evaluation metric.
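The two findings above are concrete enough to sketch. The snippet below contrasts plain gradient saliency with Input X Gradient (which applies the input multiplier) and scores both with the infidelity metric via the Captum library; the toy model, data shapes, noise scale, and number of perturbation samples are illustrative assumptions rather than the paper's experimental setup.

```python
# Hedged sketch: effect of the input multiplier on saliency maps,
# scored with Captum's infidelity metric. Model, data, and noise scale
# are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn
from captum.attr import Saliency, InputXGradient
from captum.metrics import infidelity

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
model.eval()

x = torch.randn(8, 3, 32, 32)              # dummy image batch
target = torch.zeros(8, dtype=torch.long)  # dummy class labels

def perturb_fn(inputs):
    """Gaussian perturbation expected by the infidelity metric."""
    noise = 0.01 * torch.randn_like(inputs)
    return noise, inputs - noise

grad_attr = Saliency(model).attribute(x, target=target)       # |dF/dx|
ixg_attr = InputXGradient(model).attribute(x, target=target)  # x * dF/dx

for name, attr in [("gradient", grad_attr), ("input_x_gradient", ixg_attr)]:
    infid = infidelity(model, perturb_fn, x, attr, target=target,
                       n_perturb_samples=10)
    print(f"{name:17s} mean infidelity = {infid.mean().item():.5f}")
```

For the smoothness result, one minimal way to set up the comparison is to swap ReLU for Softplus and MaxPool2d for a LogSumExp pooling layer, then rerun the same attribution and infidelity measurements on both variants. The module below is one possible smooth stand-in for MaxPool2d; the architecture and the `beta` sharpness value are assumptions.

```python
# Hedged sketch: smoother network variants (Softplus instead of ReLU,
# LogSumExp pooling instead of MaxPool2d); architecture and beta are
# illustrative assumptions.
import torch
import torch.nn as nn

class LogSumExpPool2d(nn.Module):
    """Smooth stand-in for MaxPool2d: (1/beta) * LSE(beta * window)."""
    def __init__(self, kernel_size: int, beta: float = 10.0):
        super().__init__()
        self.kernel_size = kernel_size
        self.beta = beta
        self.unfold = nn.Unfold(kernel_size, stride=kernel_size)

    def forward(self, x):
        b, c, h, w = x.shape
        k = self.kernel_size
        patches = self.unfold(x).view(b, c, k * k, -1)  # pooling windows
        pooled = torch.logsumexp(self.beta * patches, dim=2) / self.beta
        return pooled.view(b, c, h // k, w // k)

def make_cnn(smooth: bool) -> nn.Module:
    """Same architecture, with either hard or smooth non-linearities."""
    act = nn.Softplus(beta=10) if smooth else nn.ReLU()
    pool = LogSumExpPool2d(2) if smooth else nn.MaxPool2d(2)
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), act, pool,
        nn.Flatten(),
        nn.Linear(16 * 16 * 16, 10),
    )

relu_cnn = make_cnn(smooth=False)    # baseline with ReLU + MaxPool
smooth_cnn = make_cnn(smooth=True)   # Softplus + LSE pooling variant
```

According to the abstract, the smoother variant should yield explanations with better (lower) infidelity when the measurement above is repeated on both models.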
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, explanation consistency, to adaptively reweight the training samples during model learning.
The framework then promotes model learning by paying closer attention to training samples whose explanations differ the most.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training [10.716021768803433]
A saliency map is a common form of explanation that illustrates feature attributions as a heatmap.
We propose a model-agnostic learning method called Saliency Constrained Adaptive Adversarial Training (SCAAT) to improve the quality of such DNN explanations.
arXiv Detail & Related papers (2023-11-09T04:48:38Z)
- Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings.
We then show that the causal effect, which severs all sources of confounding, remains invariant across domains.
This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z)
- ADVISE: ADaptive Feature Relevance and VISual Explanations for Convolutional Neural Networks [0.745554610293091]
We introduce ADVISE, a new explainability method that quantifies and leverages the relevance of each unit of the feature map to provide better visual explanations.
We extensively evaluate our idea in the image classification task using AlexNet, VGG16, ResNet50, and Xception pretrained on ImageNet.
Our experiments further show that ADVISE fulfils the sensitivity and implementation independence axioms while passing the sanity checks.
arXiv Detail & Related papers (2022-03-02T18:16:57Z)
- Smoothed Embeddings for Certified Few-Shot Learning [63.68667303948808]
We extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings.
Our results are confirmed by experiments on different datasets.
arXiv Detail & Related papers (2022-02-02T18:19:04Z)
- Deconfounding to Explanation Evaluation in Graph Neural Networks [136.73451468551656]
We argue that a distribution shift exists between the full graph and the subgraph, causing the out-of-distribution problem.
We propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction.
arXiv Detail & Related papers (2022-01-21T18:05:00Z)
- Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability [47.18202269163001]
We take a different approach to saliency, in which we identify and analyze the network parameters, rather than inputs.
We find that samples which cause similar parameters to malfunction are semantically similar.
We also show that pruning the most salient parameters for a wrongly classified sample often improves model behavior.
arXiv Detail & Related papers (2021-08-03T07:32:34Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Evaluating Input Perturbation Methods for Interpreting CNNs and Saliency Map Comparison [9.023847175654602]
In this paper, we show that even arguably neutral baseline images still impact the generated saliency maps and their evaluation with input perturbations.
We experimentally reveal inconsistencies among a selection of input perturbation methods and find that they lack robustness, both for generating saliency maps and for evaluating them as saliency metrics.
arXiv Detail & Related papers (2021-01-26T18:11:06Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
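The kNN-representations idea in the last entry is generic enough to sketch: embed training and test examples with a feature extractor, then retrieve each test example's nearest training neighbors to see which training points a prediction leans on. The snippet below is a minimal illustration; the stand-in encoder, the choice of k, and the cosine metric are assumptions, not the paper's exact procedure.

```python
# Hedged sketch: k-nearest-neighbor retrieval over model representations
# to surface the training examples most similar to each test input.
# Encoder, k, and distance metric are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

encoder = nn.Sequential(               # stand-in feature extractor
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
)
encoder.eval()

train_x = torch.randn(100, 3, 32, 32)  # dummy training images
test_x = torch.randn(5, 3, 32, 32)     # dummy test queries

with torch.no_grad():
    train_emb = encoder(train_x).numpy()
    test_emb = encoder(test_x).numpy()

# Cosine distance over representations; retrieve 5 neighbors per query.
knn = NearestNeighbors(n_neighbors=5, metric="cosine").fit(train_emb)
distances, neighbor_idx = knn.kneighbors(test_emb)

for q, idxs in enumerate(neighbor_idx):
    print(f"test example {q}: nearest training indices {idxs.tolist()}")
```

Checking whether the retrieved neighbors share a spurious feature rather than the predicted label is the kind of inspection that summary describes for uncovering learned spurious associations.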
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.