Investigating sanity checks for saliency maps with image and text
classification
- URL: http://arxiv.org/abs/2106.07475v1
- Date: Tue, 8 Jun 2021 23:23:42 GMT
- Title: Investigating sanity checks for saliency maps with image and text
classification
- Authors: Narine Kokhlikyan, Vivek Miglani, Bilal Alsallakh, Miguel Martin and
Orion Reblitz-Richardson
- Abstract summary: Saliency maps have been shown to be both useful and misleading for explaining model predictions, especially in the context of images.
We analyze the effects of the input multiplier in certain saliency maps using similarity scores, max-sensitivity and infidelity evaluation metrics.
- Score: 1.836681984330549
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Saliency maps have been shown to be both useful and misleading for explaining
model predictions, especially in the context of images. In this paper, we
perform sanity checks for the text modality and show that the conclusions made for
images do not directly transfer to text. We also analyze the effects of the
input multiplier in certain saliency maps using similarity scores,
max-sensitivity and infidelity evaluation metrics. Our observations reveal that
the input multiplier carries the input's structural patterns into explanation maps,
thus leading to similar results regardless of the choice of model parameters.
We also show that the smoothness of a Neural Network (NN) function can affect
the quality of saliency-based explanations. Our investigations reveal that
replacing ReLUs with Softplus and MaxPool with smoother variants such as
LogSumExp (LSE) can lead to explanations that are more reliable based on the
infidelity evaluation metric.
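The two findings above are concrete enough to sketch. The snippet below contrasts plain gradient saliency with Input X Gradient (which applies the input multiplier) and scores both with the infidelity metric via the Captum library; the toy model, data shapes, noise scale, and number of perturbation samples are illustrative assumptions rather than the paper's experimental setup.

```python
# Hedged sketch: effect of the input multiplier on saliency maps,
# scored with Captum's infidelity metric. Model, data, and noise scale
# are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn
from captum.attr import Saliency, InputXGradient
from captum.metrics import infidelity

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
model.eval()

x = torch.randn(8, 3, 32, 32)              # dummy image batch
target = torch.zeros(8, dtype=torch.long)  # dummy class labels

def perturb_fn(inputs):
    """Gaussian perturbation expected by the infidelity metric."""
    noise = 0.01 * torch.randn_like(inputs)
    return noise, inputs - noise

grad_attr = Saliency(model).attribute(x, target=target)       # |dF/dx|
ixg_attr = InputXGradient(model).attribute(x, target=target)  # x * dF/dx

for name, attr in [("gradient", grad_attr), ("input_x_gradient", ixg_attr)]:
    infid = infidelity(model, perturb_fn, x, attr, target=target,
                       n_perturb_samples=10)
    print(f"{name:17s} mean infidelity = {infid.mean().item():.5f}")
```

For the smoothness result, one minimal way to set up the comparison is to swap ReLU for Softplus and MaxPool2d for a LogSumExp pooling layer, then rerun the same attribution and infidelity measurements on both variants. The module below is one possible smooth stand-in for MaxPool2d; the architecture and the `beta` sharpness value are assumptions.

```python
# Hedged sketch: smoother network variants (Softplus instead of ReLU,
# LogSumExp pooling instead of MaxPool2d); architecture and beta are
# illustrative assumptions.
import torch
import torch.nn as nn

class LogSumExpPool2d(nn.Module):
    """Smooth stand-in for MaxPool2d: (1/beta) * LSE(beta * window)."""
    def __init__(self, kernel_size: int, beta: float = 10.0):
        super().__init__()
        self.kernel_size = kernel_size
        self.beta = beta
        self.unfold = nn.Unfold(kernel_size, stride=kernel_size)

    def forward(self, x):
        b, c, h, w = x.shape
        k = self.kernel_size
        patches = self.unfold(x).view(b, c, k * k, -1)  # pooling windows
        pooled = torch.logsumexp(self.beta * patches, dim=2) / self.beta
        return pooled.view(b, c, h // k, w // k)

def make_cnn(smooth: bool) -> nn.Module:
    """Same architecture, with either hard or smooth non-linearities."""
    act = nn.Softplus(beta=10) if smooth else nn.ReLU()
    pool = LogSumExpPool2d(2) if smooth else nn.MaxPool2d(2)
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), act, pool,
        nn.Flatten(),
        nn.Linear(16 * 16 * 16, 10),
    )

relu_cnn = make_cnn(smooth=False)    # baseline with ReLU + MaxPool
smooth_cnn = make_cnn(smooth=True)   # Softplus + LSE pooling variant
```

According to the abstract, the smoother variant should yield explanations with better (lower) infidelity when the measurement above is repeated on both models.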
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, explanation consistency, to adaptively reweight the training samples during model learning.
The framework then promotes model learning by paying closer attention to training samples whose explanations differ the most.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training [10.716021768803433]
A saliency map is a common form of explanation that illustrates feature attributions as a heatmap.
We propose a model-agnostic learning method called Saliency Constrained Adaptive Adversarial Training (SCAAT) to improve the quality of such DNN explanations.
arXiv Detail & Related papers (2023-11-09T04:48:38Z)
- Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings.
We then show that the causal effect, which severs all sources of confounding, remains invariant across domains.
This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z)
- ADVISE: ADaptive Feature Relevance and VISual Explanations for Convolutional Neural Networks [0.745554610293091]
We introduce ADVISE, a new explainability method that quantifies and leverages the relevance of each unit of the feature map to provide better visual explanations.
We extensively evaluate our idea in the image classification task using AlexNet, VGG16, ResNet50, and Xception pretrained on ImageNet.
Our experiments further show that ADVISE fulfils the sensitivity and implementation independence axioms while passing the sanity checks.
arXiv Detail & Related papers (2022-03-02T18:16:57Z)
- Smoothed Embeddings for Certified Few-Shot Learning [63.68667303948808]
We extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings.
Our results are confirmed by experiments on different datasets.
arXiv Detail & Related papers (2022-02-02T18:19:04Z)
- Deconfounding to Explanation Evaluation in Graph Neural Networks [136.73451468551656]
We argue that a distribution shift exists between the full graph and the subgraph, causing the out-of-distribution problem.
We propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction.
arXiv Detail & Related papers (2022-01-21T18:05:00Z)
- Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability [47.18202269163001]
We take a different approach to saliency, in which we identify and analyze the network parameters, rather than inputs.
We find that samples which cause similar parameters to malfunction are semantically similar.
We also show that pruning the most salient parameters for a wrongly classified sample often improves model behavior.
arXiv Detail & Related papers (2021-08-03T07:32:34Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Evaluating Input Perturbation Methods for Interpreting CNNs and Saliency Map Comparison [9.023847175654602]
In this paper, we show that even arguably neutral baseline images still impact the generated saliency maps and their evaluation with input perturbations.
We experimentally reveal inconsistencies among a selection of input perturbation methods and find that they lack robustness, both for generating saliency maps and for evaluating them as saliency metrics.
arXiv Detail & Related papers (2021-01-26T18:11:06Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
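The kNN-representations idea in the last entry is generic enough to sketch: embed training and test examples with a feature extractor, then retrieve each test example's nearest training neighbors to see which training points a prediction leans on. The snippet below is a minimal illustration; the stand-in encoder, the choice of k, and the cosine metric are assumptions, not the paper's exact procedure.

```python
# Hedged sketch: k-nearest-neighbor retrieval over model representations
# to surface the training examples most similar to each test input.
# Encoder, k, and distance metric are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

encoder = nn.Sequential(               # stand-in feature extractor
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
)
encoder.eval()

train_x = torch.randn(100, 3, 32, 32)  # dummy training images
test_x = torch.randn(5, 3, 32, 32)     # dummy test queries

with torch.no_grad():
    train_emb = encoder(train_x).numpy()
    test_emb = encoder(test_x).numpy()

# Cosine distance over representations; retrieve 5 neighbors per query.
knn = NearestNeighbors(n_neighbors=5, metric="cosine").fit(train_emb)
distances, neighbor_idx = knn.kneighbors(test_emb)

for q, idxs in enumerate(neighbor_idx):
    print(f"test example {q}: nearest training indices {idxs.tolist()}")
```

Checking whether the retrieved neighbors share a spurious feature rather than the predicted label is the kind of inspection that summary describes for uncovering learned spurious associations.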
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.