Evaluating the overall sensitivity of saliency-based explanation methods
- URL: http://arxiv.org/abs/2306.13682v1
- Date: Wed, 21 Jun 2023 21:57:58 GMT
- Title: Evaluating the overall sensitivity of saliency-based explanation methods
- Authors: Harshinee Sriram and Cristina Conati
- Abstract summary: We address the need to generate faithful explanations of "black box" Deep Learning models.
We select an existing test that is model agnostic and extend it by specifying formal thresholds and building criteria.
We discuss the relationship between sensitivity and faithfulness and consider how the test can be adapted to assess different explanation methods in other domains.
- Score: 1.8655840060559168
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address the need to generate faithful explanations of "black box" Deep
Learning models. Several tests have been proposed to determine aspects of
faithfulness of explanation methods, but they lack cross-domain applicability
and a rigorous methodology. Hence, we select an existing test that is model
agnostic and is well-suited for comparing one aspect of faithfulness (i.e.,
sensitivity) of multiple explanation methods, and extend it by specifying
formal thresholds and building criteria to determine the overall sensitivity
of the explanation method. We present examples of how multiple explanation
methods for Convolutional Neural Networks can be compared using this extended
methodology. Finally, we discuss the relationship between sensitivity and
faithfulness and consider how the test can be adapted to assess different
explanation methods in other domains.
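The abstract does not spell out which test was selected or what its formal thresholds are, so the following is only a minimal sketch of the underlying idea, assuming a simple input-perturbation notion of sensitivity: recompute the saliency under small input noise, measure how much the attribution ranking changes, and compare the result against a fixed cut-off. The finite-difference saliency, noise level, similarity metric, and threshold are illustrative placeholders rather than the paper's criteria.

```python
# Sketch of an input-perturbation sensitivity check for a saliency method.
# The saliency function, noise level, similarity metric, and threshold are
# illustrative assumptions; the paper's formal thresholds and criteria differ.
import numpy as np

def gradient_saliency(model_fn, x, eps=1e-3):
    """Finite-difference saliency: |d model_fn / d x_i| for each input feature."""
    base = model_fn(x)
    grads = np.zeros_like(x)
    for i in range(x.size):
        x_pert = x.copy()
        x_pert.flat[i] += eps
        grads.flat[i] = (model_fn(x_pert) - base) / eps
    return np.abs(grads)

def sensitivity_score(model_fn, x, noise_std=0.05, n_trials=10, seed=0):
    """Mean rank correlation between the saliency of x and of noisy copies of x."""
    rng = np.random.default_rng(seed)
    ref_ranks = gradient_saliency(model_fn, x).ravel().argsort().argsort()
    sims = []
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        noisy_ranks = gradient_saliency(model_fn, noisy).ravel().argsort().argsort()
        sims.append(np.corrcoef(ref_ranks, noisy_ranks)[0, 1])
    return float(np.mean(sims))

if __name__ == "__main__":
    # Toy stand-in for a trained network: a fixed linear scorer.
    w = np.array([0.5, -1.2, 2.0, 0.1])
    model_fn = lambda x: float(w @ x)
    x = np.array([1.0, 0.3, -0.7, 2.1])
    score = sensitivity_score(model_fn, x)
    THRESHOLD = 0.8  # illustrative cut-off, not the paper's formal threshold
    verdict = "overly sensitive" if score < THRESHOLD else "acceptably stable"
    print(f"mean rank correlation under noise: {score:.3f} -> {verdict}")
```

A real evaluation would substitute a trained CNN and the compared saliency methods, aggregate such scores over many inputs, and only then apply the formal thresholds and criteria the paper introduces.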
Related papers
- Sanity Checks for Saliency Methods Explaining Object Detectors [5.735035463793008]
Saliency methods are frequently used to explain Deep Neural Network-based models.
We perform sanity checks for object detection and define new qualitative criteria to evaluate the saliency explanations.
We find that EfficientDet-D0 is the most interpretable model, independent of the saliency method.
arXiv Detail & Related papers (2023-06-04T17:57:51Z) - Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance [72.50214227616728]
Interpretability methods are valuable only if their explanations faithfully describe the explained model.
We consider neural networks whose predictions are invariant under a specific symmetry group.
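As a rough illustration of the invariance/equivariance idea, and not the metrics of the cited paper, the sketch below checks whether a simple attribution transforms consistently when the input undergoes a symmetry (here a cyclic shift) that leaves a toy model's prediction unchanged; the model, explainer, and transformation are assumptions made only for this example.

```python
# Sketch of an explanation-equivariance check for a model whose prediction is
# invariant under cyclic shifts of the input. Model, explainer, and symmetry
# are illustrative assumptions, not the cited paper's setup.
import numpy as np

def model_fn(x):
    # Prediction depends only on a shift-invariant statistic of the input.
    return float(np.sum(x ** 2))

def explain(x, eps=1e-3):
    # Simple finite-difference attribution for model_fn.
    base = model_fn(x)
    return np.array([(model_fn(x + eps * np.eye(x.size)[i]) - base) / eps
                     for i in range(x.size)])

def equivariance_gap(x, shift=1):
    """How far the explanation is from transforming the way the input does."""
    e_orig = explain(x)
    e_shift = explain(np.roll(x, shift))
    # An equivariant explainer satisfies e_shift == np.roll(e_orig, shift).
    return float(np.max(np.abs(e_shift - np.roll(e_orig, shift))))

x = np.array([0.2, -1.0, 0.5, 1.5])
print("prediction unchanged under shift:", np.isclose(model_fn(x), model_fn(np.roll(x, 1))))
print("equivariance gap:", equivariance_gap(x))
```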
arXiv Detail & Related papers (2023-04-13T17:59:03Z) - On The Coherence of Quantitative Evaluation of Visual Explanations [0.7212939068975619]
Evaluation methods have been proposed to assess the "goodness" of visual explanations.
We study a subset of the ImageNet-1k validation set where we evaluate a number of different commonly-used explanation methods.
Results of our study suggest that there is a lack of coherency on the grading provided by some of the considered evaluation methods.
arXiv Detail & Related papers (2023-02-14T13:41:57Z) - Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z) - On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation [23.72825603188359]
We can improve the interpretability of explanations by allowing arbitrary text sequences as the explanation unit.
We propose a semantic-based evaluation metric that can better align with humans' judgment of explanations.
arXiv Detail & Related papers (2021-06-09T00:49:56Z) - Explaining by Removing: A Unified Framework for Model Explanation [14.50261153230204]
Removal-based explanations are based on the principle of simulating feature removal to quantify each feature's influence.
We develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence.
This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature.
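As a toy illustration of that removal principle, and not the framework's own formalism, the sketch below scores each feature by how much the model's output drops when the feature is replaced with a baseline value; the linear model and zero baseline are placeholder assumptions.

```python
# Minimal removal-based attribution: a feature's influence is the change in the
# model's output when that feature is replaced by a baseline value.
# The toy linear model and the zero baseline are illustrative assumptions only.
import numpy as np

def removal_attribution(model_fn, x, baseline=0.0):
    full = model_fn(x)
    scores = np.zeros(x.size)
    for i in range(x.size):
        x_removed = x.copy()
        x_removed[i] = baseline          # "remove" feature i
        scores[i] = full - model_fn(x_removed)
    return scores

# Toy stand-in for a trained classifier's scalar score.
w = np.array([1.0, -2.0, 0.5])
model_fn = lambda x: float(w @ x)

x = np.array([0.8, 1.5, -0.3])
print(removal_attribution(model_fn, x))  # per-feature influence estimates
```

This occlusion-style score is only one simple point in that space; varying how features are removed, what model behavior is explained, and how influence is summarized yields other members of the class.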
arXiv Detail & Related papers (2020-11-21T00:47:48Z) - Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks, which go beyond similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
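One simple way to quantify such agreement, shown here only as an illustrative sketch rather than the paper's actual protocol, is to rank tokens by their saliency scores and compute Average Precision against the binary human rationale mask; the scores and mask below are made-up placeholders.

```python
# Sketch: agreement between a method's saliency scores and binary human
# annotations of salient tokens, measured as Average Precision.
# The example scores and annotations are made-up placeholders.
import numpy as np

def average_precision(human_mask, saliency):
    """AP of the saliency ranking against the 0/1 human rationale mask."""
    order = np.argsort(-saliency)                    # tokens ranked by saliency
    hits = human_mask[order]
    precision_at_k = np.cumsum(hits) / (np.arange(hits.size) + 1)
    return float(np.sum(precision_at_k * hits) / max(hits.sum(), 1))

human_mask = np.array([0, 1, 1, 0, 0, 1])            # tokens marked salient by people
saliency   = np.array([0.1, 0.9, 0.4, 0.3, 0.05, 0.8])
print(f"agreement (AP): {average_precision(human_mask, saliency):.3f}")
```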
arXiv Detail & Related papers (2020-09-25T12:01:53Z) - Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for such feature-based explanations through robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z) - There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
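A minimal sketch of the simplest member of this class, plain gradient backpropagation, is given below; the untrained toy network and random input are placeholders, and the methods unified by the paper differ in how the backpropagated signal is modified.

```python
# Vanilla-gradient saliency sketch: backpropagate the top-class score to the
# input and take the absolute gradient as the importance map.
# The tiny random model and input are placeholders, not a trained network.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()

x = torch.rand(1, 3, 32, 32, requires_grad=True)
scores = model(x)
target = scores.argmax(dim=1).item()
scores[0, target].backward()                  # backpropagate the top-class score

saliency = x.grad.abs().max(dim=1).values     # collapse channels -> (1, 32, 32) map
print(saliency.shape)
```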
arXiv Detail & Related papers (2020-04-06T17:58:08Z)