Evaluations and Methods for Explanation through Robustness Analysis
- URL: http://arxiv.org/abs/2006.00442v2
- Date: Thu, 8 Apr 2021 21:18:01 GMT
- Title: Evaluations and Methods for Explanation through Robustness Analysis
- Authors: Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep Ravikumar,
Seungyeon Kim, Sanjiv Kumar, Cho-Jui Hsieh
- Abstract summary: We establish a novel set of evaluation criteria for such feature based explanations by robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
- Score: 117.7235152610957
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature based explanations, which provide the importance of each feature
towards the model prediction, are arguably one of the most intuitive ways to
explain a model. In this paper, we establish a novel set of evaluation criteria
for such feature based explanations by robustness analysis. In contrast to
existing evaluations, which require us to specify some way to "remove" features
that could inevitably introduce biases and artifacts, we make use of the subtler
notion of smaller adversarial perturbations. By optimizing towards our proposed
evaluation criteria, we obtain new explanations that are loosely necessary and
sufficient for a prediction. We further extend the explanation to extract the
set of features that would move the current prediction to a target class by
adopting targeted adversarial attack for the robustness analysis. Through
experiments across multiple domains and a user study, we validate the
usefulness of our evaluation criteria and our derived explanations.
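To make the criteria concrete, the following is a minimal sketch, not the authors' released implementation, of how a robustness-based evaluation could be computed for a feature-importance explanation: fix a candidate relevant set and measure the smallest adversarial perturbation that flips the prediction when only a chosen subset of features may be perturbed. The PyTorch classifier `model`, the per-feature map `importance`, and the crude PGD-style search are all illustrative assumptions.

```python
# Illustrative sketch only: a simple PGD-style search stands in for the
# paper's adversarial robustness computation; model and data handling assumed.
import torch
import torch.nn.functional as F

def min_flip_norm(model, x, mask, steps=200, step_size=0.01, max_eps=5.0):
    """Grow an L2 perturbation on the features where mask == 1 until the
    predicted class changes; return the perturbation norm (max_eps if no flip)."""
    model.eval()
    with torch.no_grad():
        y0 = model(x.unsqueeze(0)).argmax(dim=1)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        logits = model((x + delta * mask).unsqueeze(0))
        if logits.argmax(dim=1).item() != y0.item():      # prediction flipped
            return (delta * mask).norm().item()
        loss = F.cross_entropy(logits, y0)                 # ascend: move away from y0
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad / (delta.grad.norm() + 1e-12)
            if (delta * mask).norm() > max_eps:            # give up beyond the budget
                break
        delta.grad.zero_()
    return max_eps

def robustness_scores(model, x, importance, k):
    """Norm needed when perturbing only the top-k features (small suggests they
    are ~necessary) vs. only their complement (large suggests ~sufficient)."""
    flat = importance.flatten()
    relevant = torch.zeros_like(flat)
    relevant[flat.topk(k).indices] = 1.0
    relevant = relevant.view_as(importance)
    return (min_flip_norm(model, x, relevant),
            min_flip_norm(model, x, 1.0 - relevant))
```

Sweeping k and plotting the two returned norms gives evaluation curves in the spirit of the paper's criteria; the targeted extension described in the abstract would replace the untargeted loss with one that pushes the prediction toward a chosen target class.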
Related papers
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness and potential for misunderstanding in saliency-based explanations.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- On the stability, correctness and plausibility of visual explanation methods based on feature importance [0.0]
We study the interplay between the stability, correctness and plausibility of explanations based on feature importance for image classifiers.
We show that the existing metrics for evaluating these properties do not always agree, raising the issue of what constitutes a good evaluation metric for explanations.
arXiv Detail & Related papers (2023-10-25T08:59:21Z)
- Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors [3.4253416336476246]
We focus on the analysis of counterfactual, contrastive explanations.
We propose a new back-translation-inspired evaluation methodology.
We show that by iteratively feeding the counterfactual to the explainer we can obtain valuable insights into the behaviour of both the predictor and the explainer models.
arXiv Detail & Related papers (2023-05-26T16:04:28Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- The Solvability of Interpretability Evaluation Metrics [7.3709604810699085]
Feature attribution methods are often evaluated on metrics such as comprehensiveness and sufficiency (a removal-based sketch of these metrics appears after this list).
In this paper, we highlight an intriguing property of these metrics: their solvability.
We present a series of investigations showing that a beam-search explainer that directly optimizes these metrics is generally comparable or favorable to current choices.
arXiv Detail & Related papers (2022-05-18T02:52:03Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach [1.3750624267664155]
We produce estimates of the uncertainty of a given explanation by measuring the consensus amongst a set of diverse bootstrapped surrogate explainers.
We empirically illustrate the properties of this approach through experiments on state-of-the-art Convolutional Neural Network ensembles.
arXiv Detail & Related papers (2021-11-17T13:55:58Z)
- A Survey on the Robustness of Feature Importance and Counterfactual Explanations [12.599872913953238]
We present a survey of the works that analysed the robustness of two classes of local explanations.
The survey aims to unify existing definitions of robustness, introduce a taxonomy to classify different robustness approaches, and discuss some interesting results.
arXiv Detail & Related papers (2021-10-30T22:48:04Z)
- Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks, which go beyond similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
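For contrast with the robustness-based criteria above, here is a minimal sketch of the removal-based comprehensiveness and sufficiency metrics mentioned in the solvability entry, assuming a PyTorch classifier and zero-baseline masking; it is precisely this "remove and substitute a baseline" step that the main paper argues can introduce biases and artifacts.

```python
# Illustrative sketch of ERASER-style comprehensiveness / sufficiency, assuming
# a PyTorch classifier and a zero baseline (one of many possible removal choices).
import torch

def comprehensiveness_sufficiency(model, x, importance, k, baseline=0.0):
    """comprehensiveness: probability drop after removing the top-k features.
    sufficiency: probability drop when keeping only the top-k features."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x.unsqueeze(0)), dim=1)
        y0 = probs.argmax(dim=1)
        p0 = probs[0, y0].item()

        flat = importance.flatten()
        keep = torch.zeros_like(flat)
        keep[flat.topk(k).indices] = 1.0
        keep = keep.view_as(importance)

        x_removed = x * (1.0 - keep) + baseline * keep       # top-k removed
        x_only = x * keep + baseline * (1.0 - keep)           # only top-k kept

        p_removed = torch.softmax(model(x_removed.unsqueeze(0)), dim=1)[0, y0].item()
        p_only = torch.softmax(model(x_only.unsqueeze(0)), dim=1)[0, y0].item()
    return p0 - p_removed, p0 - p_only
```

The choice of `baseline` (zeros, a blur, a mean value, and so on) is a free design decision here, which illustrates why removal-based evaluations depend on how features are "removed".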