Do Explanations Explain? Model Knows Best
- URL: http://arxiv.org/abs/2203.02269v1
- Date: Fri, 4 Mar 2022 12:39:29 GMT
- Title: Do Explanations Explain? Model Knows Best
- Authors: Ashkan Khakzar, Pedram Khorsandi, Rozhin Nobahari, Nassir Navab
- Abstract summary: It is a mystery which input features contribute to a neural network's output.
We propose a framework for evaluating the explanations using the neural network model itself.
- Score: 39.86131552976105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is a mystery which input features contribute to a neural network's output.
Various explanation (feature attribution) methods are proposed in the
literature to shed light on the problem. One peculiar observation is that these
explanations (attributions) point to different features as being important. This
phenomenon raises the question: which explanation should we trust? We propose a
framework for evaluating the explanations using the neural network model
itself. The framework leverages the network to generate input features that
impose a particular behavior on the output. Using the generated features, we
devise controlled experimental setups to evaluate whether an explanation method
conforms to an axiom. Thus we propose an empirical framework for axiomatic
evaluation of explanation methods. We evaluate well-known and promising
explanation solutions using the proposed framework. The framework provides a
toolset to reveal properties and drawbacks within existing and future
explanation solutions.
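To make the framework concrete, here is a minimal sketch of one controlled test in the spirit of the abstract (not the authors' code): the model itself is used to synthesize a patch of "null" features that leave the output unchanged, and an attribution method is then scored on how much importance it mistakenly assigns to that patch. The function names, optimizer settings, and the `attribute` callback are illustrative assumptions; a generic PyTorch classifier is assumed.
```python
import torch
import torch.nn.functional as F

def generate_null_patch(model, x, mask, steps=200, lr=0.05):
    """Optimize the masked region of x so the model's output matches the
    output on the unmasked baseline, i.e. the generated features carry no
    information about the prediction ("null" features). Illustrative sketch."""
    baseline = x * (1 - mask)                 # input with the region removed
    with torch.no_grad():
        target = model(baseline)              # output the patch must preserve
    patch = torch.randn_like(x, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        out = model(baseline + patch * mask)  # only the masked region varies
        loss = F.mse_loss(out, target)        # impose output-invariance
        loss.backward()
        opt.step()
    return baseline + patch.detach() * mask

def null_feature_score(attribute, model, x, mask):
    """Fraction of attribution mass wrongly placed on the null features.
    `attribute` stands in for any saliency method (e.g. gradients, IG)."""
    x_null = generate_null_patch(model, x, mask)
    attr = attribute(model, x_null).abs()
    return (attr * mask).sum() / attr.sum()
```
A sound attribution method should give this score near zero: by construction the masked features impose no behavior on the output, so any importance assigned to them violates the corresponding axiom.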
Related papers
- How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors? [12.993027779814478]
We ask the question: can popular feature-additive explainers (e.g., LIME, SHAP, SHAPR, MAPLE, and PDP) explain feature-additive predictors?
Herein, we evaluate such explainers on ground truth that is analytically derived from the additive structure of a model.
Our results suggest that all explainers eventually fail to correctly attribute the importance of features, especially when a decision-making process involves feature interactions.
arXiv Detail & Related papers (2023-10-27T21:16:28Z)
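As a toy illustration of the ground-truth evaluation described in the entry above (a sketch, not the paper's protocol): for a feature-additive model, the exact Shapley value of each feature has a closed form, which gives an analytic reference to compare explainers against. The component functions `g`, the input, and the baseline below are invented for illustration.
```python
import itertools
import math
import numpy as np

# A feature-additive model: f(x) = sum_i g_i(x_i). Purely illustrative.
g = [np.sin, np.square, lambda v: 2.0 * v]
f = lambda x: sum(gi(xi) for gi, xi in zip(g, x))

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        for r in range(n):
            for s in itertools.combinations([j for j in range(n) if j != i], r):
                w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
                with_i = [x[j] if j in s or j == i else baseline[j] for j in range(n)]
                without = [x[j] if j in s else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without))
    return phi

x, base = np.array([0.3, -1.2, 0.8]), np.zeros(3)
# For an additive model the analytic ground truth is g_i(x_i) - g_i(base_i):
truth = np.array([gi(xi) - gi(bi) for gi, xi, bi in zip(g, x, base)])
assert np.allclose(shapley_values(f, x, base), truth)
```
The assert passes because, absent interactions, each feature's marginal contribution is the same for every coalition; once interaction terms are added, explainers can diverge from any additive ground truth, which is the failure mode the paper probes.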
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than ones produced with previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z)
- Explanatory Paradigms in Neural Networks [18.32369721322249]
We present a leap-forward expansion to the study of explainability in neural networks by considering explanations as answers to reasoning-based questions.
The answers to these questions are observed correlations, observed counterfactuals, and observed contrastive explanations respectively.
The term observed refers to the specific case of post-hoc explainability, when an explanatory technique explains the decision $P$ after a trained neural network has made the decision $P$.
arXiv Detail & Related papers (2022-02-24T00:22:11Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Convex optimization for actionable & plausible counterfactual explanations [9.104557591459283]
Transparency is an essential requirement of machine-learning-based decision-making systems deployed in the real world.
Counterfactual explanations are a particularly intuitive way of explaining the decisions of such systems.
In this work we enhance our previous work on convex modeling for computing counterfactual explanations by a mechanism for ensuring actionability and plausibility.
arXiv Detail & Related papers (2021-05-17T06:33:58Z)
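For a linear scorer, the counterfactual search described in the entry above reduces to a small convex program. The sketch below is a generic illustration using cvxpy, not the paper's exact formulation; the weights, bounds, and immutable-feature set are invented. It finds the closest input that crosses the decision boundary while freezing non-actionable features (actionability) and box-constraining the rest (a crude plausibility proxy).
```python
import cvxpy as cp
import numpy as np

# Invented linear decision function f(x) = w @ x + b; f(x) >= 0 means "approve".
w = np.array([0.8, -0.5, 1.2, 0.3])
b = -1.0
x0 = np.array([0.2, 1.5, 0.1, 0.4])            # original (rejected) input
immutable = [1]                                 # e.g. feature 1 cannot be acted on
lo, hi = np.zeros(4), np.ones(4) * 2.0          # plausible value ranges

xp = cp.Variable(4)
objective = cp.Minimize(cp.sum_squares(xp - x0))      # stay close to the original
constraints = [w @ xp + b >= 0.1,                     # cross the decision boundary
               xp >= lo, xp <= hi]                    # plausibility (box) constraints
constraints += [xp[i] == x0[i] for i in immutable]    # actionability: freeze features
cp.Problem(objective, constraints).solve()
print("counterfactual:", xp.value)
```
Because both the objective and the constraints are convex, the solver returns a globally optimal counterfactual, which is the main appeal of convex modeling over gradient-based counterfactual search.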
- A Taxonomy of Explainable Bayesian Networks [0.0]
We introduce a taxonomy of explainability in Bayesian networks.
We extend the existing categorisation of explainability in the model, reasoning or evidence to include explanation of decisions.
arXiv Detail & Related papers (2021-01-28T07:29:57Z)
- Explanation from Specification [3.04585143845864]
We formulate an approach where the type of explanation produced is guided by a specification.
Two examples are discussed: explanations for Bayesian networks using the theory of argumentation, and explanations for graph neural networks.
The approach is motivated by a theory of explanation in the philosophy of science, and it is related to current questions in the philosophy of science on the role of machine learning.
arXiv Detail & Related papers (2020-12-13T23:27:48Z)
- Evaluating Explanations: How much do explanations from the teacher aid students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
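The divergence the entry above describes is easy to reproduce on a trivial model. In this sketch (an invented OR-style example, not the paper's code), two redundant features each form a minimal sufficient subset on their own, while Shapley values split credit between them:
```python
from itertools import chain, combinations

# Trivial model: predicts 1 if ANY feature is on (logical OR). Invented example.
f = lambda x: int(any(x))
x = (1, 1, 0)                                   # two redundant "on" features

def minimal_sufficient_subsets(f, x, baseline=(0, 0, 0)):
    """Smallest feature sets that alone preserve f's prediction on x."""
    n, pred = len(x), f(x)
    subsets = chain.from_iterable(combinations(range(n), r) for r in range(n + 1))
    sufficient = [s for s in subsets
                  if f(tuple(x[i] if i in s else baseline[i] for i in range(n))) == pred]
    k = min(len(s) for s in sufficient)
    return [s for s in sufficient if len(s) == k]

print(minimal_sufficient_subsets(f, x))         # [(0,), (1,)] -- either feature suffices
# Shapley values would instead split credit: features 0 and 1 each get 0.5 and
# feature 2 gets 0, so no single feature looks individually decisive.
```
The two explainer families thus answer different questions: Shapley values allocate credit across redundant evidence, while minimal sufficient subsets isolate the smallest evidence that suffices on its own.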
This list is automatically generated from the titles and abstracts of the papers on this site.