SCOUT: Self-aware Discriminant Counterfactual Explanations
- URL: http://arxiv.org/abs/2004.07769v1
- Date: Thu, 16 Apr 2020 17:05:49 GMT
- Title: SCOUT: Self-aware Discriminant Counterfactual Explanations
- Authors: Pei Wang, Nuno Vasconcelos
- Abstract summary: The problem of counterfactual visual explanations is considered.
A new family of discriminant explanations is introduced.
The resulting counterfactual explanations are optimization free and thus much faster than previous methods.
- Score: 78.79534272979305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of counterfactual visual explanations is considered. A new family
of discriminant explanations is introduced. These produce heatmaps that
attribute high scores to image regions informative of a classifier prediction
but not of a counter class. They connect attributive explanations, which are
based on a single heat map, to counterfactual explanations, which account for
both predicted class and counter class. The latter are shown to be computable
by combination of two discriminant explanations, with reversed class pairs. It
is argued that self-awareness, namely the ability to produce classification
confidence scores, is important for the computation of discriminant
explanations, which seek to identify regions where it is easy to discriminate
between prediction and counter class. This suggests the computation of
discriminant explanations by the combination of three attribution maps. The
resulting counterfactual explanations are optimization free and thus much
faster than previous methods. To address the difficulty of their evaluation, a
proxy task and set of quantitative metrics are also proposed. Experiments under
this protocol show that the proposed counterfactual explanations outperform the
state of the art while achieving much higher speeds, for popular networks. In a
human-learning machine teaching experiment, they are also shown to improve mean
student accuracy from chance level to 95%.
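The abstract specifies the structure of the computation (a discriminant explanation built from three attribution maps, and a counterfactual explanation built from two discriminant maps with reversed class pairs) but not the exact combination rule. The sketch below only illustrates that structure, assuming the three maps are already available as arrays and using an elementwise product as a stand-in for the paper's actual rule.

```python
import numpy as np

def discriminant_map(attr_pred, attr_counter, confidence, eps=1e-8):
    """Combine three attribution maps into one discriminant explanation.

    attr_pred    -- attribution heatmap for the predicted class
    attr_counter -- attribution heatmap for the counter class
    confidence   -- self-awareness map (classifier confidence per region)

    The elementwise product used here is an illustrative stand-in for the
    combination rule defined in the paper.
    """
    def normalize(m):
        m = m - m.min()
        return m / (m.max() + eps)

    a, b, s = normalize(attr_pred), normalize(attr_counter), normalize(confidence)
    # High score: informative of the prediction, not of the counter class,
    # and located where the classifier is confident.
    return a * (1.0 - b) * s


def counterfactual_explanation(attr_pred, attr_counter, conf_pred, conf_counter):
    """Two discriminant maps with reversed class pairs form the
    counterfactual explanation (regions supporting each class over the other)."""
    heat_pred = discriminant_map(attr_pred, attr_counter, conf_pred)
    heat_counter = discriminant_map(attr_counter, attr_pred, conf_counter)
    return heat_pred, heat_counter


if __name__ == "__main__":
    # Stand-in 14x14 spatial maps; in practice these would come from an
    # attribution method applied to a classifier's feature maps.
    rng = np.random.default_rng(0)
    maps = [rng.random((14, 14)) for _ in range(4)]
    h_pred, h_counter = counterfactual_explanation(*maps)
    print(h_pred.shape, h_counter.shape)
```

Because the maps are only combined elementwise, no optimization is involved, which is consistent with the abstract's claim that the explanations are optimization free.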
Related papers
- Selective Explanations [14.312717332216073]
A machine learning model is trained to predict feature attribution scores with only one inference.
Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations.
We propose selective explanations, a novel feature attribution method that detects when amortized explainers generate low-quality explanations.
arXiv Detail & Related papers (2024-05-29T23:08:31Z)
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans (those that are logically consistent with the input) usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z)
- Counterfactual Explanations via Latent Space Projection and Interpolation [0.0]
We introduce SharpShooter, a method for binary classification that starts by creating a projected version of the input that is classified as the target class.
We then demonstrate that our framework translates core characteristics of a sample to its counterfactual through the use of learned representations.
arXiv Detail & Related papers (2021-12-02T00:07:49Z)
- Correcting Classification: A Bayesian Framework Using Explanation Feedback to Improve Classification Abilities [2.0931163605360115]
Explanations are social, meaning they are a transfer of knowledge through interactions.
We overcome these difficulties by training a Bayesian convolutional neural network (CNN) that uses explanation feedback.
Our proposed method utilizes this feedback for fine-tuning to correct the model such that the explanations and classifications improve.
arXiv Detail & Related papers (2021-04-29T13:59:21Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z) - Evaluating Explanations: How much do explanations from the teacher aid
students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
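(A toy sketch of this student-teacher protocol appears after this list.)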
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
- On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning [15.965337956587373]
PlausIble Exceptionality-based Contrastive Explanations (PIECE) modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class.
Two controlled experiments compare PIECE to others in the literature, showing that PIECE not only generates the most plausible counterfactuals on several measures, but also the best semifactuals.
arXiv Detail & Related papers (2020-09-10T14:48:12Z)
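The "Evaluating Explanations" entry above scores an explanation by how much it improves a student model that tries to imitate the teacher, which is also the spirit of the machine teaching experiment in the SCOUT abstract. The toy sketch below illustrates that protocol; the logistic-regression student, the synthetic teacher, and the feature-mask "explanation" are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_student(X, y, epochs=300, lr=0.1):
    """Minimal logistic-regression student trained to imitate teacher labels y."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid
        w -= lr * X.T @ (p - y) / len(y)   # gradient step on the log loss
    return w

def accuracy(w, X, y):
    return float(np.mean((X @ w > 0).astype(int) == y))

# Toy teacher: its labels depend only on the first 3 of 20 features.
X = rng.normal(size=(400, 20))
y = (X[:, :3].sum(axis=1) > 0).astype(int)
X_train, y_train, X_test, y_test = X[:20], y[:20], X[20:], y[20:]

# A helpful "explanation" marks exactly those 3 features as relevant;
# the student trained on the masked input has less noise to fit.
mask = np.zeros(20)
mask[:3] = 1.0

acc_plain = accuracy(train_student(X_train, y_train), X_test, y_test)
acc_explained = accuracy(train_student(X_train * mask, y_train), X_test * mask, y_test)
print(f"student accuracy without explanation: {acc_plain:.2f}")
print(f"student accuracy with explanation:    {acc_explained:.2f}")
```

With only 20 training examples, the masked student typically generalizes better to the held-out data, which is the sense in which the explanation adds value under this paradigm.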