Causal Explanations for Image Classifiers
- URL: http://arxiv.org/abs/2411.08875v1
- Date: Wed, 13 Nov 2024 18:52:42 GMT
- Title: Causal Explanations for Image Classifiers
- Authors: Hana Chockler, David A. Kelly, Daniel Kroening, Youcheng Sun
- Abstract summary: We present a novel black-box approach to computing explanations grounded in the theory of actual causality.
We present an algorithm for computing approximate explanations based on these definitions.
We demonstrate that rex is the most efficient tool and produces the smallest explanations.
- Score: 17.736724129275043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing algorithms for explaining the output of image classifiers use different definitions of explanations and a variety of techniques to extract them. However, none of the existing tools uses a principled approach based on formal definitions of causes and explanations for explanation extraction. In this paper we present a novel black-box approach to computing explanations grounded in the theory of actual causality. We prove relevant theoretical results and present an algorithm for computing approximate explanations based on these definitions. We prove termination of our algorithm and discuss its complexity and the amount of approximation compared to the precise definition. We implemented the framework in a tool, rex, and present experimental results and a comparison with state-of-the-art tools. We demonstrate that rex is the most efficient tool and produces the smallest explanations, in addition to outperforming other black-box tools on standard quality measures.
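To make the black-box setting concrete, the sketch below estimates a per-region responsibility map by occluding patches and measuring the drop in the model's confidence. It is a minimal illustration of querying a classifier purely as a black box, not the approximation algorithm implemented in rex; the `toy_classifier`, patch size, and baseline value are all assumptions for the example.

```python
import numpy as np

def occlusion_responsibility(image, classify, patch=8, baseline=0.0):
    """Rank image regions by how much occluding them changes the
    classifier's confidence in its original prediction.

    `classify` is any black-box function mapping an image to a vector
    of class probabilities; no gradients or internals are needed.
    """
    probs = classify(image)
    label = int(np.argmax(probs))          # original top-1 prediction
    base_conf = probs[label]

    h, w = image.shape[:2]
    heat = np.zeros((h, w))
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = baseline
            drop = base_conf - classify(occluded)[label]
            heat[y:y + patch, x:x + patch] = drop  # larger drop => more causal
    return label, heat

# Toy stand-in classifier: a "bright top-left quadrant" detector.
def toy_classifier(img):
    score = img[:16, :16].mean()
    return np.array([1.0 - score, score])

img = np.zeros((32, 32)); img[:16, :16] = 1.0
label, heat = occlusion_responsibility(img, toy_classifier)
print(label, heat.max())
```

Occlusion of this kind is the simplest instance of the causal intuition above: a region matters to the extent that intervening on it changes the output.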
Related papers
- Causal Identification of Sufficient, Contrastive and Complete Feature Sets in Image Classification [6.338178373376447]
We show that causal explanations enjoy the same formal properties as logic-based ones, while still lending themselves to black-box algorithms. We augment the definition of explanation with confidence awareness and introduce complete causal explanations. Our algorithms are efficiently computable, taking on average 6s per image on a ResNet50 model to compute all types of explanations.
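A minimal sufficiency test in this spirit, assuming an arbitrary black-box `classify` function and a zero baseline (both placeholders, not the paper's exact causal definitions):

```python
import numpy as np

def is_sufficient(image, mask, classify, baseline=0.0):
    """Check whether the pixels selected by `mask` alone preserve the
    classifier's original decision: everything outside the mask is
    replaced by the baseline value before re-querying the model.
    """
    label = int(np.argmax(classify(image)))    # original decision
    masked = np.where(mask, image, baseline)   # keep only the subset
    return int(np.argmax(classify(masked))) == label
```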
arXiv Detail & Related papers (2025-07-31T12:33:00Z) - Selective Explanations [14.312717332216073]
A machine learning model is trained to predict feature attribution scores with only one inference.
Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations.
We propose selective explanations, a novel feature attribution method that detects when amortized explainers generate low-quality explanations.
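A sketch of the selection idea, with `amortized`, `exact`, and `quality` as hypothetical callables standing in for the paper's components:

```python
def selective_attributions(x, amortized, exact, quality, threshold=0.8):
    """Return feature attributions for x: use the one-inference amortized
    explainer when its estimated quality clears the threshold, otherwise
    fall back to a slower but reliable exact explainer.
    """
    attr = amortized(x)                   # single forward pass
    if quality(x, attr) >= threshold:     # learned/heuristic quality score
        return attr, "amortized"
    return exact(x), "exact"              # e.g. an exhaustive attribution method
```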
arXiv Detail & Related papers (2024-05-29T23:08:31Z) - Causal Generative Explainers using Counterfactual Inference: A Case Study on the Morpho-MNIST Dataset [5.458813674116228]
We present a generative counterfactual inference approach to study the influence of visual features as well as causal factors.
We employ visual explanation methods from OmnixAI open source toolkit to compare them with our proposed methods.
This finding suggests that our methods are well-suited for generating highly interpretable counterfactual explanations on causal datasets.
arXiv Detail & Related papers (2024-01-21T04:07:48Z) - Multiple Different Black Box Explanations for Image Classifiers [12.619223781912705]
We describe an algorithm and a tool, MultEX, for computing multiple explanations of the output of a black-box image classifier. We analyze its theoretical complexity and evaluate MultEX against the state-of-the-art across three different models and three different datasets.
arXiv Detail & Related papers (2023-09-25T17:28:28Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z) - Don't Explain Noise: Robust Counterfactuals for Randomized Ensembles [50.81061839052459]
We formalize the generation of robust counterfactual explanations as a probabilistic problem.
We show the link between the robustness of ensemble models and the robustness of base learners.
Our method achieves high robustness with only a small increase in the distance from counterfactual explanations to their initial observations.
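The link between ensemble robustness and base-learner robustness suggests a simple validity check; the sketch below (an illustration, not the paper's probabilistic formulation) measures how many base learners a counterfactual convinces:

```python
import numpy as np

def ensemble_validity(x_cf, target, base_learners):
    """Fraction of base learners that classify the counterfactual x_cf
    as the target class. A counterfactual valid for most base learners
    remains valid even if the randomized ensemble is re-drawn.
    """
    votes = [int(np.argmax(model(x_cf))) == target for model in base_learners]
    return sum(votes) / len(votes)
```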
arXiv Detail & Related papers (2022-05-27T17:28:54Z) - On Efficiently Explaining Graph-Based Classifiers [16.199563506727316]
This paper shows that decision trees (DTs) may not be interpretable and proposes a polynomial-time algorithm for computing one PI-explanation of a DT.
In addition, the paper proposes a polynomial-time algorithm for computing one contrastive explanation.
arXiv Detail & Related papers (2021-06-02T17:55:41Z) - Compositional Explanations for Image Classifiers [18.24535957515688]
We present a novel, black-box algorithm for computing explanations that uses a principled approach based on causal theory.
We implement the method in the tool CET (Compositional Explanation Tool).
arXiv Detail & Related papers (2021-03-05T11:54:14Z) - Benchmarking and Survey of Explanation Methods for Black Box Models [9.747543620322956]
We provide a categorization of explanation methods based on the type of explanation returned.
We present the most recent and widely used explainers, and we show a visual comparison among explanations and a quantitative benchmarking.
arXiv Detail & Related papers (2021-02-25T18:50:29Z) - This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
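One plausible way to score such agreement (the paper defines its own diagnostic properties; the top-k binarization here is an assumption):

```python
import numpy as np

def saliency_human_agreement(saliency, human_mask, top_k=0.2):
    """IoU between the top-k fraction of saliency scores and a binary
    human annotation mask of salient input regions.
    """
    k = max(1, int(top_k * saliency.size))
    thresh = np.sort(saliency.ravel())[-k]   # k-th largest score
    pred = saliency >= thresh
    human = human_mask.astype(bool)
    union = np.logical_or(pred, human).sum()
    return np.logical_and(pred, human).sum() / union if union else 0.0
```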
arXiv Detail & Related papers (2020-09-25T12:01:53Z) - Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for such feature-based explanations via robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
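A greedy sketch of this targeted variant, assuming a black-box `classify` returning class probabilities over a 1-D feature vector and a zero baseline (illustrative only, not the paper's robustness-analysis procedure):

```python
import numpy as np

def features_toward_target(x, target, classify, baseline=0.0):
    """Greedily replace single features with the baseline, keeping each
    change that raises the target-class probability, and stop once the
    prediction flips to the target class.
    """
    x = x.copy()
    chosen = []
    for i in np.argsort(-np.abs(x)):   # try large-magnitude features first
        trial = x.copy()
        trial[i] = baseline
        if classify(trial)[target] > classify(x)[target]:
            x = trial
            chosen.append(i)
        if int(np.argmax(classify(x))) == target:
            break
    return chosen
```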
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.