Related papers: Generative causal explanations of black-box classifiers

Generative causal explanations of black-box classifiers

URL: http://arxiv.org/abs/2006.13913v2
Date: Thu, 22 Oct 2020 17:26:18 GMT
Title: Generative causal explanations of black-box classifiers
Authors: Matthew O'Shaughnessy, Gregory Canal, Marissa Connor, Mark Davenport, Christopher Rozell
Abstract summary: We develop a method for generating causal post-hoc explanations of black-box classifiers based on a learned low-dimensional representation of the data. We then demonstrate the practical utility of our method on image recognition tasks.
Score: 15.029443432414947
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We develop a method for generating causal post-hoc explanations of black-box classifiers based on a learned low-dimensional representation of the data. The explanation is causal in the sense that changing learned latent factors produces a change in the classifier output statistics. To construct these explanations, we design a learning framework that leverages a generative model and information-theoretic measures of causal influence. Our objective function encourages both the generative model to faithfully represent the data distribution and the latent factors to have a large causal influence on the classifier output. Our method learns both global and local explanations, is compatible with any classifier that admits class probabilities and a gradient, and does not require labeled attributes or knowledge of causal structure. Using carefully controlled test cases, we provide intuition that illuminates the function of our objective. We then demonstrate the practical utility of our method on image recognition tasks.

Related papers

Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data. We determine the types of distribution shifts that do contribute to the identifiability of causal representations. We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
Causal Generative Explainers using Counterfactual Inference: A Case Study on the Morpho-MNIST Dataset [5.458813674116228]
We present a generative counterfactual inference approach to study the influence of visual features as well as causal factors. We employ visual explanation methods from OmnixAI open source toolkit to compare them with our proposed methods. This finding suggests that our methods are well-suited for generating highly interpretable counterfactual explanations on causal datasets.
arXiv Detail & Related papers (2024-01-21T04:07:48Z)
A Causal Ordering Prior for Unsupervised Representation Learning [27.18951912984905]
Causal representation learning argues that factors of variation in a dataset are, in fact, causally related. We propose a fully unsupervised representation learning method that considers a data generation process with a latent additive noise model.
arXiv Detail & Related papers (2023-07-11T18:12:05Z)
CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc model XAI technique that provides contrastive explanations justifying the classification of a black box. Our method, which we refer to as CLIMAX, is based on local classifiers. We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
Streamlining models with explanations in the learning loop [0.0]
Several explainable AI methods allow a Machine Learning user to get insights on the classification process of a black-box model. We exploit this information to design a feature engineering phase, where we combine explanations with feature values.
arXiv Detail & Related papers (2023-02-15T16:08:32Z)
Explaining Image Classifiers Using Contrastive Counterfactuals in Generative Latent Spaces [12.514483749037998]
We introduce a novel method to generate causal and yet interpretable counterfactual explanations for image classifiers. We use this framework to obtain contrastive and causal sufficiency and necessity scores as global explanations for black-box classifiers.
arXiv Detail & Related papers (2022-06-10T17:54:46Z)
Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings. We then show that the causal effect, which severs all sources of confounding, remains invariant across domains. This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z)
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure. Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs. In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z)
Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models. Our method is based on projecting model representation to a latent space. Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples. We conduct a comparison between influence functions and common word-saliency methods on representative tasks. We develop a new measure based on influence functions that can reveal artifacts in training data.
arXiv Detail & Related papers (2020-05-14T00:45:23Z)
CausalVAE: Structured Causal Disentanglement in Variational Autoencoder [52.139696854386976]
The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations. We propose a new VAE based framework named CausalVAE, which includes a Causal Layer to transform independent factors into causal endogenous ones. Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy.
arXiv Detail & Related papers (2020-04-18T20:09:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.