Unsupervised Causal Binary Concepts Discovery with VAE for Black-box
Model Explanation
- URL: http://arxiv.org/abs/2109.04518v1
- Date: Thu, 9 Sep 2021 19:06:53 GMT
- Title: Unsupervised Causal Binary Concepts Discovery with VAE for Black-box
Model Explanation
- Authors: Thien Q. Tran, Kazuto Fukuchi, Youhei Akimoto, Jun Sakuma
- Abstract summary: We aim to explain a black-box classifier with the form: `data X is classified as class Y because X has A, B and does not have C'.
- Score: 28.990604269473657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We aim to explain a black-box classifier with the form: `data X is classified
as class Y because X \textit{has} A, B and \textit{does not have} C' in which
A, B, and C are high-level concepts. The challenge is that we have to discover
in an unsupervised manner a set of concepts, i.e., A, B and C, that is useful
for explaining the classifier. We first introduce a structural generative
model that is suitable to express and discover such concepts. We then propose a
learning process that simultaneously learns the data distribution and
encourages certain concepts to have a large causal influence on the classifier
output. Our method also allows easy integration of the user's prior knowledge to
induce high interpretability of concepts. Using multiple datasets, we
demonstrate that our method can discover useful binary concepts for
explanation.
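
The following is a minimal sketch, not the authors' implementation, of the kind of model the abstract describes: a VAE whose latent space contains binary "concept switch" variables g next to a continuous residual z, plus a penalty that rewards concepts whose flipping changes the prediction of a frozen black-box classifier. The layer sizes, the Gumbel-sigmoid relaxation, and the exact influence term are assumptions.

```python
# Minimal sketch (assumed architecture, not the authors' code) of a VAE with binary
# concept switches g and a continuous residual z. Flipping one concept should have a
# large causal effect on a frozen black-box classifier `blackbox`.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=8, n_concepts=3, h=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU())
        self.to_mu = nn.Linear(h, z_dim)
        self.to_logvar = nn.Linear(h, z_dim)
        self.to_logits = nn.Linear(h, n_concepts)   # binary concepts A, B, C, ...
        self.dec = nn.Sequential(nn.Linear(z_dim + n_concepts, h), nn.ReLU(),
                                 nn.Linear(h, x_dim))

    def encode(self, x, tau=0.5):
        hid = self.enc(x)
        mu, logvar = self.to_mu(hid), self.to_logvar(hid)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # Gaussian reparameterisation
        logits = self.to_logits(hid)
        u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
        g = torch.sigmoid((logits + u.log() - (1 - u).log()) / tau)  # relaxed Bernoulli sample
        return z, g, mu, logvar

    def decode(self, z, g):
        return self.dec(torch.cat([z, g], dim=-1))

def causal_influence_penalty(model, blackbox, z, g, concept_idx):
    """Reward concepts whose flip shifts the black-box output distribution."""
    mask = torch.zeros_like(g, dtype=torch.bool)
    mask[:, concept_idx] = True
    g_flip = torch.where(mask, 1.0 - g, g)
    p_keep = F.softmax(blackbox(model.decode(z, g)), dim=-1)       # blackbox assumed to return logits
    p_flip = F.softmax(blackbox(model.decode(z, g_flip)), dim=-1)
    return -(p_keep - p_flip).abs().sum(dim=-1).mean()  # negated: minimising it maximises influence

# Training would combine the usual ELBO (reconstruction plus KL terms) with a weighted
# sum of causal_influence_penalty over the concepts chosen to explain the classifier.
```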
Related papers
- I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? [76.15163242945813]
Large language models (LLMs) have led many to conclude that they exhibit a form of intelligence. We introduce a novel generative model that generates tokens on the basis of human-interpretable concepts represented as latent discrete variables.
arXiv Detail & Related papers (2025-03-12T01:21:17Z) - Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z) - Knowledge graphs for empirical concept retrieval [1.06378109904813]
Concept-based explainable AI is promising as a tool to improve the understanding of complex models at the premises of a given user.
Here, we present a workflow for user-driven data collection in both text and image domains.
We test the retrieved concept datasets on two concept-based explainability methods, namely concept activation vectors (CAVs) and concept activation regions (CARs).
arXiv Detail & Related papers (2024-04-10T13:47:22Z) - Explaining Explainability: Understanding Concept Activation Vectors [35.37586279472797]
Recent interpretability methods propose using concept-based explanations to translate internal representations of deep learning models into a language that humans are familiar with: concepts.
This requires understanding which concepts are present in the representation space of a neural network.
In this work, we investigate three properties of Concept Activation Vectors (CAVs), which are learnt using a probe dataset of concept exemplars.
We introduce tools designed to detect the presence of these properties, provide insight into how they affect the derived explanations, and offer recommendations to minimise their impact (a minimal CAV/TCAV sketch appears after this list).
arXiv Detail & Related papers (2024-04-04T17:46:20Z) - Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models [80.32412260877628]
We study how to learn human-interpretable concepts from data. Weaving together ideas from both fields, we show that concepts can be provably recovered from diverse data.
arXiv Detail & Related papers (2024-02-14T15:23:59Z) - A Geometric Notion of Causal Probing [85.49839090913515]
The linear subspace hypothesis states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.
We give a set of intrinsic criteria which characterize an ideal linear concept subspace.
We find that, for at least one concept across two language models, the concept subspace can be used to manipulate the concept value of the generated word with precision.
arXiv Detail & Related papers (2023-07-27T17:57:57Z) - Concept2Box: Joint Geometric Embeddings for Learning Two-View Knowledge
Graphs [77.10299848546717]
Concept2Box is a novel approach that jointly embeds the two views of a KG.
Box embeddings capture the hierarchical structure and complex relations such as overlap and disjointness among concepts.
We propose a novel vector-to-box distance metric and learn both embeddings jointly.
arXiv Detail & Related papers (2023-07-04T21:37:39Z) - Explaining Explainability: Towards Deeper Actionable Insights into Deep
Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z) - Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering [58.64831511644917]
We introduce an interpretable-by-design model that factors model decisions into intermediate human-legible explanations.
We show that our inherently interpretable system can improve by 4.64% over a comparable black-box system on reasoning-focused questions.
arXiv Detail & Related papers (2023-05-24T08:33:15Z) - Towards Human-Compatible XAI: Explaining Data Differentials with Concept
Induction over Background Knowledge [2.803567242358594]
We show that concept induction can be used to explain data differentials in the context of Explainable AI (XAI).
Our approach utilizes a large class hierarchy, curated from the Wikipedia category hierarchy, as background knowledge.
arXiv Detail & Related papers (2022-09-27T21:51:27Z) - Concept-Based Explanations for Tabular Data [0.0]
We propose a concept-based explainability method for Deep Neural Networks (DNNs).
We show the validity of our method in generating interpretability results that match human-level intuitions.
We also propose a notion of fairness based on TCAV that quantifies which layer of the DNN has learned representations that lead to biased predictions.
arXiv Detail & Related papers (2022-09-13T02:19:29Z) - Overlooked factors in concept-based explanations: Dataset choice,
concept learnability, and human capability [25.545486537295144]
Concept-based interpretability methods aim to explain deep neural network model predictions using a predefined set of semantic concepts.
Despite their popularity, they suffer from limitations that are not well understood or clearly articulated in the literature.
We analyze three commonly overlooked factors in concept-based explanations.
arXiv Detail & Related papers (2022-07-20T01:59:39Z) - DISSECT: Disentangled Simultaneous Explanations via Concept Traversals [33.65478845353047]
DISSECT is a novel approach to explaining deep learning model inferences.
By training a generative model from a classifier's signal, DISSECT offers a way to discover a classifier's inherent "notion" of distinct concepts.
We show that DISSECT produces concept traversals (CTs) that disentangle several concepts and are coupled to the classifier's reasoning due to joint training.
arXiv Detail & Related papers (2021-05-31T17:11:56Z) - Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representations onto a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z) - This is not the Texture you are looking for! Introducing Novel
Counterfactual Explanations for Non-Experts using Generative Adversarial
Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)