Human-Centered Concept Explanations for Neural Networks
- URL: http://arxiv.org/abs/2202.12451v1
- Date: Fri, 25 Feb 2022 01:27:31 GMT
- Title: Human-Centered Concept Explanations for Neural Networks
- Authors: Chih-Kuan Yeh, Been Kim, Pradeep Ravikumar
- Abstract summary: We introduce concept explanations including the class of Concept Activation Vectors (CAV).
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real world applications.
- Score: 47.71169918421306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding complex machine learning models such as deep neural networks
with explanations is crucial in various applications. Many explanations stem
from the model perspective, and may not necessarily effectively communicate why
the model is making its predictions at the right level of abstraction. For
example, providing importance weights to individual pixels in an image can only
express which parts of that particular image are important to the model, but
humans may prefer an explanation which explains the prediction by concept-based
thinking. In this work, we review the emerging area of concept-based
explanations. We start by introducing concept explanations including the class
of Concept Activation Vectors (CAV) which characterize concepts using vectors
in appropriate spaces of neural activations, and discuss different properties
of useful concepts, and approaches to measure the usefulness of concept
vectors. We then discuss approaches to automatically extract concepts, and
approaches to address some of their caveats. Finally, we discuss some case
studies that showcase the utility of such concept-based explanations in
synthetic settings and real world applications.
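
The abstract describes CAVs as vectors in a space of neural activations. A minimal sketch of one common formulation follows: fit a linear classifier that separates activations of concept examples from activations of random examples, and take the unit normal of its decision boundary as the concept vector. This is an illustration under stated assumptions, not the authors' implementation. The function names `fit_cav` and `concept_sensitivity`, the hyperparameters, and the synthetic activations are all hypothetical. In practice the activations would come from a chosen layer of a trained network.

```python
import numpy as np

def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a logistic-regression separator and return its unit normal vector.

    concept_acts, random_acts: arrays of shape (n_examples, n_features)
    holding layer activations (synthetic stand-ins in this sketch).
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on the weights
        b -= lr * np.mean(p - y)                # gradient step on the bias
    return w / np.linalg.norm(w)                # unit-length concept vector

def concept_sensitivity(grad_wrt_acts, cav):
    """Directional derivative of a model output along the concept vector.

    grad_wrt_acts: gradient of the output with respect to the activations,
    shape (n_examples, n_features). It is assumed to be computed elsewhere.
    """
    return grad_wrt_acts @ cav

# Synthetic activations (an assumption of this sketch): the concept shifts
# activations along the first coordinate axis.
rng = np.random.default_rng(0)
dim = 8
true_direction = np.zeros(dim)
true_direction[0] = 1.0
concept_acts = rng.normal(size=(200, dim)) + 2.0 * true_direction
random_acts = rng.normal(size=(200, dim))

cav = fit_cav(concept_acts, random_acts)
alignment = float(cav @ true_direction)  # cosine with the planted direction
print(f"alignment with planted direction: {alignment:.3f}")
```

A positive `concept_sensitivity` value for an input means that nudging its activations toward the concept would increase the model output being explained.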
Related papers
- LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions [15.381209058506078]
Prior works have associated concepts with neurons based on examples of concepts or a pre-defined set of concepts.
We propose to leverage multimodal large language models for automatic and open-ended concept discovery.
We validate each concept by generating examples and counterexamples and evaluating the neuron's response on this new set of images.
arXiv Detail & Related papers (2024-06-12T18:19:37Z)
- A survey on Concept-based Approaches For Model Improvement [2.1516043775965565]
Concepts are known to be the thinking ground of humans.
We provide a systematic review and taxonomy of various concept representations and their discovery algorithms in Deep Neural Networks (DNNs).
We also provide details on concept-based model improvement literature marking the first comprehensive survey of these methods.
arXiv Detail & Related papers (2024-03-21T17:09:20Z)
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z)
- Concept Activation Regions: A Generalized Framework For Concept-Based Explanations [95.94432031144716]
Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the deep neural network's latent space.
In this work, we propose allowing concept examples to be scattered across different clusters in the DNN's latent space.
This concept activation region (CAR) formalism yields global concept-based explanations and local concept-based feature importance.
arXiv Detail & Related papers (2022-09-22T17:59:03Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrated CG outperforms CAV in both toy examples and real world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
- ConceptDistil: Model-Agnostic Distillation of Concept Explanations [4.462334751640166]
Concept-based explanations aim to fill the model interpretability gap for non-technical humans-in-the-loop.
We propose ConceptDistil, a method to bring concept explanations to any black-box classifier using knowledge distillation.
We validate ConceptDistil in a real world use-case, showing that it is able to optimize both tasks.
arXiv Detail & Related papers (2022-05-07T08:58:54Z)
- Cause and Effect: Concept-based Explanation of Neural Networks [3.883460584034766]
We take a step in the interpretability of neural networks by examining their internal representations, or neurons' activations, against concepts.
We propose a framework to check the existence of a causal relationship between a concept (or its negation) and task classes.
arXiv Detail & Related papers (2021-05-14T18:54:17Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
- Interpretable Visual Reasoning via Induced Symbolic Space [75.95241948390472]
We study the problem of concept induction in visual reasoning, i.e., identifying concepts and their hierarchical relationships from question-answer pairs associated with images.
We first design a new framework named object-centric compositional attention model (OCCAM) to perform the visual reasoning task with object-level visual features.
We then come up with a method to induce concepts of objects and relations using clues from the attention patterns between objects' visual features and question words.
arXiv Detail & Related papers (2020-11-23T18:21:49Z)
- MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks [10.06397994266945]
We propose MACE: a Model Agnostic Concept Extractor, which can explain the working of a convolutional network through smaller concepts.
We validate our framework using VGG16 and ResNet50 CNN architectures, and on datasets like Animals With Attributes 2 (AWA2) and Places365.
arXiv Detail & Related papers (2020-11-03T04:40:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.