Concept Activation Regions: A Generalized Framework For Concept-Based
Explanations
- URL: http://arxiv.org/abs/2209.11222v1
- Date: Thu, 22 Sep 2022 17:59:03 GMT
- Title: Concept Activation Regions: A Generalized Framework For Concept-Based
Explanations
- Authors: Jonathan Crabbé and Mihaela van der Schaar
- Abstract summary: Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the deep neural network's latent space.
In this work, we propose allowing concept examples to be scattered across different clusters in the DNN's latent space.
This concept activation region (CAR) formalism yields global concept-based explanations and local concept-based feature importance.
- Score: 95.94432031144716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concept-based explanations make it possible to understand the predictions of a
deep neural network (DNN) through the lens of concepts specified by users. Existing
methods assume that the examples illustrating a concept are mapped in a fixed
direction of the DNN's latent space. When this holds true, the concept can be
represented by a concept activation vector (CAV) pointing in that direction. In
this work, we propose to relax this assumption by allowing concept examples to
be scattered across different clusters in the DNN's latent space. Each concept
is then represented by a region of the DNN's latent space that includes these
clusters and that we call concept activation region (CAR). To formalize this
idea, we introduce an extension of the CAV formalism that is based on the
kernel trick and support vector classifiers. This CAR formalism yields global
concept-based explanations and local concept-based feature importance. We prove
that CAR explanations built with radial kernels are invariant under latent
space isometries. In this way, CAR assigns the same explanations to latent
spaces that have the same geometry. We further demonstrate empirically that
CARs offer (1) more accurate descriptions of how concepts are scattered in the
DNN's latent space; (2) global explanations that are closer to human concept
annotations and (3) concept-based feature importance that meaningfully relate
concepts with each other. Finally, we use CARs to show that DNNs can
autonomously rediscover known scientific concepts, such as the prostate cancer
grading system.
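As a rough illustration of the formalism described in the abstract, the sketch below shows how a concept activation region might be fit in practice: latent activations of concept positives and negatives are separated with a support vector classifier using a radial (RBF) kernel, and a global explanation is obtained as the fraction of a class's examples whose activations fall inside the region. This is a minimal sketch based only on the abstract; the helper names (`feature_extractor`, `concept_pos`, `concept_neg`, `class_examples`) are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the CAR idea: represent a concept by a region of the DNN's
# latent space learned with a kernel support vector classifier, then measure
# how often a class of inputs activates that region.
# `feature_extractor`, `concept_pos`, `concept_neg`, and `class_examples` are
# hypothetical placeholders, not names from the paper's implementation.
import numpy as np
from sklearn.svm import SVC


def fit_car(feature_extractor, concept_pos, concept_neg):
    """Fit a concept activation region (CAR) on latent activations."""
    h_pos = feature_extractor(concept_pos)  # latent reps of concept examples
    h_neg = feature_extractor(concept_neg)  # latent reps of negative examples
    H = np.concatenate([h_pos, h_neg], axis=0)
    y = np.concatenate([np.ones(len(h_pos)), np.zeros(len(h_neg))])
    # Radial (RBF) kernel: per the abstract, explanations built with radial
    # kernels are invariant under isometries of the latent space.
    car = SVC(kernel="rbf")
    car.fit(H, y)
    return car


def tcar_score(car, feature_extractor, class_examples):
    """Global explanation: fraction of a class's examples whose latent
    representation falls inside the concept activation region."""
    h = feature_extractor(class_examples)
    return float(np.mean(car.predict(h) == 1))
```

Because a radial kernel depends only on pairwise distances between latent representations, any distance-preserving (isometric) transformation of the latent space leaves the fitted region, and hence the explanation, unchanged, which is the intuition behind the invariance result stated in the abstract.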
Related papers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- Local Concept Embeddings for Analysis of Concept Distributions in DNN Feature Spaces [1.0923877073891446]
We propose a novel concept analysis framework for deep neural networks (DNNs)
Instead of optimizing a single global concept vector on the complete dataset, it generates a local concept embedding (LoCE) vector for each individual sample.
Despite its context sensitivity, our method's concept segmentation performance is competitive with global baselines.
arXiv Detail & Related papers (2023-11-24T12:22:00Z)
- Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images.
We present Geom-Erasing, a novel concept removal method based on geometry-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)
- Emergence of Concepts in DNNs? [0.0]
The paper first examines how existing methods actually identify concepts that are supposedly represented in DNNs.
Second, it discusses how conceptual spaces are shaped by a tradeoff between predictive accuracy and compression.
arXiv Detail & Related papers (2022-11-11T11:25:39Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV on both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
- Sparse Subspace Clustering for Concept Discovery (SSCCD) [1.7319807100654885]
Concepts are key building blocks of higher level human understanding.
Local attribution methods do not allow one to identify coherent model behavior across samples.
We put forward a new definition of concepts as low-dimensional subspaces of hidden feature layers.
arXiv Detail & Related papers (2022-03-11T16:15:48Z)
- Human-Centered Concept Explanations for Neural Networks [47.71169918421306]
We introduce concept explanations, including the class of Concept Activation Vectors (CAV).
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real world applications.
arXiv Detail & Related papers (2022-02-25T01:27:31Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)