PACE: Posthoc Architecture-Agnostic Concept Extractor for Explaining CNNs
- URL: http://arxiv.org/abs/2108.13828v1
- Date: Tue, 31 Aug 2021 13:36:15 GMT
- Title: PACE: Posthoc Architecture-Agnostic Concept Extractor for Explaining CNNs
- Authors: Vidhya Kamakshi, Uday Gupta and Narayanan C Krishnan
- Abstract summary: We introduce a Posthoc Architecture-agnostic Concept Extractor (PACE) that automatically extracts smaller sub-regions of the image.
PACE tightly integrates the faithfulness of the explanatory framework to the black-box model.
The results from these experiments suggest that over 72% of the concepts extracted by PACE are human interpretable.
- Score: 3.0724051098062097
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep CNNs, though they have achieved state-of-the-art performance in image classification tasks, remain black boxes to the humans using them. There is growing interest in explaining the working of these deep models to improve their trustworthiness. In this paper, we introduce a Posthoc Architecture-agnostic Concept Extractor (PACE) that automatically extracts smaller sub-regions of the image, called concepts, that are relevant to the black-box prediction. PACE tightly integrates the faithfulness of the explanatory framework to the black-box model. To the best of our knowledge, this is the first work that automatically extracts class-specific discriminative concepts in a posthoc manner. The PACE framework is used to generate explanations for two different CNN architectures trained to classify the AWA2 and Imagenet-Birds datasets. Extensive human subject experiments are conducted to validate the human interpretability and consistency of the explanations extracted by PACE. The results from these experiments suggest that over 72% of the concepts extracted by PACE are human interpretable.
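The abstract does not spell out how PACE isolates its concepts, so the snippet below is only a hypothetical, minimal sketch of a generic posthoc, architecture-agnostic concept-extraction pipeline: cluster the spatial activations of a frozen CNN into candidate sub-region concepts for one class. The backbone, image paths, and cluster count are illustrative placeholders, not details taken from the paper.

```python
# NOT the PACE algorithm: a generic posthoc concept-extraction sketch that
# clusters sub-region activations of a frozen CNN into candidate concepts.
import glob

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.cluster import KMeans

device = "cuda" if torch.cuda.is_available() else "cpu"

# Architecture-agnostic in the sense that any classifier backbone can be
# plugged in; the explainer only consumes its final feature maps.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2]).to(device).eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


def patch_embeddings(image_paths):
    """Return one embedding per spatial location of the final feature map,
    i.e. per small sub-region of each input image."""
    rows = []
    with torch.no_grad():
        for path in image_paths:
            x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
            fmap = feature_extractor(x)                      # (1, C, H, W)
            rows.append(fmap.flatten(2).squeeze(0).T.cpu())  # (H*W, C)
    return torch.cat(rows)


# Cluster the sub-region embeddings of one class; each cluster is a candidate
# class-level concept. The folder path and cluster count are hypothetical.
class_images = glob.glob("awa2/zebra/*.jpg")
embeddings = patch_embeddings(class_images).numpy()
concepts = KMeans(n_clusters=10, n_init=10, random_state=0).fit(embeddings)
# The sub-regions closest to each cluster centre would then be shown to human
# annotators, mirroring the kind of interpretability study the paper reports.
```

Because only feature maps are consumed, any CNN can be swapped in without changing the clustering step, which is the sense in which such an explainer is architecture-agnostic; how PACE itself ties the extracted concepts to the black-box prediction (its faithfulness component) is described in the paper, not here.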
Related papers
- ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models [60.30998818833206]
ICE, short for Intrinsic Concept Extraction, is a novel framework for extracting intrinsic concepts from a single image.
Our framework demonstrates superior performance on intrinsic concept extraction from a single image in an unsupervised manner.
arXiv Detail & Related papers (2025-03-25T17:58:29Z)
- Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models [5.985204759362746]
We present a unified framework for transforming any vision neural network into a spatially and conceptually interpretable model.
We name this method the "Spatially-Aware and Label-Free Concept Bottleneck Model" (SALF-CBM).
arXiv Detail & Related papers (2025-02-27T14:27:55Z)
- Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification [3.9626211140865464]
Convolutional Neural Networks (CNNs) have seen significant performance improvements in recent years.
However, due to their size and complexity, they function as black-boxes, leading to transparency concerns.
This paper introduces a novel post-hoc explainability framework, Visual-TCAV, which aims to bridge the gap between saliency-based and concept-based attribution methods.
arXiv Detail & Related papers (2024-11-08T16:52:52Z)
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- Understanding Multimodal Deep Neural Networks: A Concept Selection View [29.08342307127578]
Concept-based models map the black-box visual representations extracted by deep neural networks onto a set of human-understandable concepts.
We propose a two-stage Concept Selection Model (CSM) to mine core concepts without introducing any human priors.
Our approach achieves comparable performance to end-to-end black-box models.
arXiv Detail & Related papers (2024-04-13T11:06:49Z)
- Feature CAM: Interpretable AI in Image Classification [2.4409988934338767]
There is a lack of trust in using Artificial Intelligence in critical and high-precision fields such as security, finance, health, and manufacturing.
We introduce a novel technique, Feature CAM, which combines perturbation- and activation-based approaches to create fine-grained, class-discriminative visualizations.
The resulting saliency maps proved to be 3-4 times more human-interpretable than the state of the art in ABM.
arXiv Detail & Related papers (2024-03-08T20:16:00Z)
- Identifying Interpretable Subspaces in Image Representations [54.821222487956355]
We propose a framework to explain features of image representations using Contrasting Concepts (FALCON).
For a target feature, FALCON captions its highly activating cropped images using a large captioning dataset and a pre-trained vision-language model like CLIP.
Each word among the captions is scored and ranked, leading to a small number of shared, human-understandable concepts (a minimal sketch of this scoring step is given after this list).
arXiv Detail & Related papers (2023-07-20T00:02:24Z)
- Concept Activation Regions: A Generalized Framework For Concept-Based Explanations [95.94432031144716]
Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the deep neural network's latent space.
In this work, we propose allowing concept examples to be scattered across different clusters in the DNN's latent space.
This concept activation region (CAR) formalism yields global concept-based explanations and local concept-based feature importance.
arXiv Detail & Related papers (2022-09-22T17:59:03Z)
- FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations [99.54048050189971]
We present a framework for learning new visual concepts quickly, guided by multiple naturally occurring data streams.
The learned concepts support downstream applications, such as answering questions by reasoning about unseen images.
We demonstrate the effectiveness of our model on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-03-30T19:45:00Z)
- CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models [84.32751938563426]
We propose a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN).
In contrast to current XAI methods that generate explanations as a single-shot response, we pose explanation as an iterative communication process.
Our framework generates a sequence of explanations in a dialog by mediating the differences between the minds of the machine and the human user.
arXiv Detail & Related papers (2021-09-03T09:46:20Z)
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)
- MACE: Model Agnostic Concept Extractor for Explaining Image Classification Networks [10.06397994266945]
We propose MACE: a Model Agnostic Concept Extractor, which can explain the working of a convolutional network through smaller concepts.
We validate our framework using VGG16 and ResNet50 CNN architectures, and on datasets like Animals With Attributes 2 (AWA2) and Places365.
arXiv Detail & Related papers (2020-11-03T04:40:49Z)
- Black Box Explanation by Learning Image Exemplars in the Latent Feature Space [20.16179026989117]
We present an approach to explain the decisions of black box models for image classification.
Our method exploits the latent feature space learned through an adversarial autoencoder.
We show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
arXiv Detail & Related papers (2020-01-27T15:42:14Z)
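One recurring building block in the entries above, made explicit in the FALCON entry under "Identifying Interpretable Subspaces in Image Representations", is scoring candidate concept words against a highly activating image crop with a pre-trained vision-language model. The sketch below is a hypothetical illustration of that scoring step using the Hugging Face CLIP interface; the checkpoint, crop file, and word list are placeholders, and this is not any of the papers' released code.

```python
# Hypothetical CLIP-based word scoring for one highly activating crop.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

crop = Image.open("highly_activating_crop.jpg")                 # placeholder crop
candidate_words = ["stripes", "fur", "beak", "wheel", "grass"]  # e.g. mined from captions

inputs = processor(text=candidate_words, images=crop, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image has shape (1, num_words); a higher score means CLIP judges
# the word a better match for the crop.
scores = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
for word, score in sorted(zip(candidate_words, scores.tolist()), key=lambda t: -t[1]):
    print(f"{word}: {score:.3f}")
```

Repeating this over many crops and keeping the consistently top-ranked words yields the small set of shared, human-understandable concepts that entry describes.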