Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification
- URL: http://arxiv.org/abs/2405.03301v1
- Date: Mon, 6 May 2024 09:21:35 GMT
- Title: Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification
- Authors: Matteo Bianchi, Antonio De Santis, Andrea Tocchetti, Marco Brambilla
- Abstract summary: State-of-the-art explainability methods generate saliency maps to show where a specific class is identified.
We introduce a post-hoc method that explains the entire feature extraction process of a Convolutional Neural Network.
We also show an approach to generate global explanations by aggregating labels across multiple images.
- Score: 5.087579454836169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transparency and explainability in image classification are essential for establishing trust in machine learning models and detecting biases and errors. State-of-the-art explainability methods generate saliency maps to show where a specific class is identified, without providing a detailed explanation of the model's decision process. Striving to address such a need, we introduce a post-hoc method that explains the entire feature extraction process of a Convolutional Neural Network. These explanations include a layer-wise representation of the features the model extracts from the input. Such features are represented as saliency maps generated by clustering and merging similar feature maps, to which we associate a weight derived by generalizing Grad-CAM for the proposed methodology. To further enhance these explanations, we include a set of textual labels collected through a gamified crowdsourcing activity and processed using NLP techniques and Sentence-BERT. Finally, we show an approach to generate global explanations by aggregating labels across multiple images.
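Below is a minimal, illustrative sketch of the pipeline the abstract describes: hook one convolutional layer, cluster its feature maps by similarity, merge each cluster into a saliency map, and attach a Grad-CAM-style pooled-gradient weight. The backbone (ResNet-50), the inspected layer, the clustering algorithm, and the cluster count are assumptions for illustration, not the authors' exact choices.
```python
# Illustrative sketch only, not the authors' code. Backbone, layer,
# cluster count, and clustering algorithm are assumptions.
import torch
import torch.nn.functional as F
from sklearn.cluster import AgglomerativeClustering
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

acts, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: acts.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed input image
model(x)[0].max().backward()      # backprop the top class score

A = acts["a"][0].detach()         # (C, h, w) feature maps of the hooked layer
G = grads["g"][0]                 # (C, h, w) gradients of the class score

# Cluster channels whose normalized activation patterns are similar.
flat = F.normalize(A.flatten(1), dim=1).numpy()
n_clusters = 8                    # arbitrary; the paper derives its own grouping
labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(flat)

maps, weights = [], []
for k in range(n_clusters):
    members = torch.from_numpy(labels == k)
    maps.append(A[members].mean(0).relu())    # merged saliency map for cluster k
    weights.append(G[members].mean().item())  # Grad-CAM-style pooled-gradient weight
```
The crowdsourced textual labels would then be attached to these clustered maps, with Sentence-BERT embeddings used to consolidate similar labels; that stage is omitted here.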
Related papers
- P-TAME: Explain Any Image Classifier with Trained Perturbations [14.31574090533474]
P-TAME (Perturbation-based Trainable Attention Mechanism for Explanations) is a model-agnostic method for explaining Deep Neural Networks (DNNs).
It generates high-resolution explanations in a single forward pass during inference.
We apply P-TAME to explain the decisions of VGG-16, ResNet-50, and ViT-B-16, three distinct and widely used image classifiers; a sketch of the trainable-attention idea follows.
arXiv Detail & Related papers (2025-01-29T18:06:08Z)
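A hedged sketch of the general idea behind a perturbation-based trainable attention mechanism, not P-TAME's actual architecture or loss: a small head learns to emit a mask such that the masked input still classifies correctly while the mask stays sparse.
```python
# Hedged sketch; architecture, loss, and sparsity weight are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionHead(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(in_ch, 64, 1), nn.ReLU(),
                                 nn.Conv2d(64, 1, 1), nn.Sigmoid())

    def forward(self, feats):       # feats: (B, C, h, w) backbone features
        return self.net(feats)      # (B, 1, h, w) mask in [0, 1]

def explanation_loss(frozen_clf, x, mask, target, sparsity=0.5):
    m = F.interpolate(mask, size=x.shape[-2:], mode="bilinear",
                      align_corners=False)
    logits = frozen_clf(x * m)                # classify the masked image
    keep = F.cross_entropy(logits, target)    # masked input must still classify
    return keep + sparsity * m.mean()         # while the mask stays small
```
After training, one forward pass through backbone and head yields the mask, matching the single-pass property claimed above.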
- COMIX: Compositional Explanations using Prototypes [46.15031477955461]
We propose a method to align machine representations with human understanding.
The proposed method, named COMIX, classifies an image by decomposing it into regions based on learned concepts.
We show that our method provides faithful explanations and that its efficiency is competitive with other inherently interpretable architectures; a rough sketch of prototype-based evidence follows.
arXiv Detail & Related papers (2025-01-10T15:40:31Z)
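As a rough illustration of prototype-based regional evidence (COMIX's exact decomposition and training differ), each learned concept vector is matched against spatial features, and its strongest response location and score feed the classifier.
```python
# Illustrative only; `prototypes` are assumed learned concept vectors.
import torch
import torch.nn.functional as F

def prototype_evidence(feats, prototypes):
    """feats: (C, H, W) spatial features; prototypes: (P, C) concept vectors."""
    f = F.normalize(feats.flatten(1), dim=0)    # (C, H*W), unit columns
    p = F.normalize(prototypes, dim=1)          # (P, C), unit rows
    sim = p @ f                                 # (P, H*W) cosine similarities
    evidence = sim.max(dim=1).values            # strongest response per concept
    regions = sim.argmax(dim=1)                 # where each concept fires
    return evidence, regions                    # evidence feeds a linear classifier
```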
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with the current representatives, as sketched below.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
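A minimal sketch of one FEC-style alternation step, assuming per-pixel features and K representatives; a k-means-like simplification of the paper's procedure.
```python
# Minimal sketch; blending factor is an assumption, empty clusters unhandled.
import torch

def fec_step(pix_feats, centers, mix=0.5):
    """pix_feats: (N, D) per-pixel features; centers: (K, D) representatives."""
    # grouping: assign every pixel to its nearest representative
    assign = torch.cdist(pix_feats, centers).argmin(dim=1)           # (N,)
    # abstraction: recompute each representative from its members
    centers = torch.stack([pix_feats[assign == k].mean(0)
                           for k in range(centers.shape[0])])
    # updating: blend each pixel's feature with its current representative
    return (1 - mix) * pix_feats + mix * centers[assign], centers
```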
- EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models [52.3015009878545]
We develop an image segmentor capable of generating fine-grained segmentation maps without any additional training.
Our framework identifies semantic correspondences between image pixels and spatial locations of low-dimensional feature maps.
In extensive experiments, the produced segmentation maps are shown to be well delineated and to capture detailed parts of the images; a simplified sketch of the pixel-to-feature correspondence follows.
arXiv Detail & Related papers (2024-01-22T07:34:06Z)
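A simplified sketch of the correspondence idea: cluster the low-resolution feature vectors, then label every pixel by its nearest cluster center. EmerDiff derives its pixel-level correspondences from the diffusion model itself, not the plain upsampling used here.
```python
# Simplified, illustrative version; cluster count k is an assumption.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def segment_from_features(feat, out_hw, k=5):
    """feat: (C, h, w) low-res feature maps; out_hw: (H, W) output size."""
    km = KMeans(n_clusters=k, n_init=10).fit(feat.flatten(1).T.numpy())
    centers = torch.from_numpy(km.cluster_centers_).float()          # (k, C)
    up = F.interpolate(feat[None], size=out_hw, mode="bilinear",
                       align_corners=False)[0]                       # (C, H, W)
    return torch.cdist(up.flatten(1).T, centers).argmin(1).reshape(out_hw)
```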
- Feature Activation Map: Visual Explanation of Deep Learning Models for Image Classification [17.373054348176932]
In this work, a post-hoc interpretation tool named feature activation map (FAM) is proposed.
FAM can interpret deep learning models that do not use fully connected (FC) layers as a classifier.
Experiments conducted on ten deep learning models for few-shot image classification, contrastive learning image classification and image retrieval tasks demonstrate the effectiveness of the proposed FAM algorithm.
arXiv Detail & Related papers (2023-07-11T05:33:46Z)
- Rethinking interpretation: Input-agnostic saliency mapping of deep visual classifiers [28.28834523468462]
Saliency methods provide post-hoc model interpretation by attributing input features to the model outputs.
We show that input-specific saliency mapping is intrinsically susceptible to misleading feature attribution.
We introduce a new perspective of input-agnostic saliency mapping that computationally estimates the high-level features attributed by the model to its outputs.
arXiv Detail & Related papers (2023-03-31T06:58:45Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations; a minimal sketch of such an equivariance prior follows.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
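A minimal sketch of a dense-equivariance prior, with a horizontal flip standing in for a general warp; LEAD's actual objective aligns feature-similarity distributions rather than raw features.
```python
# Illustrative prior: flipping should commute with the encoder.
import torch
import torch.nn.functional as F

def equivariance_loss(encoder, img):
    """encoder maps (B, 3, H, W) images to (B, D, h, w) dense features."""
    f = encoder(img)
    f_of_flipped = encoder(torch.flip(img, dims=[-1]))
    # if features are equivariant, the two flip orders agree
    return F.mse_loss(f_of_flipped, torch.flip(f, dims=[-1]))
```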
- Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, attribution methods produce lower-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models; a sketch of the block-wise fusion step follows.
arXiv Detail & Related papers (2020-10-01T20:27:30Z)
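A hedged sketch of the block-wise aggregation step only; the paper's attribution-based input sampling and fusion scheme are more elaborate than the simple mean used here.
```python
# Illustrative fusion of per-layer attribution maps at a common resolution.
import torch
import torch.nn.functional as F

def fuse_layer_maps(layer_maps, out_hw):
    """layer_maps: list of (H_i, W_i) attribution maps from selected layers."""
    ups = [F.interpolate(m[None, None], size=out_hw, mode="bilinear",
                         align_corners=False)[0, 0] for m in layer_maps]
    fused = torch.stack(ups).mean(0).clamp(min=0)   # mean fusion, keep positives
    return fused / (fused.max() + 1e-8)             # normalize to [0, 1]
```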
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
The cross-domain few-shot classification (CD-FSC) task combines few-shot classification with the requirement to generalize across domains represented by different datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained to synthesize images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights; see the sketch after this entry.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
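The closed-form algorithm (SeFa) reduces to an eigendecomposition of the pre-trained projection weights; a sketch, assuming A is the weight matrix that first consumes the latent code.
```python
# Sketch of the SeFa formulation: semantic directions are the top
# eigenvectors of A^T A; no training or sampling involved.
import numpy as np

def closed_form_directions(A, n_dirs=5):
    """A: (out_dim, latent_dim) projection weight of the generator."""
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)      # symmetric, so eigh
    order = np.argsort(eigvals)[::-1][:n_dirs]      # largest eigenvalues first
    return eigvecs[:, order].T                      # (n_dirs, latent_dim)

# Usage: edit a latent code z along the i-th discovered direction.
# z_edited = z + alpha * closed_form_directions(A)[i]
```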