InfoDisent: Explainability of Image Classification Models by Information Disentanglement
- URL: http://arxiv.org/abs/2409.10329v1
- Date: Mon, 16 Sep 2024 14:39:15 GMT
- Title: InfoDisent: Explainability of Image Classification Models by Information Disentanglement
- Authors: Łukasz Struski, Jacek Tabor
- Abstract summary: We introduce InfoDisent, a hybrid model that combines the advantages of both post-hoc and intrinsic approaches.
By utilizing an information bottleneck, InfoDisent disentangles the information in the final layer of a pre-trained deep network.
We validate the effectiveness of InfoDisent on benchmark datasets such as ImageNet, CUB-200-2011, Stanford Cars, and Stanford Dogs.
- Score: 9.380255522558294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the decisions made by image classification networks is a critical area of research in deep learning. This task is traditionally divided into two distinct approaches: post-hoc methods and intrinsic methods. Post-hoc methods, such as GradCam, aim to interpret the decisions of pre-trained models by identifying regions of the image where the network focuses its attention. However, these methods provide only a high-level overview, making it difficult to fully understand the network's decision-making process. Conversely, intrinsic methods, like prototypical parts models, offer a more detailed understanding of network predictions but are constrained by specific architectures, training methods, and datasets. In this paper, we introduce InfoDisent, a hybrid model that combines the advantages of both approaches. By utilizing an information bottleneck, InfoDisent disentangles the information in the final layer of a pre-trained deep network, enabling the breakdown of classification decisions into basic, understandable atomic components. Unlike standard prototypical parts approaches, InfoDisent can interpret the decisions of pre-trained classification networks and be used for making classification decisions, similar to intrinsic models. We validate the effectiveness of InfoDisent on benchmark datasets such as ImageNet, CUB-200-2011, Stanford Cars, and Stanford Dogs for both convolutional and transformer backbones.
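To make the mechanism concrete, here is a minimal sketch of the idea described in the abstract: an information-bottleneck-style head that re-projects the final feature map of a frozen, pre-trained backbone into sparse "atomic" channels whose pooled activations feed a linear classifier, so each prediction decomposes into a few understandable components. All names, layer sizes, the max-pooling choice, and the L1 penalty standing in for the bottleneck are illustrative assumptions, not the authors' implementation.
```python
# A minimal, hypothetical sketch of the abstract's idea; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

class DisentangledHead(nn.Module):
    def __init__(self, in_channels: int, num_atoms: int, num_classes: int):
        super().__init__()
        # a 1x1 convolution re-mixes backbone channels into "atoms"
        self.project = nn.Conv2d(in_channels, num_atoms, kernel_size=1)
        # one weight per (atom, class) pair keeps the decision additive
        self.classifier = nn.Linear(num_atoms, num_classes, bias=False)

    def forward(self, feats: torch.Tensor):
        atoms = F.relu(self.project(feats))   # (B, A, H, W)
        pooled = atoms.amax(dim=(2, 3))       # strongest response per atom
        logits = self.classifier(pooled)      # (B, num_classes)
        return logits, atoms                  # atom maps act as explanations

backbone = models.resnet50(weights="IMAGENET1K_V2")
features = nn.Sequential(*list(backbone.children())[:-2])  # drop pool + fc
for p in features.parameters():
    p.requires_grad_(False)                   # the backbone stays frozen

head = DisentangledHead(in_channels=2048, num_atoms=256, num_classes=1000)
x = torch.randn(2, 3, 224, 224)               # stand-in for two images
logits, atoms = head(features(x))

# A crude bottleneck surrogate: an L1 penalty pushes most atom maps toward
# zero, so each prediction decomposes into a few active components.
loss = F.cross_entropy(logits, torch.randint(0, 1000, (2,))) \
       + 1e-4 * atoms.abs().mean()
```
Under this sketch, the spatial map of each active atom can be upsampled and overlaid on the input, playing the role a prototypical part plays in intrinsic models.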
Related papers
- Interpretable Image Classification via Non-parametric Part Prototype Learning [14.390730075612248]
Classifying images with an interpretable decision-making process is a long-standing problem in computer vision.
In recent years, Prototypical Part Networks have gained traction as an approach for self-explainable neural networks.
We present a framework for part-based interpretable image classification that learns a set of semantically distinctive object parts for each class.
arXiv Detail & Related papers (2025-03-13T10:46:53Z)
- How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations [69.72654127617058]
Post-hoc importance attribution methods are a popular tool for "explaining" Deep Neural Networks (DNNs).
In this work we bring forward empirical evidence that challenges this very notion.
We discover a strong dependency on the training details of a pre-trained model's classification layer and demonstrate that they play a crucial role.
arXiv Detail & Related papers (2025-03-01T22:25:11Z)
- Enhancing Neural Network Interpretability Through Conductance-Based Information Plane Analysis [0.0]
The Information Plane is a conceptual framework used to analyze the flow of information in neural networks.
This paper introduces a new approach that uses layer conductance, a measure of sensitivity to input features, to enhance the Information Plane analysis.
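As a rough illustration of the quantity this entry builds on, the sketch below computes layer conductance with Captum's LayerConductance and reduces it to per-channel sensitivities; the Information Plane analysis itself (estimating mutual information from such values) is not shown, and the model and layer choices are assumptions.
```python
# A hedged sketch of computing layer conductance via Captum; the paper's
# Information-Plane analysis on top of these values is not reproduced.
import torch
import torchvision.models as models
from captum.attr import LayerConductance

model = models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(4, 3, 224, 224)              # stand-in for a batch of images

# conductance of one hidden layer w.r.t. each example's predicted class
lc = LayerConductance(model, model.layer3)
targets = model(x).argmax(dim=1)
attr = lc.attribute(x, target=targets)       # (4, 256, 14, 14)

# per-channel sensitivity, a plausible input to an Information-Plane plot
channel_sensitivity = attr.abs().mean(dim=(0, 2, 3))
```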
arXiv Detail & Related papers (2024-08-26T23:10:42Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify the model's weaknesses by testing it on the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Concept backpropagation: An Explainable AI approach for visualising learned concepts in neural network models [0.0]
We present an extension to the method of concept detection, named "concept backpropagation", which provides a way of analysing how the information representing a given concept is internalised in a given neural network model.
arXiv Detail & Related papers (2023-07-24T08:21:13Z)
- A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data.
The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept.
The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- Consistent Explanations by Contrastive Learning [15.80891456718324]
Post-hoc evaluation techniques, such as Grad-CAM, enable humans to inspect the spatial regions responsible for a particular network decision.
We introduce a novel training method to train the model to produce more consistent explanations.
We show that our method, Contrastive Grad-CAM Consistency (CGC), results in Grad-CAM interpretation heatmaps that are consistent with human annotations.
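For context, the sketch below is a minimal Grad-CAM of the kind CGC regularizes: gradients of the class score with respect to the last convolutional feature map are pooled into channel weights for the activations. The model, the hooked layer, and the random input are illustrative assumptions; the contrastive consistency term itself is only noted in a comment.
```python
# A minimal Grad-CAM sketch; model and layer choices are illustrative.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
feats, grads = {}, {}

def fwd_hook(module, inp, out):
    feats["a"] = out.detach()                 # activations of the hooked layer

def bwd_hook(module, grad_in, grad_out):
    grads["g"] = grad_out[0].detach()         # gradients w.r.t. those activations

layer = model.layer4[-1]
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)               # stand-in for a preprocessed image
score = model(x)[0].max()                     # top-class logit
score.backward()

weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # GAP over gradients
cam = F.relu((weights * feats["a"]).sum(dim=1))       # weighted activations
cam = F.interpolate(cam[None], size=(224, 224), mode="bilinear")[0, 0]
# CGC (per the summary above) would additionally penalize disagreement
# between such maps computed for two augmented views of the same image.
```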
arXiv Detail & Related papers (2021-10-01T16:49:16Z)
- White Box Methods for Explanations of Convolutional Neural Networks in Image Classification Tasks [3.3959642559854357]
Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance for the task of image classification.
Several approaches have been proposed to explain and understand the reasoning behind a prediction made by a network.
We focus primarily on white box methods that leverage the information of the internal architecture of a network to explain its decision.
arXiv Detail & Related papers (2021-04-06T14:40:00Z)
- Playing to distraction: towards a robust training of CNN classifiers through visual explanation techniques [1.2321022105220707]
We present a novel and robust training scheme that integrates visual explanation techniques in the learning process.
In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity.
arXiv Detail & Related papers (2020-12-28T10:24:32Z)
- Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization [87.96102461221415]
We develop an algorithm that provides per-class explainability.
In an extensive battery of experiments, we demonstrate the ability of our method to produce class-specific visualizations.
arXiv Detail & Related papers (2020-12-03T18:48:39Z)
- Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, these attribution methods produce low-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models.
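The randomized input sampling these methods build on can be sketched in a few lines. Below is a simplified, RISE-style version: random coarse masks occlude the input, and the saliency map is the score-weighted average of those masks. The paper's multi-layer map collection and layer-selection strategy are not reproduced, and all parameter choices are assumptions (true RISE also randomly shifts the upsampled grids).
```python
# A simplified RISE-style randomized-input-sampling saliency sketch.
import torch
import torch.nn.functional as F
import torchvision.models as models

def rise_saliency(model, x, target, n_masks=500, grid=7, keep_prob=0.5):
    _, _, h, w = x.shape
    # coarse random binary grids, upsampled into smooth soft masks
    grids = (torch.rand(n_masks, 1, grid, grid) < keep_prob).float()
    masks = F.interpolate(grids, size=(h, w), mode="bilinear")
    with torch.no_grad():
        scores = torch.cat([
            model(x * m).softmax(dim=1)[:, target]
            for m in masks.split(50)          # batch the masked forward passes
        ])
    # saliency = score-weighted average of masks, normalized by keep_prob
    sal = (scores.view(-1, 1, 1) * masks[:, 0]).mean(dim=0)
    return sal / (keep_prob + 1e-8)

model = models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224)               # stand-in for a real image
saliency = rise_saliency(model, x, target=0)  # (224, 224) heatmap
```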
arXiv Detail & Related papers (2020-10-01T20:27:30Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Explainable Recommender Systems via Resolving Learning Representations [57.24565012731325]
Explanations can help improve the user experience and reveal system defects.
We propose a novel explainable recommendation model through improving the transparency of the representation learning process.
arXiv Detail & Related papers (2020-08-21T05:30:48Z) - Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z) - Explainable Image Classification with Evidence Counterfactual [0.0]
We introduce SEDC as a model-agnostic instance-level explanation method for image classification.
For a given image, SEDC searches for a small set of segments that, when removed, alter the classification.
We compare SEDC(-T) with popular feature importance methods such as LRP, LIME and SHAP, and we describe how the mentioned importance ranking issues are addressed.
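A rough sketch of that search, under assumed choices (SLIC segmentation, gray replacement, a greedy score-drop heuristic); the actual SEDC procedure may differ:
```python
# A greedy segment-removal counterfactual search in the spirit of SEDC.
import numpy as np
import torch
import torchvision.models as models
from skimage.segmentation import slic

@torch.no_grad()
def sedc_explain(model, img, max_removed=10):
    # img: float32 array in [0, 1], shape (H, W, 3)
    segments = slic(img, n_segments=50, compactness=10)
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1)[None].float()
    orig_class = model(to_tensor(img)).argmax().item()
    removed, work = [], img.copy()
    for _ in range(max_removed):
        best_seg, best_score = None, np.inf
        for seg in np.unique(segments):
            if seg in removed:
                continue
            trial = work.copy()
            trial[segments == seg] = 0.5              # gray out one segment
            score = model(to_tensor(trial))[0, orig_class].item()
            if score < best_score:                    # biggest score drop wins
                best_seg, best_score = seg, score
        removed.append(best_seg)
        work[segments == best_seg] = 0.5
        if model(to_tensor(work)).argmax().item() != orig_class:
            return removed                            # counterfactual found
    return None                                       # no flip within budget

model = models.resnet18(weights="IMAGENET1K_V1").eval()
img = np.random.rand(224, 224, 3).astype(np.float32)  # stand-in image
print(sedc_explain(model, img))
```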
arXiv Detail & Related papers (2020-04-16T08:02:48Z)
- Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)