Black Box Explanation by Learning Image Exemplars in the Latent Feature
Space
- URL: http://arxiv.org/abs/2002.03746v1
- Date: Mon, 27 Jan 2020 15:42:14 GMT
- Title: Black Box Explanation by Learning Image Exemplars in the Latent Feature
Space
- Authors: Riccardo Guidotti, Anna Monreale, Stan Matwin, Dino Pedreschi
- Abstract summary: We present an approach to explain the decisions of black box models for image classification.
Our method exploits the latent feature space learned through an adversarial autoencoder.
We show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
- Score: 20.16179026989117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an approach to explain the decisions of black box models for image
classification. While using the black box to label images, our explanation
method exploits the latent feature space learned through an adversarial
autoencoder. The proposed method first generates exemplar images in the latent
feature space and learns a decision tree classifier. Then, it selects and
decodes exemplars respecting local decision rules. Finally, it visualizes them
in a manner that shows to the user how the exemplars can be modified to either
stay within their class, or to become counter-factuals by "morphing" into
another class. Since we focus on black box decision systems for image
classification, the explanation obtained from the exemplars also provides a
saliency map highlighting the areas of the image that contribute to its
classification, and areas of the image that push it into another class. We
present the results of an experimental evaluation on three datasets and two
black box models. Besides providing the most useful and interpretable
explanations, we show that the proposed method outperforms existing explainers
in terms of fidelity, relevance, coherence, and stability.
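The pipeline the abstract describes (encode, generate latent exemplars, label with the black box, fit an interpretable surrogate, decode exemplars and counter-exemplars) can be sketched as follows. This is a toy stand-in, not the authors' implementation: the adversarial autoencoder is replaced by identity maps on a 2-D latent space and the black box by a simple threshold rule, so only the overall flow is illustrated.

```python
# Toy sketch of the exemplar-based explanation flow described in the
# abstract. encode/decode stand in for the adversarial autoencoder and
# black_box for the classifier being explained; all are placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def encode(img):          # stand-in for the AAE encoder
    return img

def decode(z):            # stand-in for the AAE decoder
    return z

def black_box(z):         # stand-in black-box classifier
    return int(z[0] + z[1] > 0)

# 1. Encode the instance and generate exemplars around its latent code.
z0 = encode(np.array([0.3, 0.4]))
Z = z0 + rng.normal(scale=0.5, size=(200, 2))
y = np.array([black_box(z) for z in Z])

# 2. Fit an interpretable surrogate (decision tree) in the latent space;
#    its local rules approximate the black box around z0.
tree = DecisionTreeClassifier(max_depth=3).fit(Z, y)

# 3. Split the generated points into exemplars (same predicted class as
#    z0) and counter-exemplars (pushed into another class), then decode.
label0 = black_box(z0)
exemplars = [decode(z) for z in Z[y == label0]]
counters = [decode(z) for z in Z[y != label0]]

print(len(exemplars), len(counters), tree.score(Z, y))
```

In the real method the decoded counter-exemplars are the "morphed" counterfactual images, and pixel-level differences between an instance and its exemplars yield the saliency map.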
Related papers
- Accurate Explanation Model for Image Classifiers using Class Association Embedding [5.378105759529487]
We propose a generative explanation model that combines the advantages of global and local knowledge.
Class association embedding (CAE) encodes each sample into a pair of separated class-associated and individual codes.
A building-block coherency feature extraction algorithm is proposed that efficiently separates class-associated features from individual ones.
arXiv Detail & Related papers (2024-06-12T07:41:00Z) - Leveraging Open-Vocabulary Diffusion to Camouflaged Instance
Segmentation [59.78520153338878]
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions.
We propose a method built upon a state-of-the-art diffusion model, empowered by open-vocabulary to learn multi-scale textual-visual features for camouflaged object representations.
arXiv Detail & Related papers (2023-12-29T07:59:07Z) - Identifying Interpretable Subspaces in Image Representations [54.821222487956355]
We propose a framework to explain features of image representations using Contrasting Concepts (FALCON).
For a target feature, FALCON captions its highly activating cropped images using a large captioning dataset and a pre-trained vision-language model like CLIP.
Each word among the captions is scored and ranked leading to a small number of shared, human-understandable concepts.
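The word-scoring step described above can be illustrated with a minimal sketch. The captions here are invented, and simple frequency gaps replace whatever scoring FALCON actually uses; a real pipeline would obtain captions from a captioning dataset and a vision-language model such as CLIP.

```python
# Toy sketch of caption-word scoring: words that occur much more often in
# captions of highly activating images than in reference captions surface
# as shared, human-understandable concepts. All captions are made up.
from collections import Counter

activating = ["striped cat on sofa", "striped tiger fur", "striped shirt texture"]
reference = ["blue sky over beach", "cat sleeping on sofa", "city street at night"]

def word_freq(captions):
    counts = Counter(w for c in captions for w in c.split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

act, ref = word_freq(activating), word_freq(reference)

# Score each word by the frequency gap between the two caption sets,
# then keep the highest-scoring words as candidate concepts.
scores = {w: act[w] - ref.get(w, 0.0) for w in act}
top = sorted(scores, key=scores.get, reverse=True)[:3]
print(top)
```

Here the shared concept "striped" dominates because it appears in every activating caption and no reference caption.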
arXiv Detail & Related papers (2023-07-20T00:02:24Z) - What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z) - LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of
Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z) - Reinforcement Explanation Learning [4.852320309766702]
Black-box methods to generate saliency maps are particularly interesting due to the fact that they do not utilize the internals of the model to explain the decision.
We formulate saliency map generation as a sequential search problem and leverage upon Reinforcement Learning (RL) to accumulate evidence from input images.
Experiments on three benchmark datasets demonstrate the superiority of the proposed approach in inference time over state-of-the-art methods without hurting performance.
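The sequential-search view of saliency can be sketched with a greedy stand-in: at each step, occlude the patch whose removal most reduces the classifier's score and add it to the map. The real method replaces this greedy loop with a learned RL policy; the "image" and scorer below are synthetic placeholders.

```python
# Greedy stand-in for sequential saliency search: repeatedly pick the
# pixel whose occlusion drops a toy classifier score the most. Only the
# centre 2x2 region matters to the toy scorer.
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((4, 4))
weights = np.zeros((4, 4))
weights[1:3, 1:3] = 1.0          # only the centre influences the score

def score(x):                    # toy black-box confidence
    return float((x * weights).sum())

saliency, masked = np.zeros((4, 4)), img.copy()
for _ in range(4):               # accumulate the 4 most influential pixels
    drops = {}
    for i in range(4):
        for j in range(4):
            if saliency[i, j]:
                continue
            trial = masked.copy()
            trial[i, j] = 0.0
            drops[(i, j)] = score(masked) - score(trial)
    i, j = max(drops, key=drops.get)
    saliency[i, j] = 1.0
    masked[i, j] = 0.0

print(saliency)
```

The greedy search recovers exactly the centre region the toy scorer depends on; an RL agent serves the same role while amortizing the search cost at inference time.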
arXiv Detail & Related papers (2021-11-26T10:20:01Z) - White Box Methods for Explanations of Convolutional Neural Networks in
Image Classification Tasks [3.3959642559854357]
Convolutional Neural Networks (CNNs) have demonstrated state of the art performance for the task of image classification.
Several approaches have been proposed to explain the reasoning behind a prediction made by a network.
We focus primarily on white box methods that leverage the information of the internal architecture of a network to explain its decision.
arXiv Detail & Related papers (2021-04-06T14:40:00Z) - This is not the Texture you are looking for! Introducing Novel
Counterfactual Explanations for Non-Experts using Generative Adversarial
Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.