From Heatmaps to Structural Explanations of Image Classifiers
- URL: http://arxiv.org/abs/2109.06365v1
- Date: Mon, 13 Sep 2021 23:39:57 GMT
- Title: From Heatmaps to Structural Explanations of Image Classifiers
- Authors: Li Fuxin, Zhongang Qi, Saeed Khorram, Vivswan Shitole, Prasad
Tadepalli, Minsuk Kahng, Alan Fern
- Abstract summary: The paper starts by describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network.
Realizing that an important missing piece is a reliable heatmap visualization tool, we have developed I-GOS and iGOS++.
Through the research process, we have gained many insights into building deep network explanations.
- Score: 31.44267537307587
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper summarizes our endeavors in the past few years in terms of
explaining image classifiers, with the aim of including negative results and
insights we have gained. The paper starts by describing the explainable
neural network (XNN), which attempts to extract and visualize several
high-level concepts purely from the deep network, without relying on human
linguistic concepts. This helps users understand network classifications that
are less intuitive and substantially improves user performance on a difficult
fine-grained classification task of discriminating among different species of
seagulls.
Realizing that an important missing piece is a reliable heatmap visualization
tool, we have developed I-GOS and iGOS++ utilizing integrated gradients to
avoid local optima in heatmap generation, which improved the performance across
all resolutions. During the development of those visualizations, we realized
that for a significant number of images, the classifier has multiple different
paths to reach a confident prediction. This has led to our recent development
of structured attention graphs (SAGs), an approach that utilizes beam search to
locate multiple coarse heatmaps for a single image, and compactly visualizes a
set of heatmaps by capturing how different combinations of image regions impact
the confidence of a classifier.
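As background for the integrated-gradients step mentioned above, here is a minimal sketch of vanilla integrated gradients attribution (not the authors' I-GOS mask optimization, which uses integrated gradients as a descent direction). The linear "classifier" and its analytic gradient are toy stand-ins for a real network:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=50):
    """Approximate integrated gradients of a scalar function at x
    relative to a baseline, via a midpoint Riemann sum along the
    straight-line path from baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        # gradient evaluated at an interpolated point on the path
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy "classifier": a fixed linear score with a known gradient.
w = np.array([0.5, -1.0, 2.0])
f = lambda x: float(w @ x)
grad_f = lambda x: w

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
attr = integrated_gradients(grad_f, x, baseline)
```

By the completeness axiom, the attributions sum to `f(x) - f(baseline)`, which is one reason path-integrated gradients give more stable directions than a single local gradient.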
Through the research process, we have gained many insights into building
deep network explanations, the existence and frequency of multiple
explanations, and various tricks of the trade that make explanations work. In
this paper, we attempt to share those insights and opinions with the readers
with the hope that some of them will be informative for future researchers on
explainable deep learning.
Related papers
- Efficient Visualization of Neural Networks with Generative Models and Adversarial Perturbations [0.0]
This paper presents a novel approach for deep visualization via a generative network, offering an improvement over existing methods.
Our model simplifies the architecture by reducing the number of networks used, requiring only a generator and a discriminator.
Our model requires less prior training knowledge and uses a non-adversarial training process, where the discriminator acts as a guide.
arXiv Detail & Related papers (2024-09-20T14:59:25Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Masked Contrastive Graph Representation Learning for Age Estimation [44.96502862249276]
This paper leverages the strength of graph representation learning in handling redundant image information.
We propose a novel Masked Contrastive Graph Representation Learning (MCGRL) method for age estimation.
Experimental results on real-world face image datasets demonstrate the superiority of our proposed method over other state-of-the-art age estimation approaches.
arXiv Detail & Related papers (2023-06-16T15:53:21Z)
- Multi-modal reward for visual relationships-based image captioning [4.354364351426983]
This paper proposes a deep neural network architecture for image captioning based on fusing the visual relationships information extracted from an image's scene graph with the spatial feature maps of the image.
A multi-modal reward function is then introduced for deep reinforcement learning of the proposed network using a combination of language and vision similarities in a common embedding space.
arXiv Detail & Related papers (2023-03-19T20:52:44Z)
- Shap-CAM: Visual Explanations for Convolutional Neural Networks based on Shapley Value [86.69600830581912]
We develop a novel visual explanation method called Shap-CAM based on class activation mapping.
We demonstrate that Shap-CAM achieves better visual performance and fairness for interpreting the decision making process.
arXiv Detail & Related papers (2022-08-07T00:59:23Z)
- SGUIE-Net: Semantic Attention Guided Underwater Image Enhancement with Multi-Scale Perception [18.87163028415309]
We propose a novel underwater image enhancement network, called SGUIE-Net.
We introduce semantic information as high-level guidance across different images that share common semantic regions.
This strategy helps to achieve robust and visually pleasant enhancements to different semantic objects.
arXiv Detail & Related papers (2022-01-08T14:03:24Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
Interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Multi-Modal Retrieval using Graph Neural Networks [1.8911962184174562]
We learn a joint vision and concept embedding in the same high-dimensional space.
We model the visual and concept relationships as a graph structure.
We also introduce a novel inference time control, based on selective neighborhood connectivity.
arXiv Detail & Related papers (2020-10-04T19:34:20Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper graph models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
- Image Segmentation Using Deep Learning: A Survey [58.37211170954998]
Image segmentation is a key topic in image processing and computer vision.
There has been a substantial body of work aimed at developing image segmentation approaches using deep learning models.
arXiv Detail & Related papers (2020-01-15T21:37:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.