Generalizing Adversarial Explanations with Grad-CAM
- URL: http://arxiv.org/abs/2204.05427v1
- Date: Mon, 11 Apr 2022 22:09:21 GMT
- Title: Generalizing Adversarial Explanations with Grad-CAM
- Authors: Tanmay Chakraborty, Utkarsh Trehan, Khawla Mallat, and Jean-Luc
Dugelay
- Abstract summary: We present a novel method that extends Grad-CAM from example-based explanations to a method for explaining global model behaviour.
For our experiment, we study adversarial attacks on deep models such as VGG16, ResNet50, and ResNet101, and wide models such as InceptionNetv3 and XceptionNet.
The proposed method can be used to understand adversarial attacks and explain the behaviour of black box CNN models for image analysis.
- Score: 7.165984630575092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient-weighted Class Activation Mapping (Grad-CAM) is an example-based
explanation method that provides a gradient activation heat map as an
explanation for Convolutional Neural Network (CNN) models. The drawback of this
method is that it cannot be used to generalize CNN behaviour. In this paper, we
present a novel method that extends Grad-CAM from example-based explanations to
a method for explaining global model behaviour. This is achieved by introducing
two new metrics, (i) Mean Observed Dissimilarity (MOD) and (ii) Variation in
Dissimilarity (VID), for model generalization. These metrics are computed by
comparing a Normalized Inverted Structural Similarity Index (NISSIM) metric of
the Grad-CAM generated heatmap for samples from the original test set and
samples from the adversarial test set. For our experiment, we study adversarial
attacks on deep models such as VGG16, ResNet50, and ResNet101, and wide models
such as InceptionNetv3 and XceptionNet using Fast Gradient Sign Method (FGSM).
We then compute the metrics MOD and VID for the automatic face recognition
(AFR) use case with the VGGFace2 dataset. We observe a consistent shift in the
region highlighted in the Grad-CAM heatmap, reflecting its contribution to the
decision making, across all models under adversarial attacks. The proposed
method can be used to understand adversarial attacks and explain the behaviour
of black box CNN models for image analysis.
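A minimal sketch of how the two metrics could be computed from paired Grad-CAM heatmaps of clean and adversarial samples, assuming NISSIM is taken as (1 - SSIM) / 2 and MOD / VID as the mean and variance of NISSIM over the test set; these normalization and aggregation choices, and the stand-in heatmaps, are illustrative assumptions rather than the paper's exact definitions.

```python
# Sketch of the NISSIM / MOD / VID computation described in the abstract.
# Assumptions (not taken from the paper): NISSIM = (1 - SSIM) / 2, and MOD / VID
# are the mean and variance of NISSIM over the paired clean/adversarial test set.
import numpy as np
from skimage.metrics import structural_similarity as ssim


def nissim(heatmap_clean: np.ndarray, heatmap_adv: np.ndarray) -> float:
    """Normalized Inverted SSIM between two Grad-CAM heatmaps, in [0, 1]."""
    s = ssim(heatmap_clean, heatmap_adv, data_range=1.0)  # SSIM lies in [-1, 1]
    return (1.0 - s) / 2.0                                # invert and normalize


def mod_vid(clean_heatmaps, adv_heatmaps):
    """Mean Observed Dissimilarity and Variation in Dissimilarity over a test set."""
    d = np.array([nissim(c, a) for c, a in zip(clean_heatmaps, adv_heatmaps)])
    return d.mean(), d.var()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for Grad-CAM heatmaps of clean samples and their FGSM-perturbed versions.
    clean = [rng.random((224, 224)) for _ in range(8)]
    adv = [np.clip(h + 0.1 * rng.standard_normal(h.shape), 0, 1) for h in clean]
    mod, vid = mod_vid(clean, adv)
    print(f"MOD={mod:.4f}  VID={vid:.6f}")
```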
Related papers
- CAM-Based Methods Can See through Walls [6.356330972370584]
We show that most CAM-based interpretability methods can incorrectly attribute an importance score to parts of the image that the model cannot see.
We train a VGG-like model constrained to not use the lower part of the image and observe positive scores in the unseen part of the image.
This behavior is evaluated quantitatively on two new datasets.
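A minimal sketch of this kind of sanity check, under the assumption that the "unseen" constraint is enforced by zeroing the lower half of every input; the masking scheme, shapes, and stand-in CAM are illustrative, not the authors' setup.

```python
# Sketch (not the paper's code): mask the lower half of every input so the model
# provably cannot use it, then measure how much saliency mass a CAM method still
# assigns to that region.
import numpy as np


def mask_lower_half(image: np.ndarray) -> np.ndarray:
    """Zero out the lower half of an HxWxC image before it reaches the model."""
    out = image.copy()
    out[out.shape[0] // 2:, :, :] = 0.0
    return out


def unseen_attribution_fraction(cam: np.ndarray) -> float:
    """Fraction of (non-negative) CAM mass falling in the masked lower half."""
    cam = np.maximum(cam, 0.0)
    return float(cam[cam.shape[0] // 2:, :].sum() / (cam.sum() + 1e-12))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((224, 224, 3))
    masked = mask_lower_half(img)   # model would only ever see such masked inputs
    cam = rng.random((14, 14))      # stand-in CAM computed for a masked input
    print(f"attribution in unseen region: {unseen_attribution_fraction(cam):.2%}")
```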
arXiv Detail & Related papers (2024-04-02T13:57:30Z)
- COSE: A Consistency-Sensitivity Metric for Saliency on Image Classification [21.3855970055692]
We present a set of metrics that utilize vision priors to assess the performance of saliency methods on image classification tasks.
We show that although saliency methods are thought to be architecture-independent, most methods explain transformer-based models better than convolutional models.
arXiv Detail & Related papers (2023-09-20T01:06:44Z)
- An Explainable Model-Agnostic Algorithm for CNN-based Biometrics Verification [55.28171619580959]
This paper describes an adaptation of the Local Interpretable Model-Agnostic Explanations (LIME) AI method to operate under a biometric verification setting.
arXiv Detail & Related papers (2023-07-25T11:51:14Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
- ContraFeat: Contrasting Deep Features for Semantic Discovery [102.4163768995288]
StyleGAN has shown strong potential for disentangled semantic control.
Existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results.
We propose a model that automates this process and achieves state-of-the-art semantic discovery performance.
arXiv Detail & Related papers (2022-12-14T15:22:13Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- Integrated Grad-CAM: Sensitivity-Aware Visual Explanation of Deep Convolutional Networks via Integrated Gradient-Based Scoring [26.434705114982584]
Grad-CAM is a popular solution that provides such a visualization by combining the activation maps obtained from the model.
We introduce a solution to tackle this problem by computing the path integral of the gradient-based terms in Grad-CAM.
We conduct a thorough analysis to demonstrate the improvement achieved by our method in measuring the importance of the extracted representations for the CNN's predictions.
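A rough sketch of the path-integral idea: approximate the integral with a Riemann sum of Grad-CAM's gradient terms over inputs interpolated between a baseline and the image. The torchvision ResNet-50, target layer, step count, and zero baseline are assumptions for illustration, not the authors' configuration.

```python
# Riemann-sum approximation of integrated Grad-CAM-style gradient terms.
import torch
from torchvision.models import resnet50

# Pretrained weights would be loaded in practice; random weights keep the sketch self-contained.
model = resnet50(weights=None).eval()
target_layer = model.layer4  # last convolutional block (illustrative choice)

acts, grads = [], []
target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))


def integrated_gradcam(x, class_idx, steps=16, baseline=None):
    baseline = torch.zeros_like(x) if baseline is None else baseline
    cam_sum = 0.0
    for alpha in torch.linspace(1.0 / steps, 1.0, steps):
        acts.clear(); grads.clear()
        xi = baseline + alpha * (x - baseline)    # point on the path
        score = model(xi)[0, class_idx]
        model.zero_grad()
        score.backward()
        a, g = acts[-1][0], grads[-1][0]          # activations / gradients, [C, H, W]
        w = g.mean(dim=(1, 2))                    # Grad-CAM style channel weights
        cam_sum = cam_sum + torch.relu((w[:, None, None] * a).sum(0))
    return (cam_sum / steps).detach()


if __name__ == "__main__":
    x = torch.randn(1, 3, 224, 224)               # stand-in for a preprocessed image
    cam = integrated_gradcam(x, class_idx=207)
    print(cam.shape)                              # torch.Size([7, 7])
```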
arXiv Detail & Related papers (2021-02-15T19:21:46Z)
- Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks [89.56292219019163]
Explanation methods facilitate the development of models that learn meaningful concepts and avoid exploiting spurious correlations.
We illustrate a previously unrecognized limitation of the popular neural network explanation method Grad-CAM.
We propose HiResCAM, a class-specific explanation method that is guaranteed to highlight only the locations the model used to make each prediction.
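A short sketch contrasting the two attribution rules on a single convolutional layer's activations and gradients; the final ReLU (kept for visualization, as in Grad-CAM) and the array shapes are illustrative assumptions.

```python
# Grad-CAM vs. HiResCAM, given activations A and the gradient of the class
# score with respect to A (both [C, H, W]); a sketch of the definitions only.
import numpy as np


def grad_cam_map(acts: np.ndarray, grads: np.ndarray) -> np.ndarray:
    """Grad-CAM: spatially averaged gradients become per-channel weights."""
    weights = grads.mean(axis=(1, 2))                           # [C]
    return np.maximum((weights[:, None, None] * acts).sum(0), 0.0)


def hirescam_map(acts: np.ndarray, grads: np.ndarray) -> np.ndarray:
    """HiResCAM: element-wise gradient * activation, summed over channels."""
    return np.maximum((grads * acts).sum(0), 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, G = rng.random((512, 14, 14)), rng.standard_normal((512, 14, 14))
    print(grad_cam_map(A, G).shape, hirescam_map(A, G).shape)   # (14, 14) (14, 14)
```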
arXiv Detail & Related papers (2020-11-17T19:26:14Z)
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
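A brief sketch of the Eigen-CAM idea: project a convolutional layer's activations onto their first principal component via SVD. The mean-centering step and the shapes are illustrative assumptions, not necessarily the paper's exact procedure.

```python
# Class-agnostic saliency from the first principal component of the activations.
import numpy as np


def eigen_cam(acts: np.ndarray) -> np.ndarray:
    """acts: [C, H, W] activations of a conv layer; returns an H x W saliency map."""
    c, h, w = acts.shape
    flat = acts.reshape(c, h * w).T          # [H*W, C]: one row per spatial position
    flat = flat - flat.mean(axis=0)          # centering (a common PCA convention)
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    cam = (flat @ vt[0]).reshape(h, w)       # project onto first principal component
    cam -= cam.min()
    return cam / (cam.max() + 1e-12)         # normalize to [0, 1] for display


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(eigen_cam(rng.random((512, 14, 14))).shape)   # (14, 14)
```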
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)