Consistent Explanations by Contrastive Learning
- URL: http://arxiv.org/abs/2110.00527v1
- Date: Fri, 1 Oct 2021 16:49:16 GMT
- Title: Consistent Explanations by Contrastive Learning
- Authors: Vipin Pillai, Soroush Abbasi Koohpayegani, Ashley Ouligian, Dennis
Fong, Hamed Pirsiavash
- Abstract summary: Post-hoc evaluation techniques, such as Grad-CAM, enable humans to inspect the spatial regions responsible for a particular network decision.
We introduce a novel training method to train the model to produce more consistent explanations.
We show that our method, Contrastive Grad-CAM Consistency (CGC), results in Grad-CAM interpretation heatmaps that are consistent with human annotations.
- Score: 15.80891456718324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding and explaining the decisions of neural networks are critical to
building trust, rather than relying on them as black box algorithms. Post-hoc
evaluation techniques, such as Grad-CAM, enable humans to inspect the spatial
regions responsible for a particular network decision. However, it has been
shown that such explanations are not always consistent with human priors, such
as consistency across image transformations. Given an interpretation algorithm,
e.g., Grad-CAM, we introduce a novel training method to train the model to
produce more consistent explanations. Since obtaining the ground truth for a
desired model interpretation is not a well-defined task, we adopt ideas from
contrastive self-supervised learning and apply them to the interpretations of
the model rather than its embeddings. Explicitly training the network to
produce more reasonable interpretations and subsequently evaluating those
interpretations will enhance our ability to trust the network. We show that our
method, Contrastive Grad-CAM Consistency (CGC), results in Grad-CAM
interpretation heatmaps that are consistent with human annotations while still
achieving comparable classification accuracy. Moreover, since our method can be
seen as a form of regularizer, on limited-data fine-grained classification
settings, our method outperforms the baseline classification accuracy on
Caltech-Birds, Stanford Cars, VGG Flowers, and FGVC-Aircraft datasets. In
addition, because our method does not rely on annotations, it allows for the
incorporation of unlabeled data into training, which enables better
generalization of the model. Our code is publicly available.
Related papers
- TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction [14.396966854171273]
We consider the problem of single-source domain generalization.
Existing methods typically rely on extensive augmentations to synthetically cover diverse domains during training.
We propose an approach that compels models to leverage locally interpretable concepts during prediction.
arXiv Detail & Related papers (2024-11-25T08:46:37Z)
- InfoDisent: Explainability of Image Classification Models by Information Disentanglement [9.380255522558294]
We introduce InfoDisent, a hybrid model that combines the advantages of both approaches.
By utilizing an information bottleneck, InfoDisent disentangles the information in the final layer of a pre-trained deep network.
We validate the effectiveness of InfoDisent on benchmark datasets such as ImageNet, CUB-200-2011, Stanford Cars, and Stanford Dogs.
arXiv Detail & Related papers (2024-09-16T14:39:15Z)
- Match me if you can: Semi-Supervised Semantic Correspondence Learning with Unpaired Images [76.47980643420375]
This paper builds on the hypothesis that learning semantic correspondences is inherently data-hungry.
We demonstrate that a simple machine annotator can reliably enrich paired keypoints via machine supervision.
Our models surpass current state-of-the-art models on semantic correspondence learning benchmarks like SPair-71k, PF-PASCAL, and PF-WILLOW.
arXiv Detail & Related papers (2023-11-30T13:22:15Z)
- Globally Interpretable Graph Learning via Distribution Matching [12.885580925389352]
We aim to answer an important question that is not yet well studied: how to provide a global interpretation for the graph learning procedure?
We formulate this problem as globally interpretable graph learning, which aims to distill high-level, human-intelligible patterns that dominate the learning procedure.
We propose a novel model fidelity metric, tailored for evaluating the fidelity of the resulting model trained on interpretations.
arXiv Detail & Related papers (2023-06-18T00:50:36Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- AutoProtoNet: Interpretability for Prototypical Networks [0.0]
We introduce AutoProtoNet, which builds interpretability into Prototypical Networks.
We demonstrate how points in this embedding space can be visualized and used to understand class representations.
We also devise a prototype refinement method, which allows a human to debug inadequate classification parameters.
arXiv Detail & Related papers (2022-04-02T19:42:03Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour for the learned representations, as well as the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs that identifies unnecessary edges.
We show that a large proportion of edges can be dropped without deteriorating the performance of the model; a toy sketch of the edge-masking idea appears after this list.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
- Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
arXiv Detail & Related papers (2020-03-13T22:22:15Z)
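As a side note on the "Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking" entry above, the general recipe behind such post-hoc edge interpretation can be illustrated with a toy sketch: freeze a trained GNN, place a learnable soft mask over the existing edges, and optimize the mask so that predictions stay close to the original ones while as many edges as possible are switched off. Everything below (the dense two-layer GCN, the KL fidelity term, the sparsity weight) is an illustrative assumption and not that paper's actual method or code.

```python
# Toy sketch of post-hoc differentiable edge masking for a GNN (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGCN(nn.Module):
    """Minimal two-layer GCN operating on a dense adjacency matrix."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x, adj):
        x = F.relu(adj @ self.lin1(x))      # aggregate neighbours, then transform
        return adj @ self.lin2(x)

n_nodes, in_dim = 12, 16
x = torch.randn(n_nodes, in_dim)
adj = (torch.rand(n_nodes, n_nodes) < 0.3).float()   # random toy graph

model = DenseGCN(in_dim, 32, 4)
for p in model.parameters():                # the trained model stays frozen
    p.requires_grad_(False)
with torch.no_grad():
    original_logits = model(x, adj)

# Learnable logits for every potential edge; sigmoid gives a soft "keep" probability.
edge_logits = nn.Parameter(torch.zeros_like(adj))
opt = torch.optim.Adam([edge_logits], lr=0.1)
sparsity_weight = 0.05                      # assumed trade-off coefficient

for step in range(200):
    mask = torch.sigmoid(edge_logits) * adj # only existing edges can be kept
    masked_logits = model(x, mask)
    # Stay faithful to the original predictions while dropping as many edges as possible.
    fidelity = F.kl_div(F.log_softmax(masked_logits, dim=-1),
                        F.softmax(original_logits, dim=-1),
                        reduction="batchmean")
    sparsity = mask.sum() / adj.sum()
    loss = fidelity + sparsity_weight * sparsity
    opt.zero_grad()
    loss.backward()
    opt.step()

kept = ((torch.sigmoid(edge_logits) > 0.5) & (adj > 0)).float().sum()
print(f"kept {int(kept)} of {int(adj.sum())} edges")
```

The edges whose keep-probability stays high after optimization serve as the explanation of the frozen model's predictions.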
This list is automatically generated from the titles and abstracts of the papers on this site.