Uncertainty based Class Activation Maps for Visual Question Answering
- URL: http://arxiv.org/abs/2002.10309v1
- Date: Thu, 23 Jan 2020 19:54:19 GMT
- Title: Uncertainty based Class Activation Maps for Visual Question Answering
- Authors: Badri N. Patro, Mayank Lunayach and Vinay P. Namboodiri
- Abstract summary: We propose a method that obtains gradient-based certainty estimates that also provide visual attention maps.
We incorporate modern probabilistic deep learning methods and further improve them by using the gradients for these estimates.
The proposed technique can be thought of as a recipe for obtaining improved certainty estimates and explanations for deep learning models.
- Score: 30.859101872119517
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Understanding and explaining deep learning models is an imperative task.
Towards this, we propose a method that obtains gradient-based certainty
estimates that also provide visual attention maps. In particular, we address
the visual question answering task. We incorporate modern probabilistic deep
learning methods that we further improve by using the gradients for these
estimates. These have two-fold benefits: a) improved certainty estimates that
correlate better with misclassified samples, and b)
improved attention maps that provide state-of-the-art results in terms of
correlation with human attention regions. The improved attention maps result in
consistent improvement for various methods for visual question answering.
Therefore, the proposed technique can be thought of as a recipe for obtaining
improved certainty estimates and explanations for deep learning models. We
provide a detailed empirical analysis for the visual question answering task on
all standard benchmarks and a comparison with state-of-the-art methods.
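The abstract does not spell out the algorithm, but the general recipe it describes, backpropagating a predictive-uncertainty signal to convolutional features and pooling the gradients Grad-CAM-style, can be sketched as follows. This is a minimal illustration under that reading, not the authors' exact pipeline; the entropy-based uncertainty proxy and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def uncertainty_cam(logits, feature_maps):
    """Grad-CAM-style map driven by predictive uncertainty.

    logits:       (1, num_answers) pre-softmax scores from a VQA head.
    feature_maps: (1, C, H, W) conv activations captured with a forward
                  hook and still attached to the autograd graph.
    Returns an (H, W) attention map, min-max normalised to [0, 1].
    """
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum()   # uncertainty proxy
    grads, = torch.autograd.grad(entropy, feature_maps, retain_graph=True)
    weights = grads.mean(dim=(2, 3), keepdim=True)        # GAP over space
    cam = F.relu((weights * feature_maps).sum(dim=1)).squeeze(0)
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-12)
```

Regions whose features most increase predictive entropy light up in the map, which is one plausible way a certainty estimate and an attention map can come from the same gradient computation.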
Related papers
- A Learning Paradigm for Interpretable Gradients [9.074325843851726]
We present a novel training approach to improve the quality of gradients for interpretability.
We find that the resulting gradient is qualitatively less noisy and quantitatively improves the interpretability properties of different networks.
arXiv Detail & Related papers (2024-04-23T13:32:29Z)
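The summary leaves this paper's training objective unspecified; one standard way to "improve the quality of gradients" is a double-backpropagation penalty on the input gradient, sketched below as an assumption rather than the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def interpretable_gradient_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a penalty on the input-gradient norm.

    A generic 'double backpropagation' regulariser: smoother input
    gradients tend to yield cleaner saliency maps. `lam` trades off
    accuracy against gradient smoothness and is an assumed knob.
    """
    x = x.requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # create_graph=True so the penalty itself is differentiable
    grad_x, = torch.autograd.grad(ce, x, create_graph=True)
    return ce + lam * grad_x.pow(2).sum(dim=(1, 2, 3)).mean()
```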
- Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
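The three evaluation schemes are not detailed in the summary; a widely used faithfulness protocol of this family is the deletion test: zero out the most-attributed pixels and watch the class probability fall. A hedged sketch:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def deletion_curve(model, image, attribution, target, steps=20):
    """Deletion-style faithfulness check for an attribution map.

    image: (1, C, H, W); attribution: (H, W), same spatial size.
    Pixels are zeroed in decreasing order of attribution; a faithful
    map should make the target-class probability drop quickly, i.e.
    a smaller area under the returned curve is better.
    """
    order = attribution.flatten().argsort(descending=True)
    mask = torch.ones_like(attribution).flatten()
    curve = []
    chunk = max(1, order.numel() // steps)
    for i in range(0, order.numel(), chunk):
        mask[order[i:i + chunk]] = 0.0        # delete next pixel batch
        masked = image * mask.view(1, 1, *attribution.shape)
        prob = F.softmax(model(masked), dim=-1)[0, target]
        curve.append(prob.item())
    return curve
```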
- ATCON: Attention Consistency for Vision Models [0.8312466807725921]
We propose an unsupervised fine-tuning method that improves the consistency of attention maps.
We show results on Grad-CAM and Integrated Gradients in an ablation study.
Those improved attention maps may help clinicians better understand vision model predictions.
arXiv Detail & Related papers (2022-10-18T09:30:20Z)
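The exact ATCON objective is not given above; a minimal sketch of attention-map consistency under a spatially aligned augmentation, with `cam_fn` and `augment` as hypothetical helpers:

```python
import torch.nn.functional as F

def attention_consistency_loss(cam_fn, model, x, augment):
    """Penalise disagreement between attention maps of two views.

    cam_fn(model, x) -> (B, H, W) attention maps (e.g. Grad-CAM built
    with create_graph=True so the result stays differentiable).
    augment must keep spatial alignment (e.g. colour jitter), so the
    two maps are directly comparable. Both helpers are hypothetical.
    """
    cam_a = cam_fn(model, x)
    cam_b = cam_fn(model, augment(x))
    return F.mse_loss(cam_a, cam_b)
```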
- Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z)
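The smoothing step is only mentioned, not specified; a plain Gaussian blur with min-max renormalisation is one simple instantiation (the kernel choice and `sigma` are assumptions):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_attribution(attr: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Gaussian smoothing of a 2-D attribution map.

    One simple post-processing choice; the paper's exact kernel and
    bandwidth are not given in the summary above.
    """
    sm = gaussian_filter(attr, sigma=sigma)
    return (sm - sm.min()) / (sm.max() - sm.min() + 1e-12)
```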
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
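A minimal sketch of the distributional node embedding described above, using the standard reparameterisation trick; the `backbone` GNN and layer sizes are placeholders, not the paper's architecture:

```python
import torch
import torch.nn as nn

class GaussianNodeEncoder(nn.Module):
    """Embed each node as a Gaussian rather than a point vector.

    The encoder produces a mean and log-variance per node; samples
    are drawn with the reparameterisation trick so contrastive
    losses stay differentiable.
    """
    def __init__(self, backbone: nn.Module, hidden: int, dim: int):
        super().__init__()
        self.backbone = backbone          # e.g. a GNN: features -> hidden
        self.mu = nn.Linear(hidden, dim)
        self.logvar = nn.Linear(hidden, dim)

    def forward(self, node_feats):
        h = self.backbone(node_feats)
        mu, logvar = self.mu(h), self.logvar(h)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)  # reparameterised sample
        return z, mu, logvar
```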
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
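CAMERAS's exact accumulation scheme is not reproduced here; the sketch below only captures the multi-scale idea, computing input-times-gradient saliency at several resolutions and fusing at full size, under assumed shapes:

```python
import torch
import torch.nn.functional as F

def multiscale_saliency(model, image, target, scales=(1.0, 1.5, 2.0)):
    """Accumulate backprop saliency over multiple input resolutions.

    image: (1, C, H, W). Computes |input x gradient| at each scale and
    averages the maps at the original resolution. A rough sketch of
    multi-scale accumulation, not the exact CAMERAS algorithm.
    """
    _, _, h, w = image.shape
    acc = torch.zeros(1, 1, h, w, device=image.device)
    for s in scales:
        x = F.interpolate(image, scale_factor=s, mode="bilinear",
                          align_corners=False).detach().requires_grad_(True)
        score = model(x)[0, target]
        grad, = torch.autograd.grad(score, x)
        sal = (x * grad).abs().sum(dim=1, keepdim=True)
        acc += F.interpolate(sal, size=(h, w), mode="bilinear",
                             align_corners=False)
    return (acc / len(scales)).squeeze()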
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
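The summary only hints at how model fitness couples to regularization; as one toy reading, a smoothness regulariser can be scaled by the (detached) data residual on the valid sparse-depth points. Every name below is illustrative, not the paper's formulation:

```python
import torch

def adaptive_loss(pred, sparse_depth, valid_mask, smoothness, alpha=1.0):
    """Weight the regulariser by how well the model fits the data.

    Toy sketch: the data term is the mean absolute residual on valid
    sparse points; the smoothness weight shrinks as the misfit grows,
    so regularisation never overrides a poorly fit data term.
    """
    data = ((pred - sparse_depth).abs() * valid_mask).sum() / valid_mask.sum()
    weight = alpha / (1.0 + data.detach())   # higher misfit -> less smoothing
    return data + weight * smoothness
```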
- Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis [54.94682858474711]
Class Activation Mapping (CAM) approaches provide an effective visualization by taking weighted averages of the activation maps.
We propose a novel set of metrics to quantify explanation maps, which show better effectiveness and simplify comparisons between approaches.
arXiv Detail & Related papers (2021-04-20T21:34:24Z)
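For reference, the weighted average the summary alludes to is the classic CAM construction: the final conv activations weighted by the classifier row for the chosen class. A sketch assuming a GAP-plus-linear head:

```python
import torch
import torch.nn.functional as F

def class_activation_map(feature_maps, fc_weights, cls):
    """Classic CAM: weighted average of the last conv activations.

    feature_maps: (1, C, H, W) from the final conv layer.
    fc_weights:   (num_classes, C) weights of the linear classifier
                  that follows global average pooling.
    """
    w = fc_weights[cls].view(1, -1, 1, 1)         # (1, C, 1, 1)
    cam = F.relu((w * feature_maps).sum(dim=1))   # (1, H, W)
    cam = cam - cam.min()
    return (cam / (cam.max() + 1e-12)).squeeze(0)
```

Grad-CAM generalises this by replacing the classifier row with spatially pooled gradients, which is the family of maps the main paper above builds on.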
- Assessing the Reliability of Visual Explanations of Deep Models with Adversarial Perturbations [15.067369314723958]
We propose an objective measure to evaluate the reliability of explanations of deep models.
Our approach is based on changes in the network's outcome resulting from the perturbation of input images in an adversarial way.
We also propose a straightforward application of our approach to clean relevance maps, creating more interpretable maps without any loss of essential explanatory content.
arXiv Detail & Related papers (2020-04-22T19:57:34Z)
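The summary describes this measure only at a high level; a rough sketch is to nudge the input with one FGSM step and compare relevance maps before and after, with `explain_fn` as a hypothetical attribution routine:

```python
import torch
import torch.nn.functional as F

def explanation_stability(model, explain_fn, x, y, eps=2 / 255):
    """Compare explanations before and after an adversarial nudge.

    explain_fn(model, x, y) -> (H, W) relevance map (hypothetical).
    Uses a single FGSM step; a large map change for a small input
    change suggests an unreliable explanation. A rough sketch only.
    """
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    x_adv = (x + eps * grad.sign()).detach()      # one FGSM step
    m_clean = explain_fn(model, x.detach(), y)
    m_adv = explain_fn(model, x_adv, y)
    return F.mse_loss(m_adv, m_clean).item()
```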