LFI-CAM: Learning Feature Importance for Better Visual Explanation
- URL: http://arxiv.org/abs/2105.00937v1
- Date: Mon, 3 May 2021 15:12:21 GMT
- Title: LFI-CAM: Learning Feature Importance for Better Visual Explanation
- Authors: Kwang Hee Lee, Chaewon Park, Junghyun Oh, Nojun Kwak
- Abstract summary: Class Activation Mapping (CAM) is a powerful technique used to understand the decision making of a Convolutional Neural Network (CNN) in computer vision.
We propose a novel architecture, LFI-CAM, which is trainable for image classification and visual explanation in an end-to-end manner.
- Score: 31.743421292094308
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Class Activation Mapping (CAM) is a powerful technique used to understand the
decision making of a Convolutional Neural Network (CNN) in computer vision.
Recently, there have been attempts not only to generate better visual
explanations, but also to improve classification performance using visual
explanations. However, the previous works still have their own drawbacks. In
this paper, we propose a novel architecture, LFI-CAM, which is trainable for
image classification and visual explanation in an end-to-end manner. LFI-CAM
generates an attention map for visual explanation during forward propagation
and, at the same time, leverages the attention map to improve the classification
performance through the attention mechanism. Our Feature Importance Network
(FIN) focuses on learning the feature importance instead of directly learning
the attention map to obtain a more reliable and consistent attention map. We
confirmed that the LFI-CAM model is optimized not only by learning the feature
importance but also by enhancing the backbone feature representation to focus
more on important features of the input image. Experimental results show that
LFI-CAM outperforms the baseline models' accuracy on the classification tasks
and significantly improves on previous works in terms of attention map quality
and stability across different hyper-parameters.
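The abstract does not spell out the FIN architecture, so the following is only a minimal PyTorch sketch of the general idea: a small network (the names `FeatureImportanceNetwork` and `LFICAMHead` are hypothetical) predicts per-channel importance weights from the backbone feature maps, the attention map is the importance-weighted sum of those maps, and the same map is fed back to re-weight the features before classification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureImportanceNetwork(nn.Module):
    """Hypothetical sketch of a FIN: predicts one importance score per
    feature channel from the globally pooled backbone features."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, channels),
            nn.Softmax(dim=1),  # importance weights sum to 1
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) -> importance: (B, C)
        pooled = F.adaptive_avg_pool2d(feats, 1).flatten(1)
        return self.fc(pooled)

class LFICAMHead(nn.Module):
    """Builds an attention map from learned feature importance, then uses
    that map to modulate the features before classification."""
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.fin = FeatureImportanceNetwork(channels)
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, feats: torch.Tensor):
        importance = self.fin(feats)                        # (B, C)
        # Attention map = importance-weighted sum of the feature maps.
        attn = torch.einsum("bc,bchw->bhw", importance, feats)
        attn = torch.sigmoid(attn).unsqueeze(1)             # (B, 1, H, W)
        # Attention mechanism: emphasize spatially important locations.
        refined = feats * (1.0 + attn)
        logits = self.classifier(F.adaptive_avg_pool2d(refined, 1).flatten(1))
        # The attention map doubles as the visual explanation.
        return logits, attn.squeeze(1)
```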
Related papers
- On the Surprising Effectiveness of Attention Transfer for Vision Transformers [118.83572030360843]
Conventional wisdom suggests that pre-training Vision Transformers (ViT) improves downstream performance by learning useful representations.
We investigate this question and find that the features and representations learned during pre-training are not essential.
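The summary does not describe the transfer mechanism itself; as a loose, hedged sketch, one way to transfer only the attention patterns from a pre-trained teacher to a student is to distill the teacher's attention maps (the function name and loss choice below are assumptions, not the paper's exact formulation).

```python
import torch
import torch.nn.functional as F

def attention_transfer_loss(student_attn: torch.Tensor,
                            teacher_attn: torch.Tensor) -> torch.Tensor:
    """Match the student's self-attention maps to a pre-trained teacher's.

    student_attn, teacher_attn: (B, heads, L, L) post-softmax attention maps.
    """
    return F.kl_div(
        torch.log(student_attn + 1e-9),  # student as log-probabilities
        teacher_attn,                    # teacher as probabilities
        reduction="batchmean",
    )
```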
arXiv Detail & Related papers (2024-11-14T18:59:40Z)
- BroadCAM: Outcome-agnostic Class Activation Mapping for Small-scale Weakly Supervised Applications [69.22739434619531]
We propose an outcome-agnostic CAM approach, called BroadCAM, for small-scale weakly supervised applications.
Evaluated on VOC2012 and BCSS-WSSS for WSSS and on OpenImages30k for WSOL, BroadCAM demonstrates superior performance.
arXiv Detail & Related papers (2023-09-07T06:45:43Z)
- Learning Visual Explanations for DCNN-Based Image Classifiers Using an Attention Mechanism [8.395400675921515]
Two new learning-based explainable AI (XAI) methods for deep convolutional neural network (DCNN) image classifiers, called L-CAM-Fm and L-CAM-Img, are proposed.
Both methods use an attention mechanism that is inserted in the original (frozen) DCNN and is trained to derive class activation maps (CAMs) from the last convolutional layer's feature maps.
Experimental evaluation on ImageNet shows that the proposed methods achieve competitive results while requiring a single forward pass at the inference stage.
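A minimal sketch of the mechanism described above, assuming a frozen backbone whose last convolutional feature maps are turned into per-class activation maps by a small trainable head (the class and method names are hypothetical):

```python
import torch
import torch.nn as nn

class ClassActivationAttention(nn.Module):
    """Hypothetical sketch: a 1x1-conv attention head on top of a frozen
    backbone that maps the last-layer feature maps to one CAM per class."""
    def __init__(self, backbone: nn.Module, channels: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False               # the original DCNN stays frozen
        self.cam_head = nn.Conv2d(channels, num_classes, kernel_size=1)

    @torch.no_grad()
    def explain(self, x: torch.Tensor, class_idx: int) -> torch.Tensor:
        feats = self.backbone(x)                  # (B, C, H, W) last conv feature maps
        cams = torch.sigmoid(self.cam_head(feats))  # (B, num_classes, H, W)
        return cams[:, class_idx]                 # explanation from a single forward pass
```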
arXiv Detail & Related papers (2022-09-22T17:33:18Z)
- TDAN: Top-Down Attention Networks for Enhanced Feature Selectivity in CNNs [18.24779045808196]
We propose a lightweight top-down (TD) attention module that iteratively generates a "visual searchlight" to perform top-down channel and spatial modulation of its inputs.
Our models are more robust to changes in input resolution during inference and learn to "shift attention" by localizing individual objects or features at each computation step without any explicit supervision.
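As a rough illustration of iterative top-down channel and spatial modulation (the actual "searchlight" computation in TDAN is more elaborate; the module name and gating choices below are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownModulation(nn.Module):
    """Hypothetical sketch of iterative top-down channel + spatial modulation."""
    def __init__(self, channels: int, steps: int = 2):
        super().__init__()
        self.steps = steps
        self.channel_gate = nn.Linear(channels, channels)
        self.spatial_gate = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        x = feats
        for _ in range(self.steps):  # iterative refinement of the modulation
            c = torch.sigmoid(self.channel_gate(F.adaptive_avg_pool2d(x, 1).flatten(1)))
            s = torch.sigmoid(self.spatial_gate(x))   # (B, 1, H, W) "searchlight"
            x = feats * c[:, :, None, None] * s       # channel and spatial modulation
        return x
```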
arXiv Detail & Related papers (2021-11-26T12:35:17Z)
- Learning to ignore: rethinking attention in CNNs [87.01305532842878]
We propose to reformulate the attention mechanism in CNNs to learn to ignore instead of learning to attend.
Specifically, we propose to explicitly learn irrelevant information in the scene and suppress it in the produced representation.
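One way to read this, sketched very loosely below (the module name and formulation are assumptions): predict an "ignore" map for what is irrelevant and keep only its complement.

```python
import torch
import torch.nn as nn

class IgnoreMask(nn.Module):
    """Hypothetical sketch: learn what is irrelevant and suppress it."""
    def __init__(self, channels: int):
        super().__init__()
        self.ignore = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        ignore_map = torch.sigmoid(self.ignore(feats))   # 1 = irrelevant
        return feats * (1.0 - ignore_map)                # keep what is not ignored
```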
arXiv Detail & Related papers (2021-11-10T13:47:37Z)
- Towards Learning Spatially Discriminative Feature Representations [26.554140976236052]
We propose a novel loss function, termed CAM-loss, to constrain the embedded feature maps with the class activation maps (CAMs).
CAM-loss drives the backbone to express the features of target category and suppress the features of non-target categories or background.
Experimental results show that CAM-loss is applicable to a variety of network structures and can be combined with mainstream regularization methods to improve the performance of image classification.
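The exact CAM-loss formulation is not given in this summary; a hedged sketch of the general idea, pulling a class-agnostic activation map toward the target-class CAM, might look like this (the function name and normalization are assumptions):

```python
import torch
import torch.nn.functional as F

def cam_alignment_loss(feats: torch.Tensor, fc_weight: torch.Tensor,
                       target: torch.Tensor) -> torch.Tensor:
    """Auxiliary loss aligning feature maps with the target-class CAM.

    feats:     (B, C, H, W) last-layer feature maps
    fc_weight: (num_classes, C) classifier weights
    target:    (B,) ground-truth labels
    """
    # Target-class CAM: classifier-weight-weighted sum of the feature maps.
    cam = torch.einsum("bchw,bc->bhw", feats, fc_weight[target])
    # Class-agnostic activation map: plain channel average.
    caam = feats.mean(dim=1)

    def norm(m: torch.Tensor) -> torch.Tensor:
        # Min-max normalize each map to [0, 1] before comparing.
        m = m - m.flatten(1).min(dim=1).values[:, None, None]
        return m / (m.flatten(1).max(dim=1).values[:, None, None] + 1e-6)

    return F.l1_loss(norm(caam), norm(cam))
```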
arXiv Detail & Related papers (2021-09-03T08:04:17Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distributions.
First, we present a Class Activation Map (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
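The summary does not detail the normalized classifier; a common form is a cosine (weight- and feature-normalized) classifier, sketched below with the class name and scale value as assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedClassifier(nn.Module):
    """Hypothetical sketch of a normalized (cosine) classifier, often used so
    that head-class weight norms do not dominate tail classes."""
    def __init__(self, in_features: int, num_classes: int, scale: float = 16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalized features and weights.
        return self.scale * F.linear(F.normalize(x, dim=1),
                                     F.normalize(self.weight, dim=1))
```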
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
- Keep CALM and Improve Visual Feature Attribution [42.784665606132]
The class activation mapping, or CAM, has been the cornerstone of feature attribution methods for multiple vision tasks.
We improve CAM by explicitly incorporating a latent variable encoding the location of the cue for recognition in the formulation.
The resulting model, class activation latent mapping, or CALM, is trained with the expectation-maximization algorithm.
arXiv Detail & Related papers (2021-06-15T03:33:25Z)
- SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
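The DAM algorithm itself is not described here; as a loose sketch of a differentiable attention mask (the class name and masking scheme are assumptions), one can learn soft gates over attention positions and fold them into the pre-softmax logits:

```python
import torch
import torch.nn as nn

class DifferentiableAttentionMask(nn.Module):
    """Hypothetical sketch: a learnable, soft mask over attention positions
    that can be annealed toward a sparse pattern."""
    def __init__(self, seq_len: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(seq_len, seq_len))

    def forward(self, attn_scores: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
        # attn_scores: (B, heads, L, L) pre-softmax attention logits
        mask = torch.sigmoid(self.logits / temperature)   # soft 0/1 gate per position pair
        return attn_scores + torch.log(mask + 1e-9)       # gated-off pairs get large negative logits
```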
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
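A minimal sketch of that idea, projecting the convolutional activations onto their first principal component (the centering and ReLU below are choices of this sketch, not necessarily the paper's exact recipe):

```python
import torch

def eigen_cam(feats: torch.Tensor) -> torch.Tensor:
    """Project the last conv layer's feature maps onto their first
    principal component.

    feats: (B, C, H, W) activations from a convolutional layer
    """
    b, c, h, w = feats.shape
    x = feats.flatten(2).transpose(1, 2)            # (B, H*W, C)
    x = x - x.mean(dim=1, keepdim=True)             # center per image
    _, _, vh = torch.linalg.svd(x, full_matrices=False)
    pc1 = vh[:, 0, :]                               # (B, C) first principal direction
    cam = torch.einsum("bnc,bc->bn", x, pc1).reshape(b, h, w)
    return torch.relu(cam)                          # keep positive evidence
```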
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
- Deep Reinforced Attention Learning for Quality-Aware Visual Recognition [73.15276998621582]
We build upon the weakly-supervised generation mechanism of intermediate attention maps in any convolutional neural network.
We introduce a meta critic network to evaluate the quality of attention maps in the main network.
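How the critic is trained and how its score is used as a reward are not given in this summary; a bare-bones sketch of a critic that scores an attention map (all names hypothetical):

```python
import torch
import torch.nn as nn

class AttentionMapCritic(nn.Module):
    """Hypothetical sketch of a meta critic that scores the quality of an
    attention map produced by the main network."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, 1),   # scalar quality score / reward
        )

    def forward(self, feats: torch.Tensor, attn_map: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W); attn_map: (B, 1, H, W)
        return self.net(torch.cat([feats, attn_map], dim=1))
```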
arXiv Detail & Related papers (2020-07-13T02:44:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.