Towards Learning Spatially Discriminative Feature Representations
- URL: http://arxiv.org/abs/2109.01359v1
- Date: Fri, 3 Sep 2021 08:04:17 GMT
- Title: Towards Learning Spatially Discriminative Feature Representations
- Authors: Chaofei Wang, Jiayu Xiao, Yizeng Han, Qisen Yang, Shiji Song, Gao Huang
- Abstract summary: We propose a novel loss function, termed CAM-loss, to constrain the embedded feature maps with the class activation maps (CAMs).
CAM-loss drives the backbone to express the features of the target category and suppress the features of non-target categories or the background.
Experimental results show that CAM-loss is applicable to a variety of network structures and can be combined with mainstream regularization methods to improve the performance of image classification.
- Score: 26.554140976236052
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The backbone of a traditional CNN classifier is generally considered a feature extractor, followed by a linear layer that performs the classification. We propose a novel loss function, termed CAM-loss, to constrain the embedded feature maps with class activation maps (CAMs), which indicate the spatially discriminative regions of an image for particular categories. CAM-loss drives the backbone to express the features of the target category and suppress the features of non-target categories or the background, so as to obtain more discriminative feature representations. It can be applied to any CNN architecture with negligible additional parameters and computation. Experimental results show that CAM-loss is applicable to a variety of network structures and can be combined with mainstream regularization methods to improve image classification performance. The strong generalization ability of CAM-loss is validated on transfer learning and few-shot learning tasks. Based on CAM-loss, we also propose a novel CAAM-CAM matching knowledge distillation method, in which the CAM generated by the teacher network directly supervises the class-agnostic activation map (CAAM) generated by the student network, effectively improving the accuracy and convergence rate of the student network.
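As a rough illustration of the idea, here is a minimal PyTorch sketch of how a CAM-loss of this kind could be computed. The function name, min-max normalization, and weighting factor lambda_cam are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def cam_loss(feature_maps, fc_weight, targets, logits, lambda_cam=0.1):
    """feature_maps: (B, C, H, W) last-conv-layer activations.
    fc_weight: (num_classes, C) weights of the linear classifier.
    targets: (B,) ground-truth labels.  logits: (B, num_classes)."""
    # CAM of the target class: channel-wise weighted sum using classifier weights.
    cam = torch.einsum('bc,bchw->bhw', fc_weight[targets], feature_maps)
    # CAAM: class-agnostic activation map, a plain sum over channels.
    caam = feature_maps.sum(dim=1)

    def minmax(m):  # per-sample normalization to [0, 1]
        lo = m.flatten(1).min(dim=1).values.view(-1, 1, 1)
        hi = m.flatten(1).max(dim=1).values.view(-1, 1, 1)
        return (m - lo) / (hi - lo + 1e-8)

    # Drive the CAAM toward the target-class CAM, on top of cross-entropy.
    match = F.l1_loss(minmax(caam), minmax(cam).detach())
    return F.cross_entropy(logits, targets) + lambda_cam * match
```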
Related papers
- Spatial Action Unit Cues for Interpretable Deep Facial Expression Recognition [55.97779732051921]
State-of-the-art classifiers for facial expression recognition (FER) lack interpretability, an important feature for end-users.
A new learning strategy is proposed to explicitly incorporate AU cues into classifier training, making it possible to train deep interpretable models.
Our new strategy is generic, and can be applied to any deep CNN- or transformer-based classifier without requiring any architectural change or significant additional training time.
arXiv Detail & Related papers (2024-10-01T10:42:55Z)
- BroadCAM: Outcome-agnostic Class Activation Mapping for Small-scale Weakly Supervised Applications [69.22739434619531]
We propose an outcome-agnostic CAM approach, called BroadCAM, for small-scale weakly supervised applications.
Evaluated on VOC2012 and BCSS-WSSS for WSSS and on OpenImages30k for WSOL, BroadCAM demonstrates superior performance.
arXiv Detail & Related papers (2023-09-07T06:45:43Z)
- Feature Activation Map: Visual Explanation of Deep Learning Models for Image Classification [17.373054348176932]
In this work, a post-hoc interpretation tool named feature activation map (FAM) is proposed.
FAM can interpret deep learning models that do not use a fully connected (FC) layer as the classifier.
Experiments conducted on ten deep learning models for few-shot image classification, contrastive learning image classification and image retrieval tasks demonstrate the effectiveness of the proposed FAM algorithm.
arXiv Detail & Related papers (2023-07-11T05:33:46Z)
- Cluster-CAM: Cluster-Weighted Visual Interpretation of CNNs' Decision in Image Classification [12.971559051829658]
Cluster-CAM is an effective and efficient gradient-free CNN interpretation algorithm.
We propose an artful strategy to forge a cognition-base map and cognition-scissors from clustered feature maps.
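A loose sketch of the cluster-then-score idea, in the spirit of this paper: channel maps are grouped by k-means and each group's mean map is weighted by a gradient-free forward-pass score. The paper's actual cognition-base map and cognition-scissors construction is more elaborate than this simplification:

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

@torch.no_grad()
def cluster_cam(model, x, feature_maps, target, k=8):
    """x: (1, 3, H0, W0) input image; feature_maps: (C, H, W) its activations."""
    c, h, w = feature_maps.shape
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(
        feature_maps.reshape(c, -1).cpu().numpy())
    labels = torch.as_tensor(labels)
    cam = torch.zeros(h, w)
    for g in range(k):
        group = feature_maps[labels == g].mean(dim=0)   # one candidate map per cluster
        mask = (group - group.min()) / (group.max() - group.min() + 1e-8)
        mask = F.interpolate(mask[None, None], size=x.shape[-2:],
                             mode='bilinear', align_corners=False)
        score = model(x * mask).softmax(-1)[0, target]  # gradient-free weight
        cam = cam + score.item() * group
    return cam.clamp(min=0)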
arXiv Detail & Related papers (2023-02-03T10:38:20Z)
- Learning Visual Explanations for DCNN-Based Image Classifiers Using an Attention Mechanism [8.395400675921515]
Two new learning-based explainable AI (XAI) methods for deep convolutional neural network (DCNN) image classifiers, called L-CAM-Fm and L-CAM-Img, are proposed.
Both methods use an attention mechanism that is inserted in the original (frozen) DCNN and is trained to derive class activation maps (CAMs) from the last convolutional layer's feature maps.
Experimental evaluation on ImageNet shows that the proposed methods achieve competitive results while requiring a single forward pass at the inference stage.
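A minimal sketch of such an attention head, assuming a simple 1x1-convolution form (the class and parameter names are illustrative): it sits on the frozen backbone's last feature maps and emits one activation map per class, so one forward pass yields CAMs for all classes.

```python
import torch
import torch.nn as nn

class AttentionCAMHead(nn.Module):
    """1x1-conv attention head over a frozen backbone's last feature maps."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.att = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, feature_maps, targets=None):
        cams = torch.sigmoid(self.att(feature_maps))  # (B, num_classes, H, W)
        if targets is None:
            return cams                               # all classes at once
        return cams[torch.arange(cams.size(0)), targets]  # (B, H, W)
```

Per the paper's description, the head is trained by masking the feature maps (L-CAM-Fm) or the input image (L-CAM-Img) with the predicted CAM and requiring the frozen classifier to still recognize the target class.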
arXiv Detail & Related papers (2022-09-22T17:33:18Z)
- FD-CAM: Improving Faithfulness and Discriminability of Visual Explanation for CNNs [7.956110316017118]
Class activation maps (CAMs) have been widely studied for visually explaining the internal working mechanisms of convolutional neural networks.
We propose a novel CAM weighting scheme, named FD-CAM, to improve both the faithfulness and discriminability of the CNN visual explanation.
arXiv Detail & Related papers (2022-06-17T14:08:39Z)
- Weakly-supervised fire segmentation by visualizing intermediate CNN layers [82.75113406937194]
Fire localization in images and videos is an important step for an autonomous system to combat fire incidents.
We consider weakly supervised segmentation of fire in images, in which only image labels are used to train the network.
We show that for fire segmentation, which is a binary segmentation problem, the mean value of the features in a mid-layer of a classification CNN can perform better than the conventional Class Activation Mapping (CAM) method.
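A minimal sketch of that mid-layer-mean idea: average the channels of an intermediate feature block and threshold the result as a binary mask. The paper trains a fire/no-fire classifier; the ImageNet ResNet backbone, layer choice, and threshold below are stand-in assumptions.

```python
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(weights='IMAGENET1K_V1').eval()

@torch.no_grad()
def fire_mask(x, threshold=0.5):
    """x: (1, 3, H, W) image batch of size one."""
    # Run the stem and first three stages; layer3 stands in for the "mid-layer".
    f = model.maxpool(model.relu(model.bn1(model.conv1(x))))
    f = model.layer3(model.layer2(model.layer1(f)))
    m = f.mean(dim=1, keepdim=True)                     # channel-wise mean map
    m = (m - m.amin()) / (m.amax() - m.amin() + 1e-8)   # normalize to [0, 1]
    m = F.interpolate(m, size=x.shape[-2:], mode='bilinear', align_corners=False)
    return (m > threshold).float()                      # binary segmentation mask
```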
arXiv Detail & Related papers (2021-11-16T11:56:28Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distributions.
First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
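A sketch of a normalized (cosine) classifier of the kind typically investigated for long-tailed problems: both features and class weights are L2-normalized, so logits depend on angle rather than weight norms, which tend to grow with class frequency. The scale value is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedClassifier(nn.Module):
    def __init__(self, feat_dim, num_classes, scale=16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, feat_dim))
        nn.init.kaiming_uniform_(self.weight)
        self.scale = scale  # temperature: cosine logits are bounded in [-1, 1]

    def forward(self, features):
        return self.scale * F.linear(F.normalize(features, dim=1),
                                     F.normalize(self.weight, dim=1))
```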
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
- Knowledge Distillation By Sparse Representation Matching [107.87219371697063]
We propose Sparse Representation Matching (SRM) to transfer intermediate knowledge from one Convolutional Neural Network (CNN) to another by utilizing sparse representations.
We formulate SRM as a neural processing block, which can be efficiently optimized using gradient descent and integrated into any CNN in a plug-and-play manner.
Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.
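A very loose sketch of the general idea: encode teacher and student features against a shared learned dictionary and pull the student's sparse codes toward the teacher's. This simplification assumes matching channel counts, which the actual method does not require, and SRM's real block uses a more principled unrolled sparse-coding scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMatchKD(nn.Module):
    """Shared dictionary as 1x1 convs; student codes are matched to teacher codes."""
    def __init__(self, channels, atoms=64):
        super().__init__()
        self.encode = nn.Conv2d(channels, atoms, kernel_size=1, bias=False)
        self.decode = nn.Conv2d(atoms, channels, kernel_size=1, bias=False)

    def forward(self, student_feat, teacher_feat, l1_weight=1e-4):
        zt = F.relu(self.encode(teacher_feat))             # teacher codes
        recon = F.mse_loss(self.decode(zt), teacher_feat)  # keep codes informative
        zs = F.relu(self.encode(student_feat))             # student codes
        match = F.mse_loss(zs, zt.detach())                # match teacher's codes
        sparsity = l1_weight * zt.abs().mean()             # encourage sparse codes
        return match + recon + sparsity
```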
arXiv Detail & Related papers (2021-03-31T11:47:47Z)
- Learning CNN filters from user-drawn image markers for coconut-tree image classification [78.42152902652215]
We present a method that needs a minimal set of user-selected images to train the CNN's feature extractor.
The method learns the filters of each convolutional layer from user-drawn markers in image regions that discriminate classes.
It does not rely on optimization based on backpropagation, and we demonstrate its advantages on the binary classification of coconut-tree aerial images.
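A toy, backpropagation-free sketch of the idea: patches centered on user-marked pixels are clustered and the centroids become convolution kernels. The function name and clustering choice are assumptions; the actual method adds per-layer normalization and marker-based supervision.

```python
import numpy as np
from sklearn.cluster import KMeans

def filters_from_markers(image, marker_coords, k=8, size=3):
    """image: (H, W, C) float array; marker_coords: iterable of (y, x) pixels
    drawn by the user inside discriminative regions."""
    r = size // 2
    patches = []
    for y, x in marker_coords:
        p = image[y - r:y + r + 1, x - r:x + r + 1]
        if p.shape[:2] == (size, size):               # skip border markers
            patches.append(p.reshape(-1))
    patches = np.stack(patches)
    patches -= patches.mean(axis=1, keepdims=True)    # zero-mean each patch
    centers = KMeans(n_clusters=k, n_init=10).fit(patches).cluster_centers_
    return centers.reshape(k, size, size, -1)         # k kernels of shape (size, size, C)
```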
arXiv Detail & Related papers (2020-08-08T15:50:23Z)
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
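A minimal sketch of the principal-component projection at the heart of this approach: each spatial location's channel vector is projected onto the first principal component of the activations, with no gradients or class weights needed. The sign-fixing heuristic is an assumption of this sketch.

```python
import torch

@torch.no_grad()
def eigen_cam(feature_maps):
    """feature_maps: (C, H, W) activations of one image at a conv layer."""
    c, h, w = feature_maps.shape
    flat = feature_maps.reshape(c, -1).T          # (H*W, C): one vector per location
    flat = flat - flat.mean(dim=0, keepdim=True)
    # Project every spatial location onto the first principal component.
    _, _, vh = torch.linalg.svd(flat, full_matrices=False)
    cam = (flat @ vh[0]).reshape(h, w)
    if cam.sum() < 0:                             # resolve the sign ambiguity of SVD
        cam = -cam
    return cam.clamp(min=0)
```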
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.