Decom-CAM: Tell Me What You See, In Details! Feature-Level Interpretation via Decomposition Class Activation Map
- URL: http://arxiv.org/abs/2306.04644v2
- Date: Wed, 29 May 2024 08:53:56 GMT
- Title: Decom-CAM: Tell Me What You See, In Details! Feature-Level Interpretation via Decomposition Class Activation Map
- Authors: Yuguang Yang, Runtang Guo, Sheng Wu, Yimi Wang, Juan Zhang, Xuan Gong, Baochang Zhang
- Abstract summary: Class Activation Map (CAM) is widely used to interpret deep model predictions by highlighting object location.
This paper proposes a new two-stage interpretability method called the Decomposition Class Activation Map (Decom-CAM).
Our experiments demonstrate that the proposed Decom-CAM outperforms current state-of-the-art methods significantly.
- Score: 23.71680014689873
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Interpretation of deep learning remains a very challenging problem. Although the Class Activation Map (CAM) is widely used to interpret deep model predictions by highlighting object location, it fails to provide insight into the salient features the model uses to make decisions. Furthermore, existing evaluation protocols often overlook the correlation between interpretability performance and the model's decision quality, which is a more fundamental issue. This paper proposes a new two-stage interpretability method called the Decomposition Class Activation Map (Decom-CAM), which offers a feature-level interpretation of the model's prediction. Decom-CAM decomposes intermediate activation maps into orthogonal features using singular value decomposition and generates saliency maps by integrating them. The orthogonality of the features enables CAM to capture local features that can be used to pinpoint semantic components such as eyes, noses, and faces in the input image, making it more useful for deep model interpretation. To ensure a comprehensive comparison, we introduce a new evaluation protocol that divides the dataset into subsets based on classification accuracy and evaluates interpretability performance on each subset separately. Our experiments demonstrate that the proposed Decom-CAM significantly outperforms current state-of-the-art methods by generating more precise saliency maps across all levels of classification accuracy. Combined with our feature-level interpretability approach, this paper could pave the way for a new direction in understanding the decision-making process of deep neural networks.
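To make the pipeline concrete, here is a minimal NumPy sketch of the two stages described above: decompose an intermediate activation tensor into orthogonal spatial components via SVD, then integrate the resulting sub-saliency maps. The shapes, the use of singular values as integration weights, and the normalization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def decompose_activations(acts: np.ndarray, k: int = 5):
    """Decompose a CxHxW activation tensor into k orthogonal
    spatial components via SVD (illustrative sketch)."""
    c, h, w = acts.shape
    A = acts.reshape(c, h * w)  # rows: channels, columns: positions
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    # Each right singular vector is an orthogonal spatial pattern.
    components = Vt[:k].reshape(k, h, w)
    return components, S[:k]

def integrate_saliency(components: np.ndarray, weights: np.ndarray):
    """Integrate sub-saliency maps into one map, weighting each
    orthogonal component by its singular value (an assumption)."""
    maps = np.maximum(components, 0.0)           # keep positive evidence
    saliency = np.tensordot(weights, maps, axes=1)
    saliency -= saliency.min()
    return saliency / (saliency.max() + 1e-8)    # normalize to [0, 1]

# Toy usage: a fake 64-channel 14x14 activation block.
acts = np.random.rand(64, 14, 14).astype(np.float32)
comps, sv = decompose_activations(acts, k=5)
cam = integrate_saliency(comps, sv)
print(cam.shape)  # (14, 14)
```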
Related papers
- Spatial Action Unit Cues for Interpretable Deep Facial Expression Recognition [55.97779732051921]
State-of-the-art classifiers for facial expression recognition (FER) lack interpretability, an important feature for end-users.
A new learning strategy is proposed to explicitly incorporate AU cues into classifier training, enabling the training of deep interpretable models.
Our new strategy is generic, and can be applied to any deep CNN- or transformer-based classifier without requiring any architectural change or significant additional training time.
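As an illustration of how such spatial cues could enter training, the sketch below adds a hypothetical alignment term between the classifier's attention map and an AU heatmap to the classification loss; the loss form, the weight `lam`, and all shapes are assumptions, not the paper's actual strategy.

```python
import numpy as np

def au_alignment_loss(attn: np.ndarray, au_map: np.ndarray) -> float:
    """Cosine-style mismatch between a model attention map and an
    AU heatmap (illustrative assumption, not the paper's loss)."""
    a = attn.ravel() / (np.linalg.norm(attn) + 1e-8)
    b = au_map.ravel() / (np.linalg.norm(au_map) + 1e-8)
    return 1.0 - float(a @ b)

def total_loss(cls_loss: float, attn, au_map, lam: float = 0.1) -> float:
    # Classification loss plus a weighted spatial-alignment term.
    return cls_loss + lam * au_alignment_loss(attn, au_map)

attn, au = np.random.rand(14, 14), np.random.rand(14, 14)
print(total_loss(0.7, attn, au))
```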
arXiv Detail & Related papers (2024-10-01T10:42:55Z)
- DecomCAM: Advancing Beyond Saliency Maps through Decomposition and Integration [25.299607743268993]
Class Activation Map (CAM) methods highlight the regions underlying a model's decision, but they often produce unclear saliency maps and offer limited detailed interpretability.
We propose DecomCAM, a novel decomposition-and-integration method that distills shared patterns from channel activation maps.
Experiments reveal that DecomCAM not only excels in localization accuracy but also strikes a good balance between interpretability and computational efficiency.
arXiv Detail & Related papers (2024-05-29T08:40:11Z)
- LICO: Explainable Models with Language-Image Consistency [39.869639626266554]
This paper develops a Language-Image COnsistency model for explainable image classification, termed LICO.
We first establish a coarse global manifold structure alignment by minimizing the distance between the distributions of image and language features.
We then obtain fine-grained saliency maps by applying optimal transport (OT) theory to align local feature maps with class-specific prompts.
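The OT assignment step can be approximated with entropic regularization via Sinkhorn iterations. Below is a minimal sketch under that assumption; the cosine cost matrix, uniform marginals, and all dimensions are illustrative and not LICO's exact solver.

```python
import numpy as np

def sinkhorn(cost: np.ndarray, eps: float = 0.1, iters: int = 100):
    """Entropy-regularized OT between uniform marginals
    (a standard Sinkhorn sketch, not LICO's exact solver)."""
    n, m = cost.shape
    K = np.exp(-cost / eps)                 # Gibbs kernel
    v = np.ones(m) / m
    r, c = np.ones(n) / n, np.ones(m) / m   # uniform marginals
    for _ in range(iters):
        u = r / (K @ v + 1e-12)
        v = c / (K.T @ u + 1e-12)
    return np.diag(u) @ K @ np.diag(v)      # transport plan

# Toy usage: assign 49 local feature vectors to 10 class prompts.
feats = np.random.rand(49, 32)    # local image features (assumed shape)
prompts = np.random.rand(10, 32)  # class-specific prompt embeddings
fn = np.linalg.norm(feats, axis=1, keepdims=True)    # (49, 1)
pn = np.linalg.norm(prompts, axis=1)                 # (10,)
cost = 1.0 - (feats @ prompts.T) / (fn * pn + 1e-8)  # cosine distance
plan = sinkhorn(cost)
print(plan.shape)  # (49, 10); rows give prompt assignment per location
```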
arXiv Detail & Related papers (2023-10-15T12:44:33Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
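A minimal sketch of PCA-based localization follows, assuming the common recipe of projecting each spatial feature vector onto the first principal component and thresholding; the threshold and shapes are illustrative.

```python
import numpy as np

def pca_localize(feats: np.ndarray) -> np.ndarray:
    """Project HxWxD features onto their first principal component
    to get a coarse foreground mask (illustrative sketch)."""
    h, w, d = feats.shape
    X = feats.reshape(-1, d)
    X = X - X.mean(axis=0)                     # center per dimension
    # First right singular vector = first principal component.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    proj = (X @ Vt[0]).reshape(h, w)
    return proj > proj.mean()                  # simple threshold -> mask

feats = np.random.rand(14, 14, 128)
mask = pca_localize(feats)
print(mask.sum(), "foreground locations")
```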
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- SSA: Semantic Structure Aware Inference for Weakly Pixel-Wise Dense Predictions without Cost [36.27226683586425]
Semantic structure aware inference (SSA) is proposed to exploit the semantic structure information hidden in different stages of a CNN-based network to generate high-quality CAMs at inference time.
The method introduces no parameters and requires no training, so it can be applied to a wide range of weakly-supervised pixel-wise dense prediction tasks.
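One plausible, training-free reading of this idea is to refine a coarse CAM with a self-affinity matrix computed from intermediate features. The sketch below illustrates that reading only; SSA's actual procedure may differ.

```python
import numpy as np

def affinity_refine(cam: np.ndarray, feats: np.ndarray) -> np.ndarray:
    """Refine a coarse HxW CAM with a feature self-affinity matrix,
    using no learned parameters (illustrative sketch of the idea)."""
    h, w = cam.shape
    X = feats.reshape(h * w, -1)
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    aff = np.maximum(X @ X.T, 0.0)              # nonneg cosine affinity
    aff = aff / aff.sum(axis=1, keepdims=True)  # row-normalize
    refined = (aff @ cam.ravel()).reshape(h, w)
    refined -= refined.min()
    return refined / (refined.max() + 1e-8)

cam = np.random.rand(14, 14)
feats = np.random.rand(14, 14, 64)
print(affinity_refine(cam, feats).shape)  # (14, 14)
```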
arXiv Detail & Related papers (2021-11-05T11:07:21Z)
- Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis [54.94682858474711]
Class Activation Mapping (CAM) approaches provide an effective visualization by taking weighted averages of the activation maps.
We propose a novel set of metrics to quantify explanation maps, which prove more effective and simplify comparisons between approaches.
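For reference, the weighted-average construction that such metrics evaluate looks roughly like the following; whether the channel weights come from classifier weights (CAM) or pooled gradients (Grad-CAM) is left open here.

```python
import numpy as np

def weighted_cam(acts: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Generic CAM: ReLU of a weighted average over channel activation
    maps. `weights` may be classifier weights (CAM) or pooled
    gradients (Grad-CAM) -- both are assumptions here."""
    cam = np.tensordot(weights, acts, axes=1)  # (H, W)
    cam = np.maximum(cam, 0.0)
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)

acts = np.random.rand(64, 14, 14)   # C x H x W activations
w = np.random.rand(64)              # one weight per channel
print(weighted_cam(acts, w).shape)  # (14, 14)
```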
arXiv Detail & Related papers (2021-04-20T21:34:24Z)
- Learning structure-aware semantic segmentation with image-level supervision [36.40302533324508]
We argue that the structural information lost in CAM limits its application to downstream semantic segmentation.
We introduce an auxiliary semantic boundary detection module that penalizes deteriorated predictions.
Experimental results on the PASCAL-VOC dataset illustrate the effectiveness of the proposed solution.
arXiv Detail & Related papers (2021-04-15T03:33:20Z)
- Mixup-CAM: Weakly-supervised Semantic Segmentation via Uncertainty Regularization [73.03956876752868]
We propose a principled, end-to-end trainable framework that allows the network to pay attention to other parts of the object.
Specifically, we introduce the mixup data augmentation scheme into the classification network and design two uncertainty regularization terms to better interact with the mixup strategy.
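The mixup ingredient itself is standard and easy to sketch; the paper-specific uncertainty regularization terms are omitted below.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha: float = 0.4):
    """Standard mixup: convex combination of two samples and their
    one-hot labels (the paper's uncertainty terms are omitted)."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x1, x2 = np.random.rand(3, 32, 32), np.random.rand(3, 32, 32)
y1, y2 = np.eye(10)[3], np.eye(10)[7]   # one-hot labels
xm, ym = mixup(x1, y1, x2, y2)
print(xm.shape, ym)                      # mixed image and soft label
```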
arXiv Detail & Related papers (2020-08-03T21:19:08Z)
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
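Eigen-CAM reduces to a few lines of linear algebra: take the first principal component of the channel-by-position activation matrix, with no gradients or class weights required. A minimal sketch (shapes assumed):

```python
import numpy as np

def eigen_cam(acts: np.ndarray) -> np.ndarray:
    """Eigen-CAM sketch: project CxHxW activations onto the first
    principal component of the channel-by-position matrix."""
    c, h, w = acts.shape
    A = acts.reshape(c, h * w)
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    cam = Vt[0].reshape(h, w)       # first right singular vector
    cam = np.abs(cam)               # sign of an eigenvector is arbitrary
    return cam / (cam.max() + 1e-8)

acts = np.random.rand(512, 14, 14)  # e.g., a late conv layer's output
print(eigen_cam(acts).shape)        # (14, 14)
```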
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
- Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
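Decorrelation objectives of this kind typically penalize off-diagonal entries of the feature covariance matrix. The sketch below shows that generic penalty, not PFDL's partial-decorrelation formulation.

```python
import numpy as np

def decorrelation_penalty(feats: np.ndarray) -> float:
    """Generic decorrelation loss: squared off-diagonal entries of the
    feature covariance matrix (not PFDL's exact formulation)."""
    X = feats - feats.mean(axis=0)          # center batch features
    cov = (X.T @ X) / max(len(X) - 1, 1)    # D x D covariance
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum())

batch = np.random.rand(128, 64)             # N samples x D features
print(decorrelation_penalty(batch))
```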
arXiv Detail & Related papers (2020-07-30T05:48:48Z)