G-CAME: Gaussian-Class Activation Mapping Explainer for Object Detectors
- URL: http://arxiv.org/abs/2306.03400v1
- Date: Tue, 6 Jun 2023 04:30:18 GMT
- Title: G-CAME: Gaussian-Class Activation Mapping Explainer for Object Detectors
- Authors: Quoc Khanh Nguyen, Truong Thanh Hung Nguyen, Vo Thanh Khang Nguyen,
Van Binh Truong, Quoc Hung Cao
- Abstract summary: G-CAME generates a saliency map as the explanation for object detection models.
We evaluate our method with YOLOX on the MS-COCO 2017 dataset and provide guidance for applying G-CAME to the two-stage Faster-RCNN model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks for object detection in images are now
prevalent. However, due to the complexity of these networks, users find it
hard to understand why models detect the objects they do. We propose the
Gaussian Class Activation Mapping Explainer (G-CAME), which generates a
saliency map as the explanation for object detection models. G-CAME can be
considered a CAM-based method that uses the activation maps of selected
layers combined with a Gaussian kernel to highlight the important regions in
the image for the predicted box. Compared with other region-based methods,
G-CAME overcomes their time constraints, as it takes only a very short time to
explain an object. We also evaluate our method qualitatively and
quantitatively with YOLOX on the MS-COCO 2017 dataset and provide guidance for
applying G-CAME to the two-stage Faster-RCNN model.
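
The abstract describes the core recipe: a CAM-style importance map from a selected layer, restricted by a Gaussian kernel centred on the predicted box. Below is a minimal PyTorch sketch of that idea as we read it; the function names, the Grad-CAM-style channel weighting, and the fixed kernel width `sigma` are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(h, w, center, sigma, device):
    """2-D Gaussian mask centred on the predicted box."""
    ys = torch.arange(h, device=device).float().view(-1, 1)
    xs = torch.arange(w, device=device).float().view(1, -1)
    cy, cx = center
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def gcame_saliency(activations, gradients, box, image_size, sigma=8.0):
    """Grad-CAM-style map masked by a Gaussian centred on one detection.

    activations, gradients: (C, H, W) tensors captured at a selected layer
    box: (x1, y1, x2, y2) predicted box in image coordinates
    image_size: (H_img, W_img)
    """
    weights = gradients.mean(dim=(1, 2))                 # channel importance
    cam = F.relu((weights[:, None, None] * activations).sum(dim=0))
    cam = cam / (cam.max() + 1e-8)                       # normalise to [0, 1]
    cam = F.interpolate(cam[None, None], size=image_size,
                        mode="bilinear", align_corners=False)[0, 0]
    x1, y1, x2, y2 = box
    center = ((y1 + y2) / 2, (x1 + x2) / 2)              # (row, col) of box centre
    mask = gaussian_kernel(*image_size, center, sigma, cam.device)
    return cam * mask                                    # suppress far-away regions
```

In the method itself the kernel width would plausibly be tied to the predicted box size rather than fixed; it is fixed here only to keep the sketch short.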
Related papers
- Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
- Leveraging Activations for Superpixel Explanations [2.8792218859042453]
Saliency methods have become standard in the explanation toolkit of deep neural networks.
In this paper, we aim to avoid relying on segmenters by extracting a segmentation from the activations of a deep neural network image classifier.
Our method, Neuro-Activated Superpixels (NAS), can isolate the regions of interest in the input relevant to the model's prediction (a hedged sketch of this activation-clustering idea appears after this list).
arXiv Detail & Related papers (2024-06-07T13:37:45Z)
- Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer [3.2766072866432867]
We introduce the Gaussian Class Activation Mapping Explainer (G-CAME).
G-CAME significantly reduces explanation time to 0.5 seconds without compromising the quality.
Our evaluation of G-CAME, using Faster-RCNN and YOLOX on the MS-COCO 2017 dataset, demonstrates its ability to offer highly plausible and faithful explanations.
arXiv Detail & Related papers (2024-04-20T16:11:47Z)
- Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain.
Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage.
Compared with existing models, our proposed method achieves competitive performance on three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z)
- ODSmoothGrad: Generating Saliency Maps for Object Detectors [0.0]
We present ODSmoothGrad, a tool for generating saliency maps for the classification and the bounding box parameters in object detectors.
Given the noisiness of saliency maps, we also apply the SmoothGrad algorithm to visually enhance the pixels of interest (a sketch of the SmoothGrad step appears after this list).
arXiv Detail & Related papers (2023-04-15T18:21:56Z)
- Learning Visual Explanations for DCNN-Based Image Classifiers Using an Attention Mechanism [8.395400675921515]
Two new learning-based explainable AI (XAI) methods for deep convolutional neural network (DCNN) image classifiers, called L-CAM-Fm and L-CAM-Img, are proposed.
Both methods use an attention mechanism that is inserted into the original (frozen) DCNN and is trained to derive class activation maps (CAMs) from the last convolutional layer's feature maps (a minimal sketch of such an attention head appears after this list).
Experimental evaluation on ImageNet shows that the proposed methods achieve competitive results while requiring a single forward pass at the inference stage.
arXiv Detail & Related papers (2022-09-22T17:33:18Z)
- Poly-CAM: High resolution class activation map for convolutional neural networks [88.29660600055715]
Saliency maps derived from convolutional neural networks generally fail to accurately localize the image features that justify the network's prediction.
This is because those maps are either low-resolution, as for CAM [Zhou et al., 2016], or smooth, as for perturbation-based methods [Zeiler and Fergus, 2014], or correspond to a large number of widespread peaky spots.
In contrast, our work proposes to combine the information from earlier network layers with that from later layers to produce a high-resolution Class Activation Map (a sketch of this multi-layer combination appears after this list).
arXiv Detail & Related papers (2022-04-28T09:06:19Z)
- Learning Hierarchical Graph Representation for Image Manipulation Detection [50.04902159383709]
The objective of image manipulation detection is to identify and locate the manipulated regions in the images.
Recent approaches mostly adopt sophisticated Convolutional Neural Networks (CNNs) to capture the tampering artifacts left in the images.
We propose a hierarchical Graph Convolutional Network (HGCN-Net), which consists of two parallel branches.
arXiv Detail & Related papers (2022-01-15T01:54:25Z)
- TSG: Target-Selective Gradient Backprop for Probing CNN Visual Saliency [72.9106103283475]
We study visual saliency, a.k.a. visual explanation, to interpret convolutional neural networks.
We propose a novel visual saliency framework, termed Target-Selective Gradient (TSG) backprop.
The proposed TSG consists of two components, namely, TSG-Conv and TSG-FC, which rectify the gradients for convolutional layers and fully-connected layers, respectively.
arXiv Detail & Related papers (2021-10-11T12:00:20Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- Learning Gaussian Maps for Dense Object Detection [1.8275108630751844]
We review common and highly accurate object detection methods for scenes where numerous similar-looking objects are placed in close proximity to each other.
We show that multi-task learning of Gaussian maps, along with classification and bounding box regression, gives a significant boost in accuracy over the baseline (a sketch of such a Gaussian target map appears after this list).
Our method also achieves state-of-the-art accuracy on the SKU-110K dataset.
arXiv Detail & Related papers (2020-04-24T17:01:25Z)
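
For the Neuro-Activated Superpixels entry above, here is a hedged sketch of one way to extract a segmentation from classifier activations: upsample the feature maps and cluster per-pixel activation vectors. The k-means choice and the cluster count are our assumptions, not necessarily the paper's procedure.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def neuro_activated_superpixels(features, image_size, k=32):
    """Cluster per-pixel activation vectors into k superpixel-like regions.

    features: (C, H, W) activations from a classifier layer.
    Returns an (H_img, W_img) integer label map.
    """
    # Bring the coarse feature maps up to image resolution.
    feats = F.interpolate(features[None], size=image_size,
                          mode="bilinear", align_corners=False)[0]
    c, h, w = feats.shape
    flat = feats.permute(1, 2, 0).reshape(-1, c).cpu().numpy()
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(flat)
    return torch.from_numpy(labels).reshape(h, w)
```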
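For the ODSmoothGrad entry, the SmoothGrad step itself is standard: average input gradients over several noisy copies of the image. The sketch below assumes a user-supplied `score_fn` that maps an image to a scalar detection score (e.g., the class score of one predicted box); that interface is our simplification of the tool.

```python
import torch

def smoothgrad(score_fn, image, n=25, noise_std=0.1):
    """Average input gradients over n noisy copies of the image.

    score_fn: maps a (C, H, W) image tensor to a scalar detection score
    image: (C, H, W) tensor
    """
    grads = torch.zeros_like(image)
    for _ in range(n):
        noisy = (image + noise_std * torch.randn_like(image)).requires_grad_(True)
        score = score_fn(noisy)
        score.backward()              # populate noisy.grad
        grads += noisy.grad
    return grads / n                  # smoothed saliency map
```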
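For the L-CAM entry, a speculative sketch of an attention head trained on top of a frozen DCNN to emit one activation map per class; the 1x1-convolution structure and the sigmoid are our guesses at a minimal such module, not the papers' architecture.

```python
import torch
import torch.nn as nn

class ClassAttentionCAM(nn.Module):
    """Maps (B, C, H, W) frozen-backbone features to (B, num_classes, H, W) CAMs."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        # One 1x1 convolution: a learned spatial map per class; only this
        # module is trained while the original DCNN stays frozen.
        self.attn = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features):
        return torch.sigmoid(self.attn(features))  # per-class maps in [0, 1]
```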
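For the Poly-CAM entry, one plausible reading of combining earlier and later layers: upsample a coarse, late-layer CAM step by step and refine it multiplicatively with finer maps from earlier layers. The multiplicative rule is an assumption for illustration, not the paper's exact operator.

```python
import torch
import torch.nn.functional as F

def multilayer_cam(cams):
    """Combine CAMs ordered from deepest (coarsest) to shallowest (finest).

    cams: list of 2-D tensors, one per layer; returns a map at the finest
    resolution in the list.
    """
    out = cams[0]
    for finer in cams[1:]:
        # Upsample the running map to the finer layer's resolution.
        out = F.interpolate(out[None, None], size=finer.shape,
                            mode="bilinear", align_corners=False)[0, 0]
        out = out * finer                   # keep detail where both layers agree
        out = out / (out.max() + 1e-8)      # renormalise to [0, 1]
    return out
```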
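Finally, for the dense-detection entry, a sketch of the auxiliary Gaussian target map that would be regressed alongside classification and box regression: one Gaussian bump per ground-truth box. The sigma heuristic and the max-combination of overlapping bumps are assumptions for illustration.

```python
import torch

def gaussian_target_map(boxes, h, w, sigma_scale=0.15):
    """Auxiliary heatmap target: one Gaussian bump per ground-truth box.

    boxes: (N, 4) tensor of (x1, y1, x2, y2) in pixel coordinates.
    Returns an (h, w) heatmap in [0, 1].
    """
    ys = torch.arange(h).float().view(-1, 1)
    xs = torch.arange(w).float().view(1, -1)
    heatmap = torch.zeros(h, w)
    for x1, y1, x2, y2 in boxes.tolist():
        cy, cx = (y1 + y2) / 2, (x1 + x2) / 2
        sigma = sigma_scale * max(x2 - x1, y2 - y1)  # bump width tracks box size
        bump = torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
        heatmap = torch.maximum(heatmap, bump)       # overlaps: keep the max
    return heatmap
```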