GETAM: Gradient-weighted Element-wise Transformer Attention Map for
Weakly-supervised Semantic segmentation
- URL: http://arxiv.org/abs/2112.02841v1
- Date: Mon, 6 Dec 2021 08:02:32 GMT
- Title: GETAM: Gradient-weighted Element-wise Transformer Attention Map for
Weakly-supervised Semantic segmentation
- Authors: Weixuan Sun, Jing Zhang, Zheyuan Liu, Yiran Zhong, Nick Barnes
- Abstract summary: A Class Activation Map (CAM) is usually generated to provide pixel-level pseudo labels.
Transformer-based methods are highly effective at exploring global context with long-range dependency modeling.
GETAM shows fine-scale activation for all feature map elements, revealing different parts of the object across transformer layers.
- Score: 29.184608129848105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly Supervised Semantic Segmentation (WSSS) is challenging, particularly
when image-level labels are used to supervise pixel-level prediction. To bridge
this gap, a Class Activation Map (CAM) is usually generated to provide pixel-level
pseudo labels. CAMs in Convolutional Neural Networks suffer from partial
activation, i.e., only the most discriminative regions are activated. Transformer-based
methods, on the other hand, are highly effective at exploring global
context with long-range dependency modeling, potentially alleviating the
"partial activation" issue. In this paper, we propose the first transformer-based
WSSS approach, and introduce the Gradient-weighted Element-wise
Transformer Attention Map (GETAM). GETAM shows fine-scale activation for all
feature map elements, revealing different parts of the object across
transformer layers. Further, we propose an activation-aware label completion
module to generate high-quality pseudo labels. Finally, we incorporate our
methods into an end-to-end framework for WSSS using double backward
propagation. Extensive experiments on PASCAL VOC and COCO demonstrate that our
results beat the state-of-the-art end-to-end approaches by a significant
margin, and outperform most multi-stage methods.
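The gradient-weighting idea in the abstract can be sketched as an element-wise product between a transformer attention map and its gradient with respect to the class score. The snippet below is a schematic reading of that idea, not the paper's exact formulation; the tensor shapes, the ReLU gating, and the head-averaging step are all assumptions made for illustration.

```python
import numpy as np

def getam_map(attn, attn_grad):
    """Schematic GETAM-style scoring: element-wise product of an attention
    matrix and its gradient w.r.t. the class score, rectified.

    attn, attn_grad: (heads, tokens, tokens) arrays. Hypothetical shapes;
    in practice attn_grad would come from backpropagating the class logit.
    """
    weighted = np.maximum(attn * attn_grad, 0.0)  # element-wise gating + ReLU
    # Average over heads, then sum incoming attention per token to obtain a
    # per-token activation score.
    return weighted.mean(axis=0).sum(axis=0)

rng = np.random.default_rng(0)
attn = rng.random((4, 16, 16))          # stand-in attention maps
grad = rng.standard_normal((4, 16, 16)) # stand-in gradients
scores = getam_map(attn, grad)
print(scores.shape)  # (16,)
```

Because of the ReLU, every token score is non-negative, so the map can be normalized and thresholded like a conventional CAM.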
Related papers
- Background Activation Suppression for Weakly Supervised Object
Localization and Semantic Segmentation [84.62067728093358]
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels.
A new paradigm has emerged by generating a foreground prediction map to achieve pixel-level localization.
This paper presents two astonishing experimental observations on the object localization learning process.
arXiv Detail & Related papers (2023-09-22T15:44:10Z)
- All-pairs Consistency Learning for Weakly Supervised Semantic
Segmentation [42.66269050864235]
We propose a new transformer-based regularization to better localize objects for weakly-supervised semantic segmentation (WSSS).
We adopt vision transformers, as the self-attention mechanism naturally embeds pair-wise affinity.
Our method produces noticeably better class localization maps (67.3% mIoU on PASCAL VOC train).
arXiv Detail & Related papers (2023-08-08T15:14:23Z)
- UM-CAM: Uncertainty-weighted Multi-resolution Class Activation Maps for
Weakly-supervised Fetal Brain Segmentation [15.333308330432176]
We propose a novel weakly-supervised method with image-level labels based on semantic features and context information exploration.
Our proposed method outperforms state-of-the-art weakly-supervised methods with image-level labels.
arXiv Detail & Related papers (2023-06-20T12:21:13Z)
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly
Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model can reach the mIoU of 69.5% and 70.2% on validation and test sets for the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- Attention-based Class Activation Diffusion for Weakly-Supervised
Semantic Segmentation [98.306533433627]
Extracting class activation maps (CAMs) is a key step for weakly-supervised semantic segmentation (WSSS).
This paper proposes a new method to couple CAM and the attention matrix in a probabilistic diffusion way, and dubs it AD-CAM.
Experiments show that AD-CAM as pseudo labels can yield stronger WSSS models than the state-of-the-art variants of CAM.
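One plausible reading of coupling CAM with attention "in a probabilistic diffusion way" is to treat the row-normalized attention matrix as a transition matrix and propagate CAM scores along it. The sketch below is a hypothetical illustration of that mechanism, not AD-CAM's actual algorithm; the number of diffusion steps and the normalization scheme are assumptions.

```python
import numpy as np

def diffuse_cam(cam, attn, steps=2):
    """Toy attention-based CAM diffusion: row-normalize the attention matrix
    into a stochastic transition matrix, then push per-token CAM scores
    along attention edges for a few steps."""
    trans = attn / attn.sum(axis=1, keepdims=True)  # row-stochastic
    out = cam.copy()
    for _ in range(steps):
        out = trans.T @ out  # distribute activation to attended tokens
    return out

rng = np.random.default_rng(1)
attn = rng.random((16, 16))  # stand-in token-to-token attention
cam = rng.random(16)         # stand-in per-token CAM scores
refined = diffuse_cam(cam, attn)
```

With a row-stochastic transition matrix, each step redistributes activation mass without creating or destroying it, which keeps the diffused map comparable in scale to the original CAM.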
arXiv Detail & Related papers (2022-11-20T10:06:32Z)
- Saliency Guided Inter- and Intra-Class Relation Constraints for Weakly
Supervised Semantic Segmentation [66.87777732230884]
We propose a saliency guided Inter- and Intra-Class Relation Constrained (I$2$CRC) framework to assist the expansion of the activated object regions.
We also introduce an object-guided label refinement module to make full use of both the segmentation prediction and the initial labels for obtaining superior pseudo-labels.
arXiv Detail & Related papers (2022-06-20T03:40:56Z)
- Self-supervised Image-specific Prototype Exploration for Weakly
Supervised Semantic Segmentation [72.33139350241044]
Weakly Supervised Semantic Segmentation (WSSS) based on image-level labels has attracted much attention due to low annotation costs.
We propose a Self-supervised Image-specific Prototype Exploration (SIPE) that consists of an Image-specific Prototype Exploration (IPE) and a General-Specific Consistency (GSC) loss.
Our SIPE achieves new state-of-the-art performance using only image-level labels.
arXiv Detail & Related papers (2022-03-06T09:01:03Z)
- Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic
Segmentation with Transformers [44.757309147148035]
We introduce Transformers, which naturally integrate global information, to generate more integral initial pseudo labels for end-to-end WSSS.
Motivated by the inherent consistency between the self-attention in Transformers and the semantic affinity, we propose an Affinity from Attention (AFA) module.
We also devise a Pixel-Adaptive Refinement module that incorporates low-level image appearance information to refine the pseudo labels.
arXiv Detail & Related papers (2022-03-05T06:07:17Z)
- Semi-Supervised Domain Adaptation with Prototypical Alignment and
Consistency Learning [86.6929930921905]
This paper studies how much it can help address domain shifts if we further have a few target samples labeled.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
arXiv Detail & Related papers (2021-04-19T08:46:08Z)
- Puzzle-CAM: Improved localization via matching partial and full features [0.5482532589225552]
Weakly-supervised semantic segmentation (WSSS) is introduced to narrow the gap for semantic segmentation performance from pixel-level supervision to image-level supervision.
Most advanced approaches are based on class activation maps (CAMs) to generate pseudo-labels to train the segmentation network.
We propose Puzzle-CAM, a process that minimizes differences between the features from separate patches and the whole image.
In experiments, Puzzle-CAM outperformed previous state-of-the-art methods using the same labels for supervision on the PASCAL VOC 2012 dataset.
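The patch-versus-whole matching objective can be illustrated with a toy reconstruction loss: tile the input into quadrants, run each tile through a CAM-producing function, reassemble the tiles, and penalize disagreement with the full-image CAM. This is a simplified sketch only; Puzzle-CAM's actual pipeline operates on network feature maps and class-wise CAMs, and `tile_fn` here is a hypothetical stand-in for the network's CAM head.

```python
import numpy as np

def puzzle_consistency(full_cam, tile_fn, image):
    """Toy Puzzle-CAM-style loss: L1 distance between the full-image CAM and
    the CAM reassembled from the image's four quadrants."""
    h, w = image.shape
    hh, hw = h // 2, w // 2
    merged = np.zeros_like(full_cam)
    for i in (0, 1):
        for j in (0, 1):
            patch = image[i*hh:(i+1)*hh, j*hw:(j+1)*hw]
            merged[i*hh:(i+1)*hh, j*hw:(j+1)*hw] = tile_fn(patch)
    return np.abs(full_cam - merged).mean()  # L1 reconstruction loss

# Sanity check: with an identity "network", the reassembled tiles equal the
# image, so the reconstruction loss is zero.
img = np.arange(16.0).reshape(4, 4)
loss = puzzle_consistency(img, lambda p: p, img)
print(loss)  # 0.0
```

In training, this term would be added to the usual classification loss so the network is pushed to produce CAMs that are consistent between partial and full views of the object.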
arXiv Detail & Related papers (2021-01-27T08:19:38Z)
- Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels [15.664293530106637]
Zoom-CAM captures fine-grained small-scale objects for various discriminative class instances.
We focus on generating pixel-level pseudo-labels from class labels.
For weakly supervised semantic segmentation our generated pseudo-labels improve a state of the art model by 1.1%.
arXiv Detail & Related papers (2020-10-16T22:06:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.