Puzzle-CAM: Improved localization via matching partial and full features
- URL: http://arxiv.org/abs/2101.11253v2
- Date: Thu, 28 Jan 2021 03:34:21 GMT
- Title: Puzzle-CAM: Improved localization via matching partial and full features
- Authors: Sanghyun Jo, In-Jae Yu
- Abstract summary: Weakly-supervised semantic segmentation (WSSS) was introduced to narrow the performance gap of semantic segmentation between pixel-level supervision and image-level supervision.
Most advanced approaches are based on class activation maps (CAMs) to generate pseudo-labels to train the segmentation network.
We propose Puzzle-CAM, a process that minimizes differences between the features from separate patches and the whole image.
In experiments, Puzzle-CAM outperformed previous state-of-the-art methods using the same labels for supervision on the PASCAL VOC 2012 dataset.
- Score: 0.5482532589225552
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Weakly-supervised semantic segmentation (WSSS) was introduced to
narrow the performance gap of semantic segmentation between pixel-level
supervision and image-level supervision. Most advanced approaches are based on class activation
maps (CAMs) to generate pseudo-labels to train the segmentation network. The
main limitation of WSSS is that the process of generating pseudo-labels from
CAMs that use an image classifier is mainly focused on the most discriminative
parts of the objects. To address this issue, we propose Puzzle-CAM, a process
that minimizes differences between the features from separate patches and the
whole image. Our method consists of a puzzle module and two regularization
terms to discover the most integrated region in an object. Puzzle-CAM can
activate the overall region of an object using image-level supervision without
requiring extra parameters. In experiments, Puzzle-CAM outperformed previous
state-of-the-art methods using the same labels for supervision on the PASCAL
VOC 2012 dataset. Code associated with our experiments is available at
\url{https://github.com/OFRIN/PuzzleCAM}.
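The puzzle module and its reconstruction term can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the 2x2 tiling matches the paper, but the function names and the L1-style loss form are illustrative assumptions, and the 2D lists stand in for feature maps or CAMs.

```python
def tile_2x2(fmap):
    """Split a 2D map (list of lists) into four non-overlapping tiles."""
    h, w = len(fmap), len(fmap[0])
    hh, hw = h // 2, w // 2
    return [
        [row[:hw] for row in fmap[:hh]],  # top-left
        [row[hw:] for row in fmap[:hh]],  # top-right
        [row[:hw] for row in fmap[hh:]],  # bottom-left
        [row[hw:] for row in fmap[hh:]],  # bottom-right
    ]

def merge_2x2(tiles):
    """Reassemble four tile maps into one full-size map."""
    tl, tr, bl, br = tiles
    return [a + b for a, b in zip(tl, tr)] + [a + b for a, b in zip(bl, br)]

def reconstruction_loss(full_cam, merged_cam):
    """Mean absolute difference between the full-image CAM and the CAM
    merged back together from the separately processed tiles."""
    diffs = [abs(x - y)
             for rf, rm in zip(full_cam, merged_cam)
             for x, y in zip(rf, rm)]
    return sum(diffs) / len(diffs)
```

In training, each tile would be passed through the classifier independently before merging; minimizing the reconstruction term pushes the network to produce matching activations for the partial and full views, which encourages covering the whole object rather than only its most discriminative part.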
Related papers
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model can reach the mIoU of 69.5% and 70.2% on validation and test sets for the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- Saliency Guided Inter- and Intra-Class Relation Constraints for Weakly Supervised Semantic Segmentation [66.87777732230884]
We propose a saliency guided Inter- and Intra-Class Relation Constrained (I$2$CRC) framework to assist the expansion of the activated object regions.
We also introduce an object-guided label refinement module that makes full use of both the segmentation prediction and the initial labels to obtain superior pseudo-labels.
arXiv Detail & Related papers (2022-06-20T03:40:56Z)
- Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation [72.33139350241044]
Weakly Supervised Semantic Segmentation (WSSS) based on image-level labels has attracted much attention due to its low annotation cost.
We propose a Self-supervised Image-specific Prototype Exploration (SIPE) that consists of an Image-specific Prototype Exploration (IPE) and a General-Specific Consistency (GSC) loss.
Our SIPE achieves new state-of-the-art performance using only image-level labels.
arXiv Detail & Related papers (2022-03-06T09:01:03Z)
- GETAM: Gradient-weighted Element-wise Transformer Attention Map for Weakly-supervised Semantic Segmentation [29.184608129848105]
A Class Activation Map (CAM) is usually generated to provide pixel-level pseudo-labels.
Transformer based methods are highly effective at exploring global context with long range dependency modeling.
GETAM shows fine scale activation for all feature map elements, revealing different parts of the object across transformer layers.
arXiv Detail & Related papers (2021-12-06T08:02:32Z)
- TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization [112.46381729542658]
Weakly supervised object localization (WSOL) is a challenging problem when given image category labels.
We introduce the token semantic coupled attention map (TS-CAM) to take full advantage of the self-attention mechanism in visual transformer for long-range dependency extraction.
arXiv Detail & Related papers (2021-03-27T09:43:16Z)
- Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels [15.664293530106637]
Zoom-CAM captures fine-grained small-scale objects for various discriminative class instances.
We focus on generating pixel-level pseudo-labels from class labels.
For weakly supervised semantic segmentation, our generated pseudo-labels improve a state-of-the-art model by 1.1%.
arXiv Detail & Related papers (2020-10-16T22:06:43Z)
- Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation [93.83369981759996]
We propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap.
Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation.
We propose consistency regularization on predicted CAMs from various transformed images to provide self-supervision for network learning.
arXiv Detail & Related papers (2020-04-09T14:57:57Z)
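The equivariance constraint behind SEAM (CAMs from a transformed image should match the transformed CAMs of the original image) can be illustrated with a toy check. This is a hedged sketch, not the authors' code: the horizontal flip, the function names, and the stand-in `cam_fn` maps are all illustrative assumptions.

```python
def hflip(fmap):
    """Horizontally flip a 2D map (list of lists)."""
    return [row[::-1] for row in fmap]

def equivariance_gap(cam_fn, image):
    """Mean absolute gap between CAM(flip(image)) and flip(CAM(image));
    zero exactly when cam_fn commutes with the flip on this input."""
    a = cam_fn(hflip(image))      # CAM of the transformed image
    b = hflip(cam_fn(image))      # transformed CAM of the original image
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

SEAM turns this gap into a consistency regularizer: penalizing it during training supplies extra self-supervision beyond the image-level labels alone.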
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.