Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels
- URL: http://arxiv.org/abs/2010.08644v1
- Date: Fri, 16 Oct 2020 22:06:43 GMT
- Title: Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels
- Authors: Xiangwei Shi, Seyran Khademi, Yunqiang Li, Jan van Gemert
- Abstract summary: Zoom-CAM captures fine-grained small-scale objects for various discriminative class instances.
We focus on generating pixel-level pseudo-labels from class labels.
For weakly supervised semantic segmentation, our generated pseudo-labels improve a state-of-the-art model by 1.1%.
- Score: 15.664293530106637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current weakly supervised object localization and segmentation rely on
class-discriminative visualization techniques to generate pseudo-labels for
pixel-level training. Such visualization methods, including class activation
mapping (CAM) and Grad-CAM, use only the deepest, lowest resolution
convolutional layer, missing all information in intermediate layers. We propose
Zoom-CAM: going beyond the last lowest resolution layer by integrating the
importance maps over all activations in intermediate layers. Zoom-CAM captures
fine-grained small-scale objects for various discriminative class instances,
which are commonly missed by the baseline visualization methods. We focus on
generating pixel-level pseudo-labels from class labels. The quality of our
pseudo-labels, evaluated on the ImageNet localization task, shows an improvement
of more than 2.8% in top-1 error. For weakly supervised semantic segmentation,
our generated pseudo-labels improve a state-of-the-art model by 1.1%.
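To make the idea concrete, here is a minimal PyTorch sketch of fusing Grad-CAM-style importance maps from several intermediate layers, as the abstract describes. The VGG-16 tap points and the element-wise-maximum fusion rule are illustrative assumptions, not the paper's exact formulation.
```python
# A minimal sketch: Grad-CAM-style maps from several intermediate
# layers of a pretrained VGG-16, upsampled and fused. Tap points and
# the max-fusion rule are illustrative choices.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.vgg16(weights="IMAGENET1K_V1").eval()

# Hook several intermediate layers, not only the deepest one as plain
# CAM/Grad-CAM does.
taps = [model.features[i] for i in (15, 22, 29)]  # assumed tap points
acts, grads = {}, {}
for i, layer in enumerate(taps):
    layer.register_forward_hook(lambda m, x, y, i=i: acts.__setitem__(i, y))
    layer.register_full_backward_hook(
        lambda m, gi, go, i=i: grads.__setitem__(i, go[0]))

image = torch.randn(1, 3, 224, 224)    # stand-in for a preprocessed image
scores = model(image)
scores[0, scores.argmax()].backward()  # gradient of the predicted class

layer_maps = []
for i in range(len(taps)):
    w = grads[i].mean(dim=(2, 3), keepdim=True)           # channel weights
    cam = F.relu((w * acts[i]).sum(dim=1, keepdim=True))  # layer-wise map
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    layer_maps.append(cam / (cam.max() + 1e-8))           # per-layer norm

# Fuse the upsampled maps; the maximum keeps fine-grained responses
# that the deepest, lowest-resolution layer alone would miss.
zoom_cam = torch.stack(layer_maps).max(dim=0).values      # [1, 1, 224, 224]
```
The fused map can then be thresholded into pixel-level pseudo-labels for training a segmentation network, which is how such maps are used downstream in the paper.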
Related papers
- Learning Camouflaged Object Detection from Noisy Pseudo Label [60.9005578956798]
This paper introduces the first weakly semi-supervised Camouflaged Object Detection (COD) method.
It aims for budget-efficient and high-precision camouflaged object segmentation with an extremely limited number of fully labeled images.
We propose a noise correction loss that helps the model learn correct pixels during the early learning stage (a hedged sketch follows this entry).
When using only 20% of fully labeled data, our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T04:53:51Z)
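A hedged sketch of what such a noise-aware pixel loss could look like: since networks tend to fit clean pixels first during early learning, pixels where the model confidently disagrees with the noisy pseudo-label are softly down-weighted. The agreement-based weighting is an illustrative assumption, not the paper's actual loss.
```python
# Illustrative noise-robust pixel loss; the weighting rule is assumed.
import torch
import torch.nn.functional as F

def noise_corrected_bce(logits, noisy_label):
    """logits, noisy_label: [B, 1, H, W]; label is a noisy float mask in [0, 1]."""
    p = torch.sigmoid(logits)
    bce = F.binary_cross_entropy_with_logits(logits, noisy_label,
                                             reduction="none")
    # Trust pixels where the model agrees with the pseudo-label; softly
    # discount likely label noise where it confidently disagrees.
    agreement = 1.0 - (p.detach() - noisy_label).abs()
    return (agreement * bce).mean()
```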
- Segment Anything Model (SAM) Enhanced Pseudo Labels for Weakly Supervised Semantic Segmentation [30.812323329239614]
Weakly supervised semantic segmentation (WSSS) aims to bypass the need for laborious pixel-level annotation by using only image-level annotation.
Most existing methods rely on Class Activation Maps (CAM) to derive pixel-level pseudo-labels.
We introduce a simple yet effective method harnessing the Segment Anything Model (SAM), a class-agnostic foundation model capable of producing fine-grained instance masks of objects, parts, and subparts.
arXiv Detail & Related papers (2023-05-09T23:24:09Z)
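As one plausible reading of the idea above, the sketch below uses SAM's automatic mask generator to snap a coarse CAM pseudo-label to crisp instance boundaries. The checkpoint path, the 0.5 thresholds, and the majority-overlap rule are all assumptions for illustration.
```python
# Hedged sketch: refine a coarse CAM pseudo-label with SAM masks.
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # assumed path
mask_generator = SamAutomaticMaskGenerator(sam)

def refine_pseudo_label(image: np.ndarray, cam: np.ndarray) -> np.ndarray:
    """image: HxWx3 uint8 RGB; cam: HxW activation map scaled to [0, 1]."""
    coarse = cam > 0.5                      # coarse CAM foreground
    refined = np.zeros_like(coarse)
    for m in mask_generator.generate(image):
        seg = m["segmentation"]             # boolean HxW instance mask
        # Keep a SAM mask if most of it lies inside the CAM region,
        # snapping the ragged CAM boundary to SAM's crisp edges.
        if (seg & coarse).sum() > 0.5 * seg.sum():
            refined |= seg
    return refined
```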
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model can reach the mIoU of 69.5% and 70.2% on validation and test sets for the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the usually vast data-annotation cost by using surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs: importance sampling, a substitute for global average pooling (GAP), and a feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits: performance improves and the techniques become more general, resulting in an add-on method that can boost virtually any WSSS method (a rough sketch follows this entry).
arXiv Detail & Related papers (2023-04-05T17:43:57Z)
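A minimal sketch of the two reformulated ingredients, assuming each class is treated as an independent binary (binomial) problem: per-pixel sigmoid posteriors, aggregated by an activation-weighted expectation in place of GAP, a soft stand-in for the paper's importance sampling. The exact aggregation is an assumption.
```python
# Sketch under assumptions: per-class binary posteriors + weighted
# pooling instead of uniform global average pooling (GAP).
import torch

def binomial_posteriors(logits: torch.Tensor) -> torch.Tensor:
    """logits: [B, C, H, W] -> per-pixel class posteriors in [0, 1]."""
    return torch.sigmoid(logits)  # independent binary problem per class

def importance_weighted_score(post: torch.Tensor) -> torch.Tensor:
    """Image-level score per class as an activation-weighted expectation
    over pixels; high-posterior pixels dominate, unlike uniform GAP."""
    b, c, h, w = post.shape
    flat = post.reshape(b, c, h * w)
    weights = flat / flat.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    return (weights * flat).sum(dim=-1)    # [B, C]

logits = torch.randn(2, 20, 32, 32)        # e.g. 20 PASCAL VOC classes
scores = importance_weighted_score(binomial_posteriors(logits))
```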
- Saliency Guided Inter- and Intra-Class Relation Constraints for Weakly Supervised Semantic Segmentation [66.87777732230884]
We propose a saliency guided Inter- and Intra-Class Relation Constrained (I²CRC) framework to assist the expansion of the activated object regions.
We also introduce an object guided label refinement module to make full use of both the segmentation prediction and the initial labels to obtain superior pseudo-labels.
arXiv Detail & Related papers (2022-06-20T03:40:56Z)
- Inferring the Class Conditional Response Map for Weakly Supervised Semantic Segmentation [27.269847900950943]
We propose a class-conditional inference strategy and an activation-aware mask refinement loss function to generate better pseudo-labels.
Our method achieves superior WSSS results without requiring re-training of the classifier.
arXiv Detail & Related papers (2021-10-27T09:43:40Z)
- Mixed Supervision Learning for Whole Slide Image Classification [88.31842052998319]
We propose a mixed supervision learning framework for super-high-resolution images.
During the patch training stage, this framework can make use of coarse image-level labels to refine self-supervised learning.
A comprehensive strategy is proposed to suppress pixel-level false positives and false negatives.
arXiv Detail & Related papers (2021-07-02T09:46:06Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves a top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
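One hedged way to realize cross-image pixel similarity, as described above, is to pool CAM-weighted object vectors from two images of the same class and penalize their cosine distance. The pooling and the loss below are illustrative assumptions, not the paper's exact objective.
```python
# Sketch under assumptions: cross-image consistency between CAM-pooled
# object vectors of two images sharing a class label.
import torch
import torch.nn.functional as F

def object_seed(features: torch.Tensor, cam: torch.Tensor) -> torch.Tensor:
    """features: [C, H, W]; cam: [H, W] -> CAM-weighted object vector [C]."""
    w = (cam / cam.sum().clamp_min(1e-8)).flatten()  # normalized weights
    return features.flatten(1) @ w

def inter_image_consistency(f1, cam1, f2, cam2) -> torch.Tensor:
    """Pull the object vectors of two same-class images together."""
    s1, s2 = object_seed(f1, cam1), object_seed(f2, cam2)
    return 1.0 - F.cosine_similarity(s1, s2, dim=0)
```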
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.