Segment Anything Model (SAM) Enhanced Pseudo Labels for Weakly
Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2305.05803v4
- Date: Fri, 3 Nov 2023 18:35:19 GMT
- Title: Segment Anything Model (SAM) Enhanced Pseudo Labels for Weakly
Supervised Semantic Segmentation
- Authors: Tianle Chen, Zheda Mai, Ruiwen Li, Wei-Lun Chao
- Abstract summary: Weakly supervised semantic segmentation (WSSS) aims to bypass the need for laborious pixel-level annotation by using only image-level annotation.
Most existing methods rely on Class Activation Maps (CAM) to derive pixel-level pseudo-labels.
We introduce a simple yet effective method harnessing the Segment Anything Model (SAM), a class-agnostic foundation model capable of producing fine-grained instance masks of objects, parts, and subparts.
- Score: 30.812323329239614
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly supervised semantic segmentation (WSSS) aims to bypass the need for
laborious pixel-level annotation by using only image-level annotation. Most
existing methods rely on Class Activation Maps (CAM) to derive pixel-level
pseudo-labels and use them to train a fully supervised semantic segmentation
model. Although these pseudo-labels are class-aware, indicating the coarse
regions for particular classes, they are not object-aware and fail to delineate
accurate object boundaries. To address this, we introduce a simple yet
effective method harnessing the Segment Anything Model (SAM), a class-agnostic
foundation model capable of producing fine-grained instance masks of objects,
parts, and subparts. We use CAM pseudo-labels as cues to select and combine SAM
masks, resulting in high-quality pseudo-labels that are both class-aware and
object-aware. Our approach is highly versatile and can be easily integrated
into existing WSSS methods without any modification. Despite its simplicity,
our approach shows consistent gain over the state-of-the-art WSSS methods on
both PASCAL VOC and MS-COCO datasets.
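The selection-and-combination step described in the abstract lends itself to a short sketch. The snippet below is a minimal, hypothetical NumPy illustration of one plausible rule, not the authors' code: each class-agnostic SAM mask is assigned the CAM class that covers most of it, and is kept only if that coverage exceeds a threshold. The function name, the 0.5 threshold, and the majority-overlap selection rule are assumptions for illustration; the paper's exact criteria may differ.

```python
import numpy as np

def enhance_pseudo_label(cam_label, sam_masks, overlap_thresh=0.5, ignore_index=255):
    """Combine class-agnostic SAM masks into a class-aware pseudo-label.

    cam_label: (H, W) int array of class ids from a CAM pseudo-label.
    sam_masks: list of (H, W) bool arrays, e.g. from SAM's automatic mask generator.
    Returns an (H, W) int array where each selected SAM mask is filled with the
    class that dominates its overlap with the CAM pseudo-label (assumed rule).
    """
    enhanced = np.full_like(cam_label, ignore_index)
    classes = [c for c in np.unique(cam_label) if c != ignore_index]
    if not classes:
        return enhanced
    for mask in sam_masks:
        area = mask.sum()
        if area == 0:
            continue
        # Fraction of this SAM mask covered by each CAM class region.
        overlaps = {c: np.logical_and(mask, cam_label == c).sum() / area
                    for c in classes}
        best_class, best_overlap = max(overlaps.items(), key=lambda kv: kv[1])
        # Keep the mask only if one class clearly claims it (assumed threshold).
        if best_overlap >= overlap_thresh:
            enhanced[mask] = best_class
    return enhanced
```

In practice the masks would come from SAM's automatic mask generator (e.g., `SamAutomaticMaskGenerator` in the `segment-anything` package), and the enhanced map would replace the raw CAM pseudo-label when training the fully supervised segmentation model.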
Related papers
- Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach [7.012760526318993]
Weakly-Supervised Semantic Segmentation (WSSS) offers a cost-efficient workaround to extensive labeling.
Existing WSSS methods have difficulty learning object boundaries, leading to poor segmentation results.
We propose a novel and effective framework that addresses these issues by leveraging visual foundation models inside the bounding box.
arXiv Detail & Related papers (2024-05-10T16:42:25Z)
- PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-native CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z)
- Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model that enables segmenting and recognizing anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z)
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model reaches 69.5% and 70.2% mIoU on the PASCAL VOC 2012 validation and test sets, respectively.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- Saliency Guided Inter- and Intra-Class Relation Constraints for Weakly Supervised Semantic Segmentation [66.87777732230884]
We propose a saliency guided Inter- and Intra-Class Relation Constrained (I$^2$CRC) framework to assist the expansion of the activated object regions.
We also introduce an object guided label refinement module to make full use of both the segmentation prediction and the initial labels for obtaining superior pseudo-labels.
arXiv Detail & Related papers (2022-06-20T03:40:56Z)
- Inferring the Class Conditional Response Map for Weakly Supervised Semantic Segmentation [27.269847900950943]
We propose a class-conditional inference strategy and an activation-aware mask refinement loss function to generate better pseudo labels.
Our method achieves superior WSSS results without requiring re-training of the classifier.
arXiv Detail & Related papers (2021-10-27T09:43:40Z)
- Towards Single Stage Weakly Supervised Semantic Segmentation [2.28438857884398]
We present a single-stage approach to weakly supervised semantic segmentation.
We use point annotations to generate reliable, on-the-fly pseudo-masks.
We significantly outperform other SOTA WSSS methods on recent real-world datasets.
arXiv Detail & Related papers (2021-06-18T18:34:50Z)
- A Closer Look at Self-training for Zero-Label Semantic Segmentation [53.4488444382874]
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning.
Prior zero-label semantic segmentation works approach this task by learning visual-semantic embeddings or generative models.
We propose a consistency regularizer that filters out noisy pseudo-labels by taking the intersection of the pseudo-labels generated from different augmentations of the same image (see the sketch below).
arXiv Detail & Related papers (2021-04-21T14:34:33Z)
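The intersection filter from the entry above is concrete enough to sketch. The hypothetical NumPy snippet below assumes the per-augmentation predictions have already been warped back to the original image geometry; the function name and the `ignore_index` convention are illustrative assumptions, not the paper's code.

```python
import numpy as np

def intersect_pseudo_labels(label_maps, ignore_index=255):
    """label_maps: list of (H, W) int arrays, each the pseudo-label predicted
    from a different augmentation of the same image (aligned back to the
    original geometry). Pixels where the maps disagree are set to ignore."""
    out = label_maps[0].copy()
    for lm in label_maps[1:]:
        out[out != lm] = ignore_index  # drop pixels without consensus
    return out
```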
- Causal Intervention for Weakly-Supervised Semantic Segmentation [122.1846968696862]
We aim to generate better pixel-level pseudo-masks by using only image-level labels.
We propose a structural causal model to analyze the causalities among images, contexts, and class labels.
Based on it, we develop a new method: Context Adjustment (CONTA), to remove the confounding bias in image-level classification.
arXiv Detail & Related papers (2020-09-26T09:26:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.