Semantic-aware SAM for Point-Prompted Instance Segmentation
- URL: http://arxiv.org/abs/2312.15895v2
- Date: Sun, 26 May 2024 05:19:06 GMT
- Title: Semantic-aware SAM for Point-Prompted Instance Segmentation
- Authors: Zhaoyang Wei, Pengfei Chen, Xuehui Yu, Guorong Li, Jianbin Jiao, Zhenjun Han
- Abstract summary: In this paper, we introduce a cost-effective category-specific segmenter using Segment Anything (SAM).
To tackle this challenge, we have devised a Semantic-Aware Instance Segmentation Network (SAPNet) that integrates Multiple Instance Learning (MIL) with matching capability and SAM with point prompts.
SAPNet strategically selects the most representative mask proposals generated by SAM to supervise segmentation, with a specific focus on object category information.
- Score: 29.286913777078116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-point annotation in visual tasks, with the goal of minimizing labelling costs, is becoming increasingly prominent in research. Recently, visual foundation models, such as Segment Anything (SAM), have gained widespread usage due to their robust zero-shot capabilities and exceptional annotation performance. However, SAM's class-agnostic output and high confidence in local segmentation introduce 'semantic ambiguity', posing a challenge for precise category-specific segmentation. In this paper, we introduce a cost-effective category-specific segmenter using SAM. To tackle this challenge, we have devised a Semantic-Aware Instance Segmentation Network (SAPNet) that integrates Multiple Instance Learning (MIL) with matching capability and SAM with point prompts. SAPNet strategically selects the most representative mask proposals generated by SAM to supervise segmentation, with a specific focus on object category information. Moreover, we introduce the Point Distance Guidance and Box Mining Strategy to mitigate inherent challenges: 'group' and 'local' issues in weakly supervised segmentation. These strategies serve to further enhance the overall segmentation performance. The experimental results on Pascal VOC and COCO demonstrate the promising performance of our proposed SAPNet, emphasizing its semantic matching capabilities and its potential to advance point-prompted instance segmentation. The code will be made publicly available.
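The selection step described in the abstract (choosing the most representative SAM mask proposal for an annotated category) can be illustrated with a generic MIL-style scoring scheme. The following is a minimal sketch under stated assumptions, not the authors' SAPNet implementation: the function name `select_proposal`, the two-branch scoring (a softmax over classes multiplied by a softmax over proposals, as in common weakly supervised detection heads), and the toy logits are all hypothetical, and the logits are assumed to come from some classifier applied to features of each mask proposal.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def select_proposal(class_logits, instance_logits, category):
    """Pick one mask proposal from a bag (hypothetical helper).

    class_logits, instance_logits: arrays of shape (num_proposals, num_classes),
    assumed to be produced by a classifier over per-proposal features.
    category: index of the class annotated at the prompt point.
    Returns the index of the selected proposal and the bag-level class scores.
    """
    cls = softmax(class_logits, axis=1)      # which class each proposal looks like
    ins = softmax(instance_logits, axis=0)   # which proposal best represents each class
    scores = cls * ins                       # per-proposal, per-class score
    bag_score = scores.sum(axis=0)           # bag-level score, usable with an image-level loss
    best = int(np.argmax(scores[:, category]))
    return best, bag_score


# Toy bag of three proposals and two classes (made-up numbers for illustration).
class_logits = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
instance_logits = np.array([[0.0, 0.0], [0.0, 0.0], [3.0, 0.0]])

best, bag_score = select_proposal(class_logits, instance_logits, category=0)
print("selected proposal:", best, "bag scores:", bag_score)
```

In this sketch the bag-level score stays in [0, 1] for every class, so it could be trained with a binary cross-entropy loss against the point's category label; the selected proposal would then serve as a pseudo mask for supervising a segmenter.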
Related papers
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- Evaluation Study on SAM 2 for Class-agnostic Instance-level Segmentation [2.5524809198548137]
The Segment Anything Model (SAM) has demonstrated powerful zero-shot segmentation performance in natural scenes.
The recently released Segment Anything Model 2 (SAM2) has further heightened researchers' expectations of image segmentation capabilities.
This technical report can drive the emergence of SAM2-based adapters, aiming to raise the performance ceiling of large vision models on class-agnostic instance segmentation tasks.
arXiv Detail & Related papers (2024-09-04T09:35:09Z)
- SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation [88.80792308991867]
The Segment Anything Model (SAM) has shown the ability to group image pixels into patches, but applying it to semantic-aware segmentation still faces major challenges.
This paper presents SAM-CP, a simple approach that establishes two types of composable prompts beyond SAM and composes them for versatile segmentation.
Experiments show that SAM-CP achieves semantic, instance, and panoptic segmentation in both open and closed domains.
arXiv Detail & Related papers (2024-07-23T17:47:25Z)
- PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-language CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z)
- Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping [40.07070188661184]
Weakly-Supervised Concealed Object Segmentation (WSCOS) aims to segment objects well blended with surrounding environments.
It is hard to distinguish concealed objects from the background due to the intrinsic similarity.
We propose a new WSCOS method to address these two challenges.
arXiv Detail & Related papers (2023-05-18T14:31:34Z)
- Active Pointly-Supervised Instance Segmentation [106.38955769817747]
We present an economic active learning setting, named active pointly-supervised instance segmentation (APIS).
APIS starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
The model developed with these strategies yields consistent performance gain on the challenging MS-COCO dataset.
arXiv Detail & Related papers (2022-07-23T11:25:24Z)
- Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of interest instances on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z)
- SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation [88.22349093672975]
We design a weakly supervised point cloud segmentation algorithm that only requires clicking on one point per instance to indicate its location for annotation.
With over-segmentation for pre-processing, we extend these location annotations into segments as seg-level labels.
We show that our seg-level supervised method (SegGroup) achieves comparable results with the fully annotated point-level supervised methods.
arXiv Detail & Related papers (2020-12-18T13:23:34Z)
- SASO: Joint 3D Semantic-Instance Segmentation via Multi-scale Semantic Association and Salient Point Clustering Optimization [8.519716460338518]
We propose a novel 3D point cloud segmentation framework named SASO, which jointly performs semantic and instance segmentation tasks.
For the semantic segmentation task, inspired by the inherent correlation among objects in spatial context, we propose a Multi-scale Semantic Association (MSA) module.
For the instance segmentation task, unlike previous works that use clustering only in the inference procedure, we propose a Salient Point Clustering Optimization (SPCO) module.
arXiv Detail & Related papers (2020-06-25T08:55:25Z)