Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation
- URL: http://arxiv.org/abs/2403.09199v1
- Date: Thu, 14 Mar 2024 09:13:51 GMT
- Title: Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation
- Authors: Hyung-Il Kim, Kimin Yun, Jun-Seok Yun, Yuseok Bae,
- Abstract summary: The Segment Anything Model (SAM) stands out for its remarkable progress in generalizability and flexibility for image segmentation tasks.
We propose a novel method, customized instance segmentation via prompt learning tailored to SAM.
Our method involves a prompt learning module (PLM), which adjusts input prompts into the embedding space to better align with user intentions.
- Score: 7.6136466242670435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, foundation models trained on massive datasets to adapt to a wide range of domains have attracted considerable attention and are actively being explored within the computer vision community. Among these, the Segment Anything Model (SAM) stands out for its remarkable progress in generalizability and flexibility for image segmentation tasks, achieved through prompt-based object mask generation. However, despite its strength, SAM faces two key limitations when applied to customized instance segmentation that segments specific objects or those in unique environments not typically present in the training data: 1) the ambiguity inherent in input prompts and 2) the necessity for extensive additional training to achieve optimal segmentation. To address these challenges, we propose a novel method, customized instance segmentation via prompt learning tailored to SAM. Our method involves a prompt learning module (PLM), which adjusts input prompts into the embedding space to better align with user intentions, thereby enabling more efficient training. Furthermore, we introduce a point matching module (PMM) to enhance the feature representation for finer segmentation by ensuring detailed alignment with ground truth boundaries. Experimental results on various customized instance segmentation scenarios demonstrate the effectiveness of the proposed method.
Related papers
- AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z) - PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-native CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z) - Semantic-aware SAM for Point-Prompted Instance Segmentation [29.286913777078116]
In this paper, we introduce a cost-effective category-specific segmenter using Segment Anything (SAM)
To tackle this challenge, we have devised a Semantic-Aware Instance Network (SAPNet) that integrates Multiple Instance Learning (MIL) with matching capability and SAM with point prompts.
SAPNet strategically selects the most representative mask proposals generated by SAM to supervise segmentation, with a specific focus on object category information.
arXiv Detail & Related papers (2023-12-26T05:56:44Z) - Universal Segmentation at Arbitrary Granularity with Language
Instruction [59.76130089644841]
We present UniLSeg, a universal segmentation model that can perform segmentation at any semantic level with the guidance of language instructions.
For training UniLSeg, we reorganize a group of tasks from original diverse distributions into a unified data format, where images with texts describing segmentation targets as input and corresponding masks are output.
arXiv Detail & Related papers (2023-12-04T04:47:48Z) - Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model to enable segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z) - RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation
based on Visual Foundation Model [29.42043345787285]
We propose a method to learn the generation of appropriate prompts for Segment Anything Model (SAM)
This enables SAM to produce semantically discernible segmentation results for remote sensing images.
We also propose several ongoing derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter.
arXiv Detail & Related papers (2023-06-28T14:51:34Z) - Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo
Labeling and Multi-scale Feature Grouping [40.07070188661184]
Weakly-Supervised Concealed Object (WSCOS) aims to segment objects well blended with surrounding environments.
It is hard to distinguish concealed objects from the background due to the intrinsic similarity.
We propose a new WSCOS method to address these two challenges.
arXiv Detail & Related papers (2023-05-18T14:31:34Z) - FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation [42.89720785573885]
FreeSeg is a generic framework to accomplish Unified, Universal and Open-Vocabulary Image.
We show that FreeSeg establishes new state-of-the-art results in performance and generalization on three segmentation tasks.
arXiv Detail & Related papers (2023-03-30T08:42:49Z) - UCP-Net: Unstructured Contour Points for Instance Segmentation [2.105564340986074]
We propose a novel approach to interactive segmentation based on unconstrained contour clicks for initial segmentation and segmentation refinement.
Our method is class-agnostic and produces accurate segmentation masks (IoU > 85%) for a lower number of user interactions than state-of-the-art methods on popular segmentation datasets.
arXiv Detail & Related papers (2021-09-15T22:03:37Z) - Semantic Attention and Scale Complementary Network for Instance
Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB)
SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of interest instances on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z) - Target-Aware Object Discovery and Association for Unsupervised Video
Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient unseen-temporal segmentation.
We evaluate the proposed approach on DAVIS$_17$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.