Related papers: Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation

Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation

URL: http://arxiv.org/abs/2403.09199v1
Date: Thu, 14 Mar 2024 09:13:51 GMT
Title: Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation
Authors: Hyung-Il Kim, Kimin Yun, Jun-Seok Yun, Yuseok Bae,
Abstract summary: The Segment Anything Model (SAM) stands out for its remarkable progress in generalizability and flexibility for image segmentation tasks. We propose a novel method, customized instance segmentation via prompt learning tailored to SAM. Our method involves a prompt learning module (PLM), which adjusts input prompts into the embedding space to better align with user intentions.
Score: 7.6136466242670435
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, foundation models trained on massive datasets to adapt to a wide range of domains have attracted considerable attention and are actively being explored within the computer vision community. Among these, the Segment Anything Model (SAM) stands out for its remarkable progress in generalizability and flexibility for image segmentation tasks, achieved through prompt-based object mask generation. However, despite its strength, SAM faces two key limitations when applied to customized instance segmentation that segments specific objects or those in unique environments not typically present in the training data: 1) the ambiguity inherent in input prompts and 2) the necessity for extensive additional training to achieve optimal segmentation. To address these challenges, we propose a novel method, customized instance segmentation via prompt learning tailored to SAM. Our method involves a prompt learning module (PLM), which adjusts input prompts into the embedding space to better align with user intentions, thereby enabling more efficient training. Furthermore, we introduce a point matching module (PMM) to enhance the feature representation for finer segmentation by ensuring detailed alignment with ground truth boundaries. Experimental results on various customized instance segmentation scenarios demonstrate the effectiveness of the proposed method.

Related papers

Segment Concealed Objects with Incomplete Supervision [63.637733655439334]
Incompletely-Supervised Concealed Object (ISCOS) involves segmenting objects that seamlessly blend into their surrounding environments.<n>This task remains highly challenging due to the limited supervision provided by the incompletely annotated training data.<n>In this paper, we introduce the first unified method for ISCOS to address these challenges.
arXiv Detail & Related papers (2025-06-10T16:25:15Z)
One-shot In-context Part Segmentation [97.77292483684877]
We present the One-shot In-context Part (OIParts) framework to tackle the challenges of part segmentation. Our framework offers a novel approach to part segmentation that is training-free, flexible, and data-efficient. We have achieved remarkable segmentation performance across diverse object categories.
arXiv Detail & Related papers (2025-03-03T03:50:54Z)
Task Consistent Prototype Learning for Incremental Few-shot Semantic Segmentation [20.49085411104439]
Incremental Few-Shot Semantic (iFSS) tackles a task that requires a model to continually expand its segmentation capability on novel classes. This study introduces a meta-learning-based prototype approach that encourages the model to learn how to adapt quickly while preserving previous knowledge. Experiments on iFSS datasets built upon PASCAL and COCO benchmarks show the advanced performance of the proposed approach.
arXiv Detail & Related papers (2024-10-16T23:42:27Z)
Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance (UOIS) is crucial for autonomous robots operating in unstructured environments. We propose UOIS-SAM, a data-efficient solution for the UOIS task. UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts. We propose a novel framework, termed AlignSAM, designed for automatic prompting for aligning SAM to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z)
BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM [37.1263294647351]
We introduce BLO-SAM, which finetunes the Segment Anything Model (SAM) based on bi-level optimization (BLO) BLO-SAM reduces the risk of overfitting by training the model's weight parameters and the prompt embedding on two separate subsets of the training dataset. Results demonstrate BLO-SAM's superior performance over various state-of-the-art image semantic segmentation methods.
arXiv Detail & Related papers (2024-02-26T06:36:32Z)
Universal Segmentation at Arbitrary Granularity with Language Instruction [59.76130089644841]
We present UniLSeg, a universal segmentation model that can perform segmentation at any semantic level with the guidance of language instructions. For training UniLSeg, we reorganize a group of tasks from original diverse distributions into a unified data format, where images with texts describing segmentation targets as input and corresponding masks are output.
arXiv Detail & Related papers (2023-12-04T04:47:48Z)
AIMS: All-Inclusive Multi-Level Segmentation [93.5041381700744]
We propose a new task, All-Inclusive Multi-Level (AIMS), which segments visual regions into three levels: part, entity, and relation. We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
arXiv Detail & Related papers (2023-05-28T16:28:49Z)
Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping [40.07070188661184]
Weakly-Supervised Concealed Object (WSCOS) aims to segment objects well blended with surrounding environments. It is hard to distinguish concealed objects from the background due to the intrinsic similarity. We propose a new WSCOS method to address these two challenges.
arXiv Detail & Related papers (2023-05-18T14:31:34Z)
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality. Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting. We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions. Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.