Image Segmentation in Foundation Model Era: A Survey
- URL: http://arxiv.org/abs/2408.12957v2
- Date: Tue, 29 Oct 2024 04:05:35 GMT
- Title: Image Segmentation in Foundation Model Era: A Survey
- Authors: Tianfei Zhou, Fei Zhang, Boyu Chang, Wenguan Wang, Ye Yuan, Ender Konukoglu, Daniel Cremers
- Abstract summary: Current research in image segmentation lacks a detailed analysis of the distinct characteristics, challenges, and solutions associated with foundation model (FM) driven advances.
This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation.
An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts.
- Score: 99.19456390358211
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image segmentation is a long-standing challenge in computer vision, studied continuously over several decades, as evidenced by seminal algorithms such as N-Cut, FCN, and MaskFormer. With the advent of foundation models (FMs), contemporary segmentation methodologies have embarked on a new epoch by either adapting FMs (e.g., CLIP, Stable Diffusion, DINO) for image segmentation or developing dedicated segmentation foundation models (e.g., SAM). These approaches not only deliver superior segmentation performance, but also herald newfound segmentation capabilities previously unseen in the deep learning context. However, current research in image segmentation lacks a detailed analysis of the distinct characteristics, challenges, and solutions associated with these advancements. This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation. We investigate two basic lines of research, generic image segmentation (i.e., semantic segmentation, instance segmentation, panoptic segmentation) and promptable image segmentation (i.e., interactive segmentation, referring segmentation, few-shot segmentation), by delineating their respective task settings, background concepts, and key challenges. Furthermore, we provide insights into the emergence of segmentation knowledge from FMs like CLIP, Stable Diffusion, and DINO. An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts. Subsequently, we engage in a discussion of open issues and potential avenues for future research. We envisage that this fresh, comprehensive, and systematic survey will catalyze the evolution of advanced image segmentation systems.
Related papers
- USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation [33.11010205890195]
The main challenge in open-vocabulary image segmentation now lies in accurately classifying image segments into text-defined categories.
We introduce the Universal Segment Embedding (USE) framework to address this challenge.
This framework comprises two key components: 1) a data pipeline designed to efficiently curate a large number of segment-text pairs at various granularities, and 2) a universal segment embedding model that enables precise segment classification into a vast range of text-defined categories.
arXiv Detail & Related papers (2024-06-07T21:41:18Z)
- Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey [49.47197748663787]
This review aims to provide a first comprehensive and organized overview of the state-of-the-art research results on pseudo-label methods in the field of semi-supervised semantic segmentation.
In addition, we explore the application of pseudo-label technology in medical and remote-sensing image segmentation.
arXiv Detail & Related papers (2024-03-04T10:18:38Z)
- Weakly-Supervised Semantic Segmentation with Image-Level Labels: from Traditional Models to Foundation Models [33.690846523358836]
Weakly-supervised semantic segmentation (WSSS) is an effective way to avoid the need for pixel-level labels.
We focus on WSSS with image-level labels, which is the most challenging form of WSSS.
We investigate the applicability of visual foundation models, such as the Segment Anything Model (SAM), in the context of WSSS.
arXiv Detail & Related papers (2023-10-19T07:16:54Z)
- SamDSK: Combining Segment Anything Model with Domain-Specific Knowledge for Semi-Supervised Learning in Medical Image Segmentation [27.044797468878837]
The Segment Anything Model (SAM) exhibits a capability to segment a wide array of objects in natural images.
We propose a novel method that combines SAM with domain-specific knowledge for reliable utilization of unlabeled images.
Our work initiates a new direction of semi-supervised learning for medical image segmentation.
arXiv Detail & Related papers (2023-08-26T04:46:10Z)
- AIMS: All-Inclusive Multi-Level Segmentation [93.5041381700744]
We propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation.
We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
arXiv Detail & Related papers (2023-05-28T16:28:49Z)
- Semantic Image Segmentation: Two Decades of Research [22.533249554532322]
This book is an effort to summarize two decades of research in the field of semantic image segmentation (SiS).
We propose a review of solutions starting from early historical methods followed by an overview of more recent deep learning methods including the latest trend of using transformers.
We unveil newer trends such as multi-domain learning, domain generalization, domain incremental learning, test-time adaptation and source-free domain adaptation.
arXiv Detail & Related papers (2023-02-13T14:11:05Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotation effort.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- A Survey on Label-efficient Deep Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction [115.9169213834476]
This paper offers a comprehensive review of label-efficient segmentation methods.
We first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels.
Next, we summarize the existing label-efficient segmentation methods from a unified perspective.
arXiv Detail & Related papers (2022-07-04T06:21:01Z)
- Panoptic Segmentation: A Review [2.270719568619559]
This paper presents the first comprehensive review of existing panoptic segmentation methods.
Panoptic segmentation is currently under study to help gain a more nuanced understanding of image scenes for applications such as video surveillance, crowd counting, autonomous driving, and medical image analysis.
arXiv Detail & Related papers (2021-11-19T14:40:24Z)
- A Few Guidelines for Incremental Few-Shot Segmentation [57.34237650765928]
Given a pretrained segmentation model and a few images containing novel classes, our goal is to learn to segment novel classes while retaining the ability to segment previously seen ones.
We show that the main problems of end-to-end training in this scenario are:
i) the drift of the batch-normalization statistics toward novel classes, which we can fix with batch renormalization, and
ii) the forgetting of old classes, which we can fix with regularization strategies.
arXiv Detail & Related papers (2020-11-30T20:45:56Z)