SegGPT: Segmenting Everything In Context
- URL: http://arxiv.org/abs/2304.03284v1
- Date: Thu, 6 Apr 2023 17:59:57 GMT
- Title: SegGPT: Segmenting Everything In Context
- Authors: Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun
Huang
- Abstract summary: We present SegGPT, a model for segmenting everything in context.
We unify various segmentation tasks into a generalist in-context learning framework.
SegGPT can perform arbitrary segmentation tasks in images or videos via in-context inference.
- Score: 98.98487097934067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present SegGPT, a generalist model for segmenting everything in context.
We unify various segmentation tasks into a generalist in-context learning
framework that accommodates different kinds of segmentation data by
transforming them into the same format of images. The training of SegGPT is
formulated as an in-context coloring problem with random color mapping for each
data sample. The objective is to accomplish diverse tasks according to the
context, rather than relying on specific colors. After training, SegGPT can
perform arbitrary segmentation tasks in images or videos via in-context
inference, such as object instance, stuff, part, contour, and text. SegGPT is
evaluated on a broad range of tasks, including few-shot semantic segmentation,
video object segmentation, semantic segmentation, and panoptic segmentation.
Our results show strong capabilities in segmenting in-domain and out-of-domain
targets, both qualitatively and quantitatively.
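As a reading aid, here is a toy sketch of the in-context coloring idea described above: each training sample gets a fresh random class-to-color mapping, so the model must infer the coloring scheme from the context pair rather than memorize fixed class colors. The names (`random_color_mapping`, `colorize`) and shapes are our own illustration, not the paper's code.

```python
import numpy as np

def random_color_mapping(class_ids, rng):
    """Draw a fresh random RGB color for every class ID in this sample,
    so a fixed 'class -> color' rule cannot be memorized across samples."""
    return {int(c): rng.integers(0, 256, size=3) for c in class_ids}

def colorize(label_map, mapping):
    """Render an integer label map as an RGB coloring target."""
    out = np.zeros(label_map.shape + (3,), dtype=np.uint8)
    for c, color in mapping.items():
        out[label_map == c] = color
    return out

# One training sample: a context pair and a query pair sharing the
# same random color scheme; the model regresses the query target.
rng = np.random.default_rng(0)
context_labels = rng.integers(0, 4, size=(8, 8))   # toy label maps
query_labels = rng.integers(0, 4, size=(8, 8))
classes = np.union1d(context_labels, query_labels)
mapping = random_color_mapping(classes, rng)
context_target = colorize(context_labels, mapping)  # shown as context
query_target = colorize(query_labels, mapping)      # training target
```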
Related papers
- USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation [33.11010205890195]
The main challenge in open-vocabulary image segmentation now lies in accurately classifying image segments into text-defined categories.
We introduce the Universal Segment Embedding (USE) framework to address this challenge.
This framework comprises two key components: 1) a data pipeline designed to efficiently curate a large number of segment-text pairs at various granularities, and 2) a universal segment embedding model that enables precise segment classification into a vast range of text-defined categories.
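The classification step this blurb hinges on can be sketched minimally, assuming CLIP-style segment and text embeddings in one shared space (the function name and the cosine-similarity choice are our assumptions, not necessarily USE's exact formulation):

```python
import numpy as np

def classify_segments(segment_embs, text_embs, category_names):
    """Assign each segment to the text category with the highest
    cosine similarity; both embedding sets are assumed to live in
    one shared vision-language space."""
    s = segment_embs / np.linalg.norm(segment_embs, axis=1, keepdims=True)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = s @ t.T                    # (num_segments, num_categories)
    return [category_names[i] for i in sims.argmax(axis=1)]

# Toy usage with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
print(classify_segments(rng.normal(size=(4, 512)),
                        rng.normal(size=(3, 512)),
                        ["cat", "dog", "grass"]))
```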
arXiv Detail & Related papers (2024-06-07T21:41:18Z)
- Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks by leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
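One minimal way to realize the pseudo-labeling step described above is to cluster dense per-pixel features with k-means; the features coming from a self-supervised backbone such as DINO is our assumption, and U2Seg's actual pipeline may differ.

```python
import numpy as np

def pseudo_labels_from_features(features, num_clusters=8, iters=10, seed=0):
    """Cluster dense self-supervised features (H, W, D) into pseudo
    semantic labels (H, W) with a tiny k-means loop."""
    h, w, d = features.shape
    x = features.reshape(-1, d)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), num_clusters, replace=False)]
    for _ in range(iters):
        # Assign each pixel to its nearest cluster center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties.
        for k in range(num_clusters):
            if (assign == k).any():
                centers[k] = x[assign == k].mean(axis=0)
    return assign.reshape(h, w)

# Toy dense features; in practice these would come from a
# self-supervised vision backbone.
labels = pseudo_labels_from_features(np.random.rand(16, 16, 32))
```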
arXiv Detail & Related papers (2023-12-28T18:59:04Z)
- SAMBA: A Trainable Segmentation Web-App with Smart Labelling [0.0]
SAMBA is a trainable segmentation tool that uses Meta's Segment Anything Model (SAM) for fast, high-quality label suggestions.
The segmentation backend runs in the cloud, so the user does not need powerful hardware.
arXiv Detail & Related papers (2023-12-07T10:31:05Z)
- SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation [87.18373801829314]
In-context segmentation aims to segment novel images using a few labeled example images, termed "in-context examples".
We propose SEGIC, an end-to-end segment-in-context framework built upon a single vision foundation model (VFM).
SEGIC is a straightforward yet effective approach that yields state-of-the-art performance on one-shot segmentation benchmarks.
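The "emergent correspondence" idea can be illustrated generically: propagate the in-context example's mask to the query by nearest-neighbor matching of frozen VFM features. This is our own sketch of the general technique, not SEGIC's actual architecture.

```python
import numpy as np

def propagate_mask(example_feats, example_mask, query_feats):
    """Label each query pixel with the mask value of its nearest
    example pixel in feature space (cosine similarity).

    example_feats, query_feats: (H, W, D) dense features from a
    frozen vision foundation model; example_mask: (H, W) binary.
    """
    h, w, d = query_feats.shape
    e = example_feats.reshape(-1, d)
    q = query_feats.reshape(-1, d)
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    nn = (q @ e.T).argmax(axis=1)      # nearest example pixel per query pixel
    return example_mask.reshape(-1)[nn].reshape(h, w)

# Toy call; real features would come from, e.g., a ViT backbone.
rng = np.random.default_rng(0)
pred = propagate_mask(rng.normal(size=(14, 14, 64)),
                      rng.integers(0, 2, size=(14, 14)),
                      rng.normal(size=(14, 14, 64)))
```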
arXiv Detail & Related papers (2023-11-24T18:59:42Z)
- Hierarchical Open-vocabulary Universal Image Segmentation [48.008887320870244]
Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions.
We propose a decoupled text-image fusion mechanism and representation learning modules for both "things" and "stuff".
Our resulting model, named HIPIE, tackles HIerarchical, oPen-vocabulary, and unIvErsal segmentation tasks within a unified framework.
arXiv Detail & Related papers (2023-07-03T06:02:15Z)
- Segment Everything Everywhere All at Once [124.90835636901096]
We present SEEM, a promptable and interactive model for segmenting everything everywhere all at once in an image.
We propose a novel decoding mechanism that enables diverse prompting for all types of segmentation tasks.
We conduct a comprehensive empirical study to validate the effectiveness of SEEM across diverse segmentation tasks.
arXiv Detail & Related papers (2023-04-13T17:59:40Z)
- Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task [16.155438404910043]
We propose a single encoder-decoder neural network that can handle long documents and conversations.
We successfully show a way to solve the combined task as a pure generation task.
Our results establish a strong case for considering text segmentation and segment labeling as a whole.
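A toy illustration of casting joint segmentation and labeling as pure generation: serialize segment boundaries and labels into one target string for an encoder-decoder to emit. The `<seg label=...>` format here is our own invention, not the paper's.

```python
def to_generation_target(sentences, boundaries, labels):
    """Serialize segmentation + labeling into one target string.

    boundaries: indices where each segment starts; labels: one label
    per segment. The model learns to emit this string from raw text.
    """
    parts, start = [], 0
    for end, label in zip(boundaries[1:] + [len(sentences)], labels):
        parts.append(f"<seg label={label!r}> " + " ".join(sentences[start:end]))
        start = end
    return " ".join(parts)

doc = ["Intro sentence.", "More intro.", "Method sentence."]
print(to_generation_target(doc, [0, 2], ["introduction", "method"]))
# <seg label='introduction'> Intro sentence. More intro. <seg label='method'> Method sentence.
```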
arXiv Detail & Related papers (2022-09-28T01:08:50Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without requiring any dense annotations.
On three benchmark datasets, our method directly segments objects of arbitrary categories and outperforms zero-shot segmentation methods that require data labeling.
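Assuming CLIP-style pixel and text embeddings in a shared space (the paper's exact pipeline may differ), a minimal annotation-free segmentation step is a per-pixel cosine-similarity map against a category's text embedding, thresholded into a mask:

```python
import numpy as np

def text_driven_mask(pixel_embs, text_emb, threshold=0.5):
    """Segment pixels whose embedding aligns with a category's text
    embedding. pixel_embs: (H, W, D); text_emb: (D,)."""
    p = pixel_embs / np.linalg.norm(pixel_embs, axis=-1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb)
    sim = p @ t                        # (H, W) cosine-similarity map
    return sim > threshold

# Toy usage; real embeddings would come from a vision-language model.
mask = text_driven_mask(np.random.rand(16, 16, 512), np.random.rand(512))
```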
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- Learning Panoptic Segmentation from Instance Contours [9.347742071428918]
Panoptic segmentation aims to provide an understanding of background (stuff) and instances of objects (things) at a pixel level.
It combines the separate tasks of semantic segmentation (pixel-level classification) and instance segmentation to build a single unified scene understanding task.
We present a fully convolutional neural network that learns instance segmentation from semantic segmentation and instance contours.
arXiv Detail & Related papers (2020-10-16T03:05:48Z)
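A standard way to recover instances from the two predicted outputs named above, semantic masks and instance contours, is to erase boundary pixels from each "thing" mask and take connected components. The sketch below is our own helper built on scipy, illustrating the general recipe rather than the paper's exact post-processing.

```python
import numpy as np
from scipy import ndimage

def instances_from_contours(semantic, contours, thing_classes):
    """Split 'thing' semantic masks into instances.

    semantic: (H, W) class IDs; contours: (H, W) boolean boundary map.
    Removing boundary pixels disconnects touching objects, so each
    connected component of the remainder is one instance.
    """
    instance_map = np.zeros(semantic.shape, dtype=np.int32)
    next_id = 1
    for cls in thing_classes:
        interior = (semantic == cls) & ~contours
        comps, n = ndimage.label(interior)
        instance_map[comps > 0] = comps[comps > 0] + (next_id - 1)
        next_id += n
    return instance_map

# Toy usage: one contour row splits class 1 into two instances.
sem = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 1]])
edge = np.zeros_like(sem, dtype=bool)
edge[1, :2] = True
print(instances_from_contours(sem, edge, thing_classes=[1]))
```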