Open-World Entity Segmentation
- URL: http://arxiv.org/abs/2107.14228v1
- Date: Thu, 29 Jul 2021 17:59:05 GMT
- Title: Open-World Entity Segmentation
- Authors: Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin,
Philip Torr, Jiaya Jia
- Abstract summary: We introduce a new image segmentation task, termed Entity Segmentation (ES), with the aim to segment all visual entities in an image without considering semantic category labels.
All semantically-meaningful segments are equally treated as categoryless entities and there is no thing-stuff distinction.
ES enables the following: (1) merging multiple datasets to form a large training set without the need to resolve label conflicts; (2) any model trained on one dataset can generalize exceptionally well to other datasets with unseen domains.
- Score: 70.41548013910402
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce a new image segmentation task, termed Entity Segmentation (ES)
with the aim to segment all visual entities in an image without considering
semantic category labels. It has many practical applications in image
manipulation/editing where the segmentation mask quality is typically crucial
but category labels are less important. In this setting, all
semantically-meaningful segments are equally treated as categoryless entities
and there is no thing-stuff distinction. Based on our unified entity
representation, we propose a center-based entity segmentation framework with
two novel modules to improve mask quality. Experimentally, both our new task
and framework demonstrate clear advantages over existing work. In
particular, ES enables the following: (1) merging multiple datasets to form a
large training set without the need to resolve label conflicts; (2) any model
trained on one dataset can generalize exceptionally well to other datasets with
unseen domains. Our code is made publicly available at
https://github.com/dvlab-research/Entity.
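The dataset-merging benefit described above can be made concrete with a minimal sketch: annotations from different datasets, each with its own (possibly conflicting) category vocabulary, are reduced to categoryless binary entity masks, so no label reconciliation is needed. All names below are illustrative, not taken from the paper's codebase.

```python
import numpy as np

def to_entity_masks(segment_map: np.ndarray) -> list:
    """Turn a 2D map of per-pixel segment ids into binary masks, one per entity.

    Category labels are simply ignored: every segment becomes a
    categoryless entity, with no thing-stuff distinction.
    """
    masks = []
    for seg_id in np.unique(segment_map):
        if seg_id < 0:  # convention here: negative ids mark unlabeled pixels
            continue
        masks.append(segment_map == seg_id)
    return masks

# Two toy "datasets" with incompatible label spaces can now be merged freely:
ann_a = np.array([[0, 0, 1], [0, 2, 1]])   # ids from one dataset's vocabulary
ann_b = np.array([[5, 5, 5], [7, 7, 5]])   # ids from another, unrelated vocabulary
training_set = to_entity_masks(ann_a) + to_entity_masks(ann_b)
print(len(training_set))  # 5 categoryless entity masks, no conflicts to resolve
```

The point of the sketch is only that, once categories are dropped, segments from any source are interchangeable training targets.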
Related papers
- USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation [33.11010205890195]
The main challenge in open-vocabulary image segmentation now lies in accurately classifying these segments into text-defined categories.
We introduce the Universal Segment Embedding (USE) framework to address this challenge.
This framework is comprised of two key components: 1) a data pipeline designed to efficiently curate a large amount of segment-text pairs at various granularities, and 2) a universal segment embedding model that enables precise segment classification into a vast range of text-defined categories.
arXiv Detail & Related papers (2024-06-07T21:41:18Z)
- Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks via leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z)
- ISLE: A Framework for Image Level Semantic Segmentation Ensemble [5.137284292672375]
Conventional semantic segmentation networks require massive pixel-wise annotated labels to reach state-of-the-art prediction quality.
We propose ISLE, which employs a class-wise ensemble of the "pseudo-labels" produced by a given set of different semantic segmentation techniques.
We reach up to 2.4% improvement over ISLE's individual components.
arXiv Detail & Related papers (2023-03-14T13:36:36Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotation effort.
Our method can directly segment objects of arbitrary categories, outperforming, on three benchmark datasets, zero-shot segmentation methods that require data labeling.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on-par on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
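The sentence-embedding idea from "Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings" can be sketched as follows: each class label is replaced by an embedding of a short textual description, and every pixel is assigned to the class whose description embedding is most similar to the pixel's feature vector. The random vectors below are stand-ins for a real text/vision encoder; dimensions and class names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # embedding dimension (illustrative)

# One vector per class, e.g. from encoding "a paved surface for vehicles", etc.
class_embeds = rng.normal(size=(3, D))  # 3 classes: road, sky, tree
class_embeds /= np.linalg.norm(class_embeds, axis=1, keepdims=True)

# Dense per-pixel features from a segmentation backbone (H=2, W=2 here).
pixel_feats = rng.normal(size=(2, 2, D))
pixel_feats /= np.linalg.norm(pixel_feats, axis=-1, keepdims=True)

# Cosine similarity between every pixel and every class description;
# argmax over classes yields a zero-shot segmentation map.
sims = pixel_feats @ class_embeds.T  # shape (2, 2, 3)
seg_map = sims.argmax(axis=-1)       # shape (2, 2), values in {0, 1, 2}
print(seg_map.shape)
```

Because classes are represented by text embeddings rather than fixed indices, new or merged label vocabularies only require encoding new descriptions, which is what makes the multi-dataset and zero-shot settings tractable.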
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.