Zero-Shot Instance Segmentation
- URL: http://arxiv.org/abs/2104.06601v1
- Date: Wed, 14 Apr 2021 03:02:48 GMT
- Title: Zero-Shot Instance Segmentation
- Authors: Ye Zheng, Jiahong Wu, Yongqiang Qin, Faen Zhang, Li Cui
- Abstract summary: We propose a new task set named zero-shot instance segmentation (ZSI).
In the training phase, the model is trained with seen data, while in the testing phase, it is used to segment all seen and unseen instances.
We present a new benchmark for zero-shot instance segmentation based on the MS-COCO dataset.
- Score: 4.457471295379149
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has significantly improved the precision of instance
segmentation with abundant labeled data. However, in many areas like medical
and manufacturing, collecting sufficient data is extremely hard and labeling
this data requires high professional skills. We follow this motivation and
propose a new task set named zero-shot instance segmentation (ZSI). In the
training phase of ZSI, the model is trained with seen data, while in the
testing phase, it is used to segment all seen and unseen instances. We first
formulate the ZSI task and propose a method to tackle the challenge, which
consists of Zero-shot Detector, Semantic Mask Head, Background Aware RPN and
Synchronized Background Strategy. We present a new benchmark for zero-shot
instance segmentation based on the MS-COCO dataset. The extensive empirical
results in this benchmark show that our method not only surpasses the
state-of-the-art results in zero-shot object detection task but also achieves
promising performance on ZSI. Our approach will serve as a solid baseline and
facilitate future research in zero-shot instance segmentation.
Related papers
- ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction [57.930531826380836]
This work explores whether a foundational segmentation model can address label scarcity in the pixel-level vision task as an annotator for unlabeled images.
We propose ConformalSAM, a novel SSSS framework which first calibrates the foundation model using the target domain's labeled data and then filters out unreliable pixel labels of unlabeled data.
arXiv Detail & Related papers (2025-07-21T17:02:57Z)
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark [18.636210870172675]
Few-shot semantic segmentation can encourage deep learning models to learn from few labelled examples for novel classes not seen during the training.
The generalized few-shot segmentation setting has an additional challenge which encourages models not only to adapt to the novel classes but also to maintain strong performance on the training base classes.
We release the dataset augmenting OpenEarthMap with additional classes labelled for the generalized few-shot evaluation setting.
arXiv Detail & Related papers (2024-09-17T14:20:47Z)
- Exploring Open-Vocabulary Semantic Segmentation without Human Labels [76.15862573035565]
We present ZeroSeg, a novel method that leverages the existing pretrained vision-language model (VL) to train semantic segmentation models.
ZeroSeg overcomes this by distilling the visual concepts learned by VL models into a set of segment tokens, each summarizing a localized region of the target image.
Our approach achieves state-of-the-art performance when compared to other zero-shot segmentation methods under the same training data.
arXiv Detail & Related papers (2023-06-01T08:47:06Z)
- ElC-OIS: Ellipsoidal Clustering for Open-World Instance Segmentation on LiDAR Data [13.978966783993146]
Open-world Instance Segmentation (OIS) is a challenging task that aims to accurately segment every object instance appearing in the current observation.
This is important for safety-critical applications such as robust autonomous navigation.
We present a flexible and effective OIS framework for LiDAR point cloud that can accurately segment both known and unknown instances.
arXiv Detail & Related papers (2023-03-08T03:22:11Z)
- Active Pointly-Supervised Instance Segmentation [106.38955769817747]
We present an economic active learning setting, named active pointly-supervised instance segmentation (APIS).
APIS starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
The model developed with these strategies yields consistent performance gain on the challenging MS-COCO dataset.
arXiv Detail & Related papers (2022-07-23T11:25:24Z)
- Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z)
- The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation [93.17367076148348]
We investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.