Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot
Instance Segmentation
- URL: http://arxiv.org/abs/2305.13173v1
- Date: Mon, 22 May 2023 16:00:01 GMT
- Title: Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot
Instance Segmentation
- Authors: Shuting He, Henghui Ding, Wei Jiang
- Abstract summary: Zero-shot instance segmentation aims to detect and precisely segment objects of unseen categories without any training samples.
We propose D$^2$Zero with Semantic-Promoted Debiasing and Background Disambiguation.
Background disambiguation produces image-adaptive background representation to avoid mistaking novel objects for background.
- Score: 13.001629605405954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot instance segmentation aims to detect and precisely segment objects
of unseen categories without any training samples. Since the model is trained
on seen categories, there is a strong bias that the model tends to classify all
the objects into seen categories. Besides, there is a natural confusion between
background and novel objects that have never shown up in training. These two
challenges make novel objects hard to be raised in the final instance
segmentation results. It is desired to rescue novel objects from background and
dominated seen categories. To this end, we propose D$^2$Zero with
Semantic-Promoted Debiasing and Background Disambiguation to enhance the
performance of Zero-shot instance segmentation. Semantic-promoted debiasing
utilizes inter-class semantic relationships to involve unseen categories in
visual feature training and learns an input-conditional classifier to conduct
dynamical classification based on the input image. Background disambiguation
produces image-adaptive background representation to avoid mistaking novel
objects for background. Extensive experiments show that we significantly
outperform previous state-of-the-art methods by a large margin, e.g., 16.86%
improvement on COCO. Project page: https://henghuiding.github.io/D2Zero/
Related papers
- PDiscoNet: Semantically consistent part discovery for fine-grained
recognition [62.12602920807109]
We propose PDiscoNet to discover object parts by using only image-level class labels along with priors encouraging the parts to be.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z)
- Reflection Invariance Learning for Few-shot Semantic Segmentation [53.20466630330429]
Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.
This paper proposes a fresh few-shot segmentation framework to mine the reflection invariance in a multi-view matching manner.
Experiments on both PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-01T15:14:58Z)
- RaSP: Relation-aware Semantic Prior for Weakly Supervised Incremental
Segmentation [28.02204928717511]
We propose a weakly supervised approach to transfer objectness prior from the previously learned classes into the new ones.
We show how even a simple pairwise interaction between classes can significantly improve the segmentation mask quality of both old and new classes.
arXiv Detail & Related papers (2023-05-31T14:14:21Z)
- Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic
Segmentation [40.09476732999614]
Mask proposal models have significantly improved the performance of zero-shot semantic segmentation.
The use of a "background" embedding during training in these methods is problematic, as the resulting model tends to over-learn and assign all unseen classes to the background class instead of their correct labels.
This paper proposes novel class enhancement losses to bypass the use of the background embedding during training, and simultaneously exploit the semantic relationship between text embeddings and mask proposals by ranking the similarity scores.
arXiv Detail & Related papers (2023-01-18T06:55:02Z)
- Foreground-Background Separation through Concept Distillation from
Generative Image Foundation Models [6.408114351192012]
We present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions.
We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis.
arXiv Detail & Related papers (2022-12-29T13:51:54Z)
- Learning to Detect Every Thing in an Open World [139.78830329914135]
We propose a simple yet surprisingly powerful data augmentation and training scheme we call Learning to Detect Every Thing (LDET).
To avoid suppressing hidden objects, background objects that are visible but unlabeled, we paste annotated objects on a background image sampled from a small region of the original image.
LDET leads to significant improvements on many datasets in the open world instance segmentation task.
arXiv Detail & Related papers (2021-12-03T03:56:06Z)
- Explicitly Modeling the Discriminability for Instance-Aware Visual
Object Tracking [13.311777431243296]
We propose a novel Instance-Aware Tracker (IAT) to excavate the discriminability of feature representations.
We implement two variants of the proposed IAT, including a video-level one and an object-level one.
Both versions achieve leading results against state-of-the-art methods while running at 30FPS.
arXiv Detail & Related papers (2021-10-28T11:24:01Z)
- Rectifying the Shortcut Learning of Background: Shared Object
Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both pretraining and evaluation stage.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Closing the Generalization Gap in One-Shot Object Detection [92.82028853413516]
We show that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories.
Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories.
arXiv Detail & Related papers (2020-11-09T09:31:17Z)
- Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
- Demystifying Contrastive Self-Supervised Learning: Invariances,
Augmentations and Dataset Biases [34.02639091680309]
Recent gains in performance come from training instance classification models, treating each image and its augmented versions as samples of a single class.
We demonstrate that approaches like MOCO and PIRL learn occlusion-invariant representations.
Second, we demonstrate that these approaches obtain further gains from access to a clean object-centric training dataset like ImageNet.
arXiv Detail & Related papers (2020-07-28T00:11:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.