ContrastMask: Contrastive Learning to Segment Every Thing
- URL: http://arxiv.org/abs/2203.09775v1
- Date: Fri, 18 Mar 2022 07:41:48 GMT
- Title: ContrastMask: Contrastive Learning to Segment Every Thing
- Authors: Xuehui Wang, Kai Zhao, Ruixin Zhang, Shouhong Ding, Yan Wang, Wei Shen
- Abstract summary: We propose ContrastMask, which learns a mask segmentation model on both seen and unseen categories.
Features from the mask regions (foreground) are pulled together and contrasted against those from the background, and vice versa.
Exhaustive experiments on the COCO dataset demonstrate the superiority of our method.
- Score: 18.265503138997794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Partially-supervised instance segmentation is the task of segmenting
objects from novel, unseen categories by learning from a limited set of seen
categories with annotated masks, thus alleviating the heavy annotation burden.
The key to addressing this task is to build an effective class-agnostic
mask segmentation model. Unlike previous methods that learn such models only on
seen categories, in this paper, we propose a new method, named ContrastMask,
which learns a mask segmentation model on both seen and unseen categories under
a unified pixel-level contrastive learning framework. In this framework,
annotated masks of seen categories and pseudo masks of unseen categories serve
as a prior for contrastive learning, where features from the mask regions
(foreground) are pulled together and contrasted against those from the
background, and vice versa. Through this framework, feature discrimination
between foreground and background is largely improved, facilitating learning of
the class-agnostic mask segmentation model. Exhaustive experiments on the COCO
dataset demonstrate the superiority of our method, which outperforms previous
state-of-the-art methods.
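A concrete reading of the pixel-level contrastive framework described above, as a minimal sketch: each pixel feature is pulled toward the mean feature of its own region (foreground or background) and contrasted against the mean of the other region. This is not the authors' exact query/key design; the function name, the temperature value, and the mean-pooled region queries are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pixel_contrast_loss(feats, mask, temperature=0.1):
    """Sketch of a pixel-level foreground/background contrastive loss.

    feats: (C, H, W) feature map from the mask branch.
    mask:  (H, W) binary mask, 1 = foreground. For seen categories this would
           be the annotated mask; for unseen categories a pseudo mask serves
           as the prior (illustrative assumption, not the paper's exact setup).
    """
    c = feats.shape[0]
    feats = F.normalize(feats.reshape(c, -1), dim=0)      # (C, H*W), unit-norm pixel features
    fg = mask.reshape(-1).bool()
    if fg.all() or not fg.any():                          # need both regions to contrast
        return feats.new_zeros(())

    # Region "queries": mean foreground / background features.
    q_fg = F.normalize(feats[:, fg].mean(dim=1), dim=0)   # (C,)
    q_bg = F.normalize(feats[:, ~fg].mean(dim=1), dim=0)  # (C,)

    # Each pixel scores against both queries; its own region is the positive.
    logits = torch.stack([feats.T @ q_fg, feats.T @ q_bg], dim=1) / temperature
    target = (~fg).long()                                 # foreground pixels -> 0, background -> 1
    return F.cross_entropy(logits, target)
```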
Related papers
- Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation [38.55611683982936]
We introduce a novel class-wise masked image modeling that independently reconstructs different image regions according to their respective classes.
We develop a feature aggregation strategy that minimizes the distances between features corresponding to the masked and visible parts within the same class.
We also explore applying masked image modeling in the semantic space to enhance regularization.
arXiv Detail & Related papers (2024-11-13T16:42:07Z)
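The feature aggregation strategy in the entry above admits a short sketch: per class, the centroid of features from masked-out pixels is pulled toward the centroid of features from visible pixels. The helper name and tensor layout are illustrative assumptions, not the paper's implementation.

```python
import torch

def class_wise_aggregation_loss(feats, labels, visible):
    """Sketch: pull masked-region features toward visible-region features per class.

    feats:   (C, H, W) feature map.
    labels:  (H, W) integer class map.
    visible: (H, W) bool, True where the image was NOT masked out.
    """
    c = feats.shape[0]
    flat = feats.reshape(c, -1)
    labels, visible = labels.reshape(-1), visible.reshape(-1)
    loss, count = flat.new_zeros(()), 0
    for cls in labels.unique():
        vis = (labels == cls) & visible
        hid = (labels == cls) & ~visible
        if vis.any() and hid.any():
            # Squared distance between the visible and masked class centroids.
            loss = loss + (flat[:, vis].mean(1) - flat[:, hid].mean(1)).pow(2).sum()
            count += 1
    return loss / max(count, 1)
```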
- Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations [86.47908754383198]
Open-Vocabulary (OV) methods leverage large-scale image-caption pairs and vision-language models to learn novel categories.
Our method generates pseudo-mask annotations by leveraging the localization ability of a pre-trained vision-language model for objects present in image-caption pairs.
Trained with pseudo-masks alone, our method significantly improves mAP on the MS-COCO and OpenImages datasets.
arXiv Detail & Related papers (2023-03-29T17:58:39Z)
- Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation.
Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z)
- What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
- Per-Pixel Classification is Not All You Need for Semantic Segmentation [184.2905747595058]
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks.
We propose MaskFormer, a simple mask classification model which predicts a set of binary masks.
Our method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
arXiv Detail & Related papers (2021-07-13T17:59:50Z)
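The mask-classification recipe in the MaskFormer entry above can be illustrated with a minimal semantic-inference sketch: per-query class probabilities weight per-query binary masks, and each pixel takes the highest-scoring class. Shapes and names are illustrative; this follows the general mask-classification recipe rather than the paper's exact code.

```python
import torch

def semantic_inference(class_logits, mask_logits):
    """Combine N query predictions into a per-pixel semantic map.

    class_logits: (N, K+1) scores over K classes plus a "no object" slot.
    mask_logits:  (N, H, W) one binary-mask logit map per query.
    """
    cls_prob = class_logits.softmax(dim=-1)[:, :-1]   # (N, K), drop "no object"
    mask_prob = mask_logits.sigmoid()                 # (N, H, W)
    # Per-pixel class scores: masks weighted by their class probabilities.
    seg = torch.einsum("nk,nhw->khw", cls_prob, mask_prob)
    return seg.argmax(dim=0)                          # (H, W) predicted class ids
```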
- Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability [5.387323728379395]
Saliency maps that identify the most informative regions of an image are valuable for model interpretability.
A common approach to creating saliency maps involves generating input masks that mask out portions of an image.
We show that a masking model can be trained with as few as 10 examples per class and still generate saliency maps with only a 0.7-point increase in localization error.
arXiv Detail & Related papers (2020-10-19T18:00:36Z)
- Causal Intervention for Weakly-Supervised Semantic Segmentation [122.1846968696862]
We aim to generate better pixel-level pseudo-masks by using only image-level labels.
We propose a structural causal model to analyze the causalities among images, contexts, and class labels.
Based on it, we develop a new method, Context Adjustment (CONTA), to remove the confounding bias in image-level classification.
arXiv Detail & Related papers (2020-09-26T09:26:29Z)
- Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation [71.59275788106622]
We propose to learn the underlying class-agnostic commonalities that can be generalized from mask-annotated categories to novel categories.
Our model significantly outperforms state-of-the-art methods in both the partially-supervised and few-shot settings for instance segmentation on the COCO dataset.
arXiv Detail & Related papers (2020-07-24T07:23:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.