The surprising impact of mask-head architecture on novel class
segmentation
- URL: http://arxiv.org/abs/2104.00613v1
- Date: Thu, 1 Apr 2021 16:46:37 GMT
- Title: The surprising impact of mask-head architecture on novel class
segmentation
- Authors: Vighnesh Birodkar, Zhichao Lu, Siyang Li, Vivek Rathod, Jonathan Huang
- Abstract summary: We show that the architecture of the mask-head plays a surprisingly important role in generalization to classes for which we do not observe masks during training.
We also show that the choice of mask-head architecture alone can lead to SOTA results on the partially supervised COCO benchmark without the need of specialty modules or losses proposed by prior literature.
- Score: 27.076315496682444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instance segmentation models today are very accurate when trained on large
annotated datasets, but collecting mask annotations at scale is prohibitively
expensive. We address the partially supervised instance segmentation problem in
which one can train on (significantly cheaper) bounding boxes for all
categories but use masks only for a subset of categories. In this work, we
focus on a popular family of models which apply differentiable cropping to a
feature map and predict a mask based on the resulting crop. Within this family,
we show that the architecture of the mask-head plays a surprisingly important
role in generalization to classes for which we do not observe masks during
training. While many architectures perform similarly when trained in fully
supervised mode, we show that they often generalize to novel classes in
dramatically different ways. We call this phenomenon the strong mask
generalization effect, which we exploit by replacing the typical mask-head of
2-4 layers with significantly deeper off-the-shelf architectures (e.g. ResNet,
Hourglass models). We also show that the choice of mask-head architecture alone
can lead to SOTA results on the partially supervised COCO benchmark without the
need of specialty modules or losses proposed by prior literature. Finally, we
demonstrate that our effect is general, holding across underlying detection
methodologies (e.g. anchor-based, anchor-free, or no detector at all)
and across different backbone networks. Code and pre-trained models are
available at https://git.io/deepmac.
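To make the crop-then-predict setup concrete, the sketch below is a minimal, hypothetical PyTorch illustration of the family of models described above: per-box features are cropped from a backbone feature map with RoIAlign and passed to a deep, ResNet-style, class-agnostic mask-head instead of the usual 2-4 convolution layers. It is not the authors' implementation (their code at https://git.io/deepmac is the reference); the block count, channel widths, and crop size here are assumptions chosen only for illustration.
```python
# Illustrative sketch of a crop-then-predict mask pipeline with a deep,
# ResNet-style mask-head (hyperparameters are assumptions, not the paper's).
import torch
import torch.nn as nn
from torchvision.ops import roi_align


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))


class DeepMaskHead(nn.Module):
    """Class-agnostic mask-head: a deep stack of residual blocks, then a
    2x upsampling layer and a single-channel mask-logit prediction."""
    def __init__(self, channels: int = 256, num_blocks: int = 20):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.upsample = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        self.predict = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, cropped_features):
        x = self.blocks(cropped_features)
        return self.predict(torch.relu(self.upsample(x)))


# Usage: crop per-box features from a backbone feature map, then predict masks.
features = torch.randn(1, 256, 64, 64)            # [N, C, H, W] backbone feature map
boxes = [torch.tensor([[8.0, 8.0, 40.0, 48.0]])]  # per-image [x1, y1, x2, y2] boxes
crops = roi_align(features, boxes, output_size=(32, 32), aligned=True)
mask_logits = DeepMaskHead()(crops)               # [num_boxes, 1, 64, 64] mask logits
```
The point of the paper is that swapping the shallow head for a much deeper one (as in `DeepMaskHead` above) is what drives generalization to classes without mask supervision.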
Related papers
- MaskUno: Switch-Split Block For Enhancing Instance Segmentation [0.0]
We propose replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors.
An increase in the mean Average Precision (mAP) of 2.03% was observed for the high-performing DetectoRS when trained on 80 classes.
arXiv Detail & Related papers (2024-07-31T10:12:14Z)
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
- Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations [86.47908754383198]
Open-Vocabulary (OV) methods leverage large-scale image-caption pairs and vision-language models to learn novel categories.
Our method generates pseudo-mask annotations by leveraging the localization ability of a pre-trained vision-language model for objects present in image-caption pairs.
Our method trained with just pseudo-masks significantly improves the mAP scores on the MS-COCO dataset and OpenImages dataset.
arXiv Detail & Related papers (2023-03-29T17:58:39Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
- Gait Recognition with Mask-based Regularization [31.901166591272464]
We propose a novel mask-based regularization method named ReverseMask.
By injecting masks into the feature map, the proposed regularization method helps convolutional architectures learn discriminative representations.
The plug-and-play, Inception-like ReverseMask block is simple and effective at improving network generalization.
arXiv Detail & Related papers (2022-03-08T12:13:29Z)
- Scaling up instance annotation via label propagation [69.8001043244044]
We propose a highly efficient annotation scheme for building large datasets with object segmentation masks.
We exploit similarities among object instances by using hierarchical clustering on mask predictions made by a segmentation model.
We show that we obtain 1M object segmentation masks with a total annotation time of only 290 hours.
arXiv Detail & Related papers (2021-10-05T18:29:34Z)
- Per-Pixel Classification is Not All You Need for Semantic Segmentation [184.2905747595058]
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks.
We propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each paired with a single class prediction (see the illustrative sketch after this list).
Our method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
arXiv Detail & Related papers (2021-07-13T17:59:50Z)
- Prior to Segment: Foreground Cues for Weakly Annotated Classes in Partially Supervised Instance Segmentation [3.192503074844774]
Partially supervised instance segmentation aims to improve mask prediction with limited mask labels by utilizing the more abundant weak box labels.
We show that a class agnostic mask head, commonly used in partially supervised instance segmentation, has difficulties learning a general concept of foreground for the weakly annotated classes.
We introduce an object mask prior (OMP) that provides the mask head with the general concept of foreground implicitly learned by the box classification head under the supervision of all classes.
arXiv Detail & Related papers (2020-11-23T23:15:06Z)
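As a complement to the MaskFormer entry above, the snippet below shows, under assumed tensor shapes and query counts, how a mask-classification output can be decoded into a semantic segmentation map: each query contributes a class distribution and a binary mask, and per-pixel class scores are obtained by marginalising over queries. This is an illustrative sketch, not code from the MaskFormer authors.
```python
# Illustrative mask-classification decoding (MaskFormer-style):
# each query predicts a class distribution and a binary mask; per-pixel
# semantic scores are the query-marginalised class probabilities.
# Shapes, query count, and class count are assumptions for illustration.
import torch

num_queries, num_classes, H, W = 100, 150, 128, 128

class_logits = torch.randn(num_queries, num_classes + 1)  # +1 = "no object"
mask_logits = torch.randn(num_queries, H, W)              # one mask per query

class_probs = class_logits.softmax(dim=-1)[:, :-1]        # drop "no object"
mask_probs = mask_logits.sigmoid()

# semantic_seg[k, h, w] = sum_q p_q(k) * m_q(h, w)
semantic_seg = torch.einsum("qk,qhw->khw", class_probs, mask_probs)
prediction = semantic_seg.argmax(dim=0)                   # [H, W] class map
```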
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.