Fully Convolutional Networks for Automatically Generating Image Masks to
Train Mask R-CNN
- URL: http://arxiv.org/abs/2003.01383v2
- Date: Thu, 20 May 2021 06:53:45 GMT
- Title: Fully Convolutional Networks for Automatically Generating Image Masks to
Train Mask R-CNN
- Authors: Hao Wu, Jan Paul Siebert and Xiangrong Xu
- Abstract summary: Mask R-CNN achieves the best results in object detection to date; however, obtaining the object masks needed for training is time-consuming and laborious.
This paper proposes a novel method for automatically generating image masks for the state-of-the-art Mask R-CNN deep learning method.
Our proposed method obtains image masks automatically to train Mask R-CNN and achieves high classification accuracy, with a mean average precision (mAP) of over 90% for segmentation.
- Score: 4.901462756978097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel method for automatically generating image
masks for the state-of-the-art Mask R-CNN deep learning method. Mask R-CNN
achieves the best results in object detection to date; however, obtaining the
object masks needed for training is time-consuming and laborious. The proposed
method therefore uses a two-stage design to generate image masks automatically:
the first stage implements a fully convolutional network (FCN) based
segmentation network, and the second stage is a Mask R-CNN based object
detection network, which is trained on the object image masks output by the
FCN, the original input image, and additional label information. Through
experimentation, we show that our proposed method can obtain image masks
automatically to train Mask R-CNN, and that it achieves high classification
accuracy, with a mean average precision (mAP) of over 90% for segmentation.
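To make the two-stage design concrete, here is a minimal sketch of how FCN-generated masks could feed Mask R-CNN training, assuming off-the-shelf torchvision models (fcn_resnet50, maskrcnn_resnet50_fpn); the class handling, box construction, and target layout are illustrative assumptions, not the authors' implementation.

```python
# Illustrative two-stage pipeline (not the authors' code): an FCN generates
# masks, which then serve as training targets for Mask R-CNN.
import torch
from torchvision.models.segmentation import fcn_resnet50
from torchvision.models.detection import maskrcnn_resnet50_fpn

fcn = fcn_resnet50(weights="DEFAULT").eval()      # stage 1: mask generator
mask_rcnn = maskrcnn_resnet50_fpn(num_classes=2)  # stage 2: detector to train

def fcn_masks(image):
    """Turn the FCN's per-pixel class scores into one binary mask per class."""
    with torch.no_grad():
        logits = fcn(image.unsqueeze(0))["out"][0]             # (C, H, W)
    labels = logits.argmax(0)
    classes = [c for c in labels.unique().tolist() if c != 0]  # skip background
    return [(labels == c).to(torch.uint8) for c in classes]

def make_target(image, class_id):
    """Build a Mask R-CNN training target from FCN masks plus label info."""
    masks = fcn_masks(image)
    boxes = [[xs.min().item(), ys.min().item(),
              xs.max().item() + 1, ys.max().item() + 1]        # tight box per mask
             for ys, xs in (torch.nonzero(m, as_tuple=True) for m in masks)]
    return {"boxes": torch.tensor(boxes, dtype=torch.float32),
            "labels": torch.full((len(masks),), class_id, dtype=torch.int64),
            "masks": torch.stack(masks)}

# Training step: Mask R-CNN consumes the automatically generated targets.
# loss_dict = mask_rcnn(images, [make_target(im, c)
#                                for im, c in zip(images, class_ids)])
```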
Related papers
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
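A toy sketch of the noise-filtering idea behind ColorMAE, assuming a Gaussian low-pass filter and a 75% mask ratio; both choices are assumptions for illustration, not the paper's exact "color" filters.

```python
# Toy "colored noise" masking: low-pass filter random noise, then threshold
# so that an exact fraction of patches is masked out.
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_filtered_mask(grid=14, mask_ratio=0.75, sigma=1.5, seed=0):
    """Boolean (grid, grid) patch mask: True = patch is masked out."""
    noise = np.random.default_rng(seed).standard_normal((grid, grid))
    smooth = gaussian_filter(noise, sigma=sigma)   # spatially correlated noise
    k = int(mask_ratio * grid * grid)              # number of masked patches
    threshold = np.sort(smooth.ravel())[k - 1]
    return smooth <= threshold                     # lowest-k values are masked

print(noise_filtered_mask().mean())  # ~0.75 of patches masked
```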
- Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation [68.16510297109872]
Point-based interactive image segmentation can ease the burden of mask annotation in applications such as semantic segmentation and image editing.
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement, to enhance segmentation quality with fewer user inputs.
Experiments on GrabCut, Berkeley, SBD, and DAVIS datasets demonstrate our method's state-of-the-art performance in interactive image segmentation.
arXiv Detail & Related papers (2023-12-22T02:31:31Z)
- DFormer: Diffusion-guided Transformer for Universal Image Segmentation [86.73405604947459]
The proposed DFormer views the universal image segmentation task as a denoising process using a diffusion model.
At inference, our DFormer directly predicts the masks and corresponding categories from a set of randomly generated masks.
Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on the MS COCO val2017 set.
arXiv Detail & Related papers (2023-06-06T06:33:32Z)
- DynaMask: Dynamic Mask Selection for Instance Segmentation [21.50329070835023]
We develop a Mask Switch Module (MSM) with negligible computational cost to select the most suitable mask resolution for each instance.
The proposed method, namely DynaMask, brings consistent and noticeable performance improvements over other state-of-the-art methods at a moderate computational overhead.
arXiv Detail & Related papers (2023-03-14T13:01:25Z)
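A hedged sketch of what a mask-switch-style selector could look like, assuming a tiny pooled-feature scoring head and four candidate resolutions; DynaMask's actual module differs in detail.

```python
# Hypothetical mask-switch head: score candidate mask resolutions per
# instance from pooled RoI features. Sizes and resolutions are assumptions.
import torch
import torch.nn as nn

class MaskSwitch(nn.Module):
    def __init__(self, in_dim=256, resolutions=(14, 28, 56, 112)):
        super().__init__()
        self.resolutions = resolutions
        self.score = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                   nn.Linear(in_dim, len(resolutions)))

    def forward(self, roi_feats):                  # (N, C, H, W) RoI features
        idx = self.score(roi_feats).argmax(dim=1)  # one choice per instance
        return [self.resolutions[i] for i in idx.tolist()]

switch = MaskSwitch()
print(switch(torch.randn(3, 256, 14, 14)))        # e.g. [28, 14, 112]
```

Hard argmax reflects inference-time behavior; during training the real module would need a differentiable (soft) selection.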
- MP-Former: Mask-Piloted Transformer for Image Segmentation [16.620469868310288]
Mask2Former suffers from inconsistent mask predictions between decoder layers.
We propose a mask-piloted training approach, which feeds noised ground-truth masks into masked attention and trains the model to reconstruct the original ones.
arXiv Detail & Related papers (2023-03-13T17:57:59Z)
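A rough sketch of the mask-noising step, assuming simple random pixel flipping at a 20% rate; the paper's exact noising scheme may differ.

```python
# Illustrative mask noising: flip a random fraction of ground-truth pixels.
import torch

def noise_mask(gt_mask, flip_ratio=0.2):
    """Randomly flip a fraction of pixels in a binary (H, W) mask."""
    flip = torch.rand(gt_mask.shape) < flip_ratio
    return torch.where(flip, 1 - gt_mask, gt_mask)

gt = (torch.rand(64, 64) > 0.5).long()
noised = noise_mask(gt)
# The model attends through `noised` and is trained to reconstruct `gt`.
```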
- Adversarial Masking for Self-Supervised Learning [81.25999058340997]
ADIOS, a masked image model (MIM) framework for self-supervised learning, is proposed.
It simultaneously learns a masking function and an image encoder using an adversarial objective.
It consistently improves on state-of-the-art self-supervised learning (SSL) methods on a variety of tasks and datasets.
arXiv Detail & Related papers (2022-01-31T10:23:23Z)
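A compact sketch of the adversarial objective, assuming toy convolutional networks and an L1 reconstruction loss as the shared objective; ADIOS's actual architectures and SSL objective differ.

```python
# Toy adversarial masking: the masker is updated to *increase* the
# reconstruction loss that the encoder is updated to decrease.
import torch
import torch.nn as nn

mask_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 3, 3, padding=1))
opt_enc = torch.optim.Adam(encoder.parameters(), lr=1e-4)
opt_mask = torch.optim.Adam(mask_net.parameters(), lr=1e-4)

def step(images):
    m = mask_net(images)                  # soft occlusion mask in [0, 1]
    recon = encoder(images * (1 - m))     # reconstruct from the occluded view
    loss = (recon - images).abs().mean()

    opt_enc.zero_grad(); opt_mask.zero_grad()
    loss.backward()
    opt_enc.step()                        # encoder: minimize the loss
    for p in mask_net.parameters():       # masker: maximize it (flip gradients)
        if p.grad is not None:
            p.grad.neg_()
    opt_mask.step()
    return loss.item()

print(step(torch.randn(2, 3, 32, 32)))
```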
- Image Generation with Self Pixel-wise Normalization [17.147675335268282]
Region-adaptive normalization (RAN) methods have been widely used in generative adversarial network (GAN)-based image-to-image translation.
This paper presents a novel normalization method, called self pixel-wise normalization (SPN), which effectively boosts generative performance by performing a pixel-adaptive affine transformation without requiring a mask image.
arXiv Detail & Related papers (2022-01-26T03:14:31Z)
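A loose sketch of pixel-adaptive affine normalization without a mask image, assuming the scale and shift maps are predicted by small convolutions over the input itself; this is an illustration, not the paper's exact SPN.

```python
# Sketch of a self pixel-wise normalization layer: per-pixel scale and shift
# are predicted from the input, so no external mask image is needed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfPixelNorm(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gamma = nn.Conv2d(channels, channels, 3, padding=1)  # scale map
        self.beta = nn.Conv2d(channels, channels, 3, padding=1)   # shift map

    def forward(self, x):
        normed = F.instance_norm(x)        # parameter-free normalization
        return normed * (1 + self.gamma(x)) + self.beta(x)

print(SelfPixelNorm(64)(torch.randn(2, 64, 32, 32)).shape)  # (2, 64, 32, 32)
```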
- Self-Supervised Visual Representations Learning by Contrastive Mask Prediction [129.25459808288025]
We propose a novel contrastive mask prediction (CMP) task for visual representation learning.
MaskCo contrasts region-level features instead of view-level features, which makes it possible to identify the positive sample without any assumptions.
We evaluate MaskCo on training datasets beyond ImageNet and compare its performance with MoCo V2.
arXiv Detail & Related papers (2021-08-18T02:50:33Z)
- DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation [50.70679435176346]
We propose a new mask representation that applies the discrete cosine transform (DCT) to encode a high-resolution binary grid mask into a compact vector.
Our method, termed DCT-Mask, can be easily integrated into most pixel-based instance segmentation methods.
arXiv Detail & Related papers (2020-11-19T15:00:21Z)
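A minimal sketch of the DCT encoding/decoding round trip, assuming a square low-frequency coefficient block rather than the paper's zigzag-ordered vector.

```python
# DCT round trip for a binary mask: keep a low-frequency block of DCT
# coefficients and invert. The square block is a simplification.
import numpy as np
from scipy.fft import dctn, idctn

def encode_mask(mask, keep=18):
    """(H, W) binary mask -> (keep, keep) low-frequency DCT coefficients."""
    return dctn(mask.astype(np.float64), norm="ortho")[:keep, :keep]

def decode_mask(block, size=128):
    """Invert the truncated DCT back into a binary (size, size) mask."""
    full = np.zeros((size, size))
    k = block.shape[0]
    full[:k, :k] = block
    return idctn(full, norm="ortho") > 0.5

mask = np.zeros((128, 128)); mask[32:96, 40:90] = 1   # toy rectangular mask
rec = decode_mask(encode_mask(mask))
print((rec == mask).mean())                           # near-1.0 agreement
```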
- Boundary-preserving Mask R-CNN [38.15409855290749]
We propose a conceptually simple yet effective Boundary-preserving Mask R-CNN (BMask R-CNN) to leverage object boundary information to improve mask localization accuracy.
BMask R-CNN contains a boundary-preserving mask head in which object boundary and mask are mutually learned via feature fusion blocks.
Without bells and whistles, BMask R-CNN outperforms Mask R-CNN by a considerable margin on the COCO dataset.
arXiv Detail & Related papers (2020-07-17T11:54:02Z)
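A small sketch of how a boundary target can be derived from a binary mask via a morphological gradient; in BMask R-CNN itself the boundary and mask heads are learned jointly via feature fusion, so this only illustrates the boundary signal.

```python
# Deriving a boundary target from a binary mask via a morphological gradient.
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def mask_to_boundary(mask, width=1):
    """Binary (H, W) mask -> boundary band `width` pixels wide on each side."""
    structure = np.ones((3, 3), dtype=bool)
    dilated = binary_dilation(mask, structure, iterations=width)
    eroded = binary_erosion(mask, structure, iterations=width)
    return dilated & ~eroded

mask = np.zeros((64, 64), dtype=bool); mask[16:48, 16:48] = True
print(mask_to_boundary(mask).sum())   # pixels on the square's outline
```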
This list is automatically generated from the titles and abstracts of the papers on this site.