BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
- URL: http://arxiv.org/abs/2001.00309v3
- Date: Sun, 26 Apr 2020 10:27:02 GMT
- Title: BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
- Authors: Hao Chen, Kunyang Sun, Zhi Tian, Chunhua Shen, Yongming Huang,
Youliang Yan
- Abstract summary: In this work, we achieve improved mask prediction by effectively combining instance-level information with lower-level, fine-grained semantic information.
Our main contribution is a blender module which draws inspiration from both top-down and bottom-up instance segmentation approaches.
BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer.
- Score: 103.74690082121079
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Instance segmentation is one of the fundamental vision tasks. Recently, fully
convolutional instance segmentation methods have drawn much attention as they
are often simpler and more efficient than two-stage approaches like Mask R-CNN.
To date, almost all such approaches fall behind the two-stage Mask R-CNN method
in mask precision when models have similar computation complexity, leaving
great room for improvement.
In this work, we achieve improved mask prediction by effectively combining
instance-level information with lower-level, fine-grained semantic
information. Our main contribution is a blender module which draws
inspiration from both top-down and bottom-up instance segmentation approaches.
The proposed BlendMask can effectively predict dense per-pixel
position-sensitive instance features with very few channels, and learn
attention maps for each instance with merely one convolution layer, thus being
fast in inference. BlendMask can be easily incorporated with the
state-of-the-art one-stage detection frameworks and outperforms Mask R-CNN
under the same training schedule while being 20% faster. A light-weight version
of BlendMask achieves 34.2% mAP at 25 FPS evaluated on a single 1080Ti GPU
card. Because of its simplicity and efficacy, we hope that our BlendMask could
serve as a simple yet strong baseline for a wide range of instance-wise
prediction tasks.
Code is available at https://git.io/AdelaiDet
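The blender module described above can be pictured as an attention-weighted combination of the K position-sensitive bases. The following is a minimal NumPy sketch under that reading; the function name, the shapes, and the softmax normalization across the K channels are assumptions based on the abstract, with the cropping and resizing details omitted:

```python
import numpy as np

def blend(bases, attns):
    """Blend K position-sensitive bases with one instance's attention maps.

    bases: (K, H, W) base maps cropped/aligned to the detected box
    attns: (K, H, W) attention maps predicted for this instance,
           assumed already resized to the base resolution
    returns: (H, W) mask logits for the instance
    """
    # Softmax-normalize the attention across the K channels at each pixel,
    # then take the attention-weighted sum of the bases.
    a = np.exp(attns - attns.max(axis=0, keepdims=True))
    a /= a.sum(axis=0, keepdims=True)
    return (a * bases).sum(axis=0)
```

With uniform (all-equal) attention this degenerates to a plain average of the bases, which is a convenient sanity check.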
Related papers
- DynaMask: Dynamic Mask Selection for Instance Segmentation [21.50329070835023]
We develop a Mask Switch Module (MSM) with negligible computational cost to select the most suitable mask resolution for each instance.
The proposed method, namely DynaMask, brings consistent and noticeable performance improvements over other state-of-the-art methods at a moderate computation overhead.
arXiv Detail & Related papers (2023-03-14T13:01:25Z)
- Mask is All You Need: Rethinking Mask R-CNN for Dense and Arbitrary-Shaped Scene Text Detection [11.390163890611246]
Mask R-CNN is widely adopted as a strong baseline for arbitrary-shaped scene text detection and spotting.
There may exist multiple instances in one proposal, which makes it difficult for the mask head to distinguish different instances and degrades the performance.
We propose instance-aware mask learning in which the mask head learns to predict the shape of the whole instance rather than classify each pixel to text or non-text.
arXiv Detail & Related papers (2021-09-08T04:32:29Z)
- BoxInst: High-Performance Instance Segmentation with Box Annotations [102.10713189544947]
We present a high-performance method that can achieve mask-level instance segmentation with only bounding-box annotations for training.
Our core idea is to redesign the loss of learning masks in instance segmentation, with no modification to the segmentation network itself.
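One of the box-supervised terms BoxInst is known for is a projection loss: the predicted mask, projected onto the x- and y-axes via a max operation, should match the projections of the annotated box. A hedged NumPy sketch follows (the function names are hypothetical, and the paper's pairwise color-affinity term is omitted):

```python
import numpy as np

def projection_loss(pred, box_mask, eps=1e-6):
    """Box-supervised projection term.

    pred: (H, W) predicted mask probabilities in [0, 1]
    box_mask: (H, W) binary mask that is 1 inside the annotated box
    """
    def dice(p, t):
        # Dice distance between two 1-D projections.
        return 1.0 - (2.0 * (p * t).sum() + eps) / ((p * p).sum() + (t * t).sum() + eps)

    # Project both masks onto each axis with a max, then compare.
    loss_x = dice(pred.max(axis=0), box_mask.max(axis=0))  # column-wise max
    loss_y = dice(pred.max(axis=1), box_mask.max(axis=1))  # row-wise max
    return loss_x + loss_y
```

A mask that exactly fills the box drives both terms to zero, which is why box annotations alone can supervise mask learning.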
arXiv Detail & Related papers (2020-12-03T22:27:55Z)
- DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation [50.70679435176346]
We propose a new mask representation by applying the discrete cosine transform (DCT) to encode the high-resolution binary grid mask into a compact vector.
Our method, termed DCT-Mask, could be easily integrated into most pixel-based instance segmentation methods.
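As a rough illustration of this DCT encoding (not the paper's implementation: SciPy's `dctn`/`idctn` stand in for the pipeline, and keeping the top-left low-frequency block is a simplifying assumption about which coefficients are retained):

```python
import numpy as np
from scipy.fft import dctn, idctn

def encode_mask(mask, n=16):
    """Encode a binary grid mask as its top-left n x n DCT coefficients."""
    coeffs = dctn(mask.astype(float), norm="ortho")
    return coeffs[:n, :n].flatten()

def decode_mask(vec, size, n=16):
    """Reconstruct a binary mask from the compact DCT vector."""
    coeffs = np.zeros((size, size))
    coeffs[:n, :n] = vec.reshape(n, n)
    return idctn(coeffs, norm="ortho") > 0.5
```

Because natural object masks are spatially smooth, most of their energy sits in the low frequencies, so a short vector reconstructs a high-resolution mask with little loss.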
arXiv Detail & Related papers (2020-11-19T15:00:21Z)
- SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation [149.242230059447]
We propose a fast single-stage instance segmentation method called SipMask.
It preserves instance-specific spatial information by separating mask prediction of an instance to different sub-regions of a detected bounding-box.
In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings.
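The sub-region separation can be sketched as predicting one small mask per quadrant of the detected box and stitching them back together; this toy helper (hypothetical name, 2x2 grid assumed) only illustrates the assembly step, not SipMask's prediction heads:

```python
import numpy as np

def stitch_subregion_masks(sub_masks):
    """Assemble a full-box mask from per-sub-region predictions.

    sub_masks: 2x2 nested list of (h, w) arrays, one predicted mask per
               quadrant of the detected bounding box.
    returns: (2h, 2w) full-box mask
    """
    # np.block tiles the nested list back into one array, preserving
    # each quadrant's own spatial information.
    return np.block(sub_masks)
```

Keeping a separate prediction per sub-region is what preserves instance-specific spatial information that a single box-wide mask head would blur together.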
arXiv Detail & Related papers (2020-07-29T12:21:00Z)
- Mask Encoding for Single Shot Instance Segmentation [97.99956029224622]
We propose a simple single-shot instance segmentation framework, termed mask encoding based instance segmentation (MEInst).
Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector.
We show that this much simpler and more flexible one-stage instance segmentation method can also achieve competitive performance.
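The fixed-dimensional vector can be obtained with a linear, PCA-style encoding of flattened training masks. A minimal sketch under that assumption (helper names are hypothetical; the actual MEInst training pipeline is more involved):

```python
import numpy as np

def fit_mask_codebook(masks, dim=60):
    """Fit a linear encoder for flattened binary masks (PCA via SVD).

    masks: (N, S*S) flattened training masks
    returns: (mean, components) with components of shape (dim, S*S)
    """
    mean = masks.mean(axis=0)
    # Principal directions of the centered mask matrix.
    _, _, vt = np.linalg.svd(masks - mean, full_matrices=False)
    return mean, vt[:dim]

def encode(mask, mean, comps):
    # Project a flattened mask onto the principal directions.
    return comps @ (mask - mean)            # (dim,) compact vector

def decode(vec, mean, comps):
    # Linear reconstruction, then threshold back to a binary mask.
    return (comps.T @ vec + mean) > 0.5
```

At inference the detector regresses only the `dim`-dimensional vector per instance, and `decode` recovers the 2-D mask.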
arXiv Detail & Related papers (2020-03-26T02:51:17Z)
- PointINS: Point-based Instance Segmentation [117.38579097923052]
Mask representation in instance segmentation with Point-of-Interest (PoI) features is challenging because learning a high-dimensional mask feature for each instance imposes a heavy computational burden.
We propose an instance-aware convolution, which decomposes this mask representation learning task into two tractable modules.
Along with instance-aware convolution, we propose PointINS, a simple and practical instance segmentation approach.
arXiv Detail & Related papers (2020-03-13T08:24:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.